Sunday, December 27, 2020

NCAA TOURNAMENT: TEAM PROFILES AND AT LARGE SELECTIONS

Team Profiles.  The NCAA requires the Women’s Soccer Committee to use a number of factors to pick the at large participants in the NCAA Tournament.  I break these down into 13 factors:

    RPI Rating

    RPI Rank

    Non-Conference RPI Rating

    Non-Conference RPI Rank

    Results against teams already selected for the bracket, including automatic qualifiers ranked #75 or better.  Based on my observations of Committee decisions over time, I use a surrogate for this factor: Results Against Top 50 Opponents

    Results Against Top 50 Opponents Rank

    Head to Head Results, for which I use the surrogate of Head to Head Results Against Top 60 Opponents

    Results Against Common Opponents, for which I use the surrogate of Results Against Common Opponents with Top 60 Teams

    Results Against Common Opponents Rank

    Conference Regular Season Standing and Conference Tournament Results, for which I use the surrogate of Conference Standing (Combined)

    Conference Average RPI

    Conference Average RPI Rank

    Results Over the Last 8 Games, for which I use the surrogate of Poor Results

The NCAA has a scoring system for some of these factors, such as the RPI.  For other factors it does not: Results Against Top 50 Opponents, Head to Head Results Against Top 60 Opponents, Results Against Common Opponents with Top 60 Teams, Conference Standing (Combined), and Poor Results.  For each of these, I have my own scoring system.

Together, teams’ scores for the above factors make up their profiles.

The Results Against Top 50 Opponents factor is important and relates to scheduling, so it is worth describing the scoring system.  In looking at the Committee’s decisions among bubble teams, it appears that a few good results against highly ranked teams count a whole lot more than a greater number of good results against moderately ranked teams.  Further, good results against teams ranked below #50 appear not to be helpful at all (apart from their influence on teams’ RPI ratings and ranks).  This suggests that the Committee asks (whether consciously or not), "At how high a level have you shown you can compete?" and tends to select teams based on the answer to that question.  With that in mind, I developed this scoring system for Results Against Top 50 Opponents:


The scoring system is very highly skewed towards good results against very highly ranked opponents.
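To give a sense of the shape of that skew in code form, here is a minimal sketch in Python.  The rank bands and point values below are illustrative placeholders I am using only to show the kind of steep skew involved; they are not the actual values in my scoring system.

```python
def top50_result_points(result, opponent_rank):
    """Illustrative only: award far more credit for good results against very
    highly ranked opponents than against moderately ranked ones.  The bands and
    point values here are made-up placeholders, not the real scoring table."""
    if opponent_rank > 50:
        return 0.0  # results against teams ranked below #50 earn no credit here
    # Steeply declining weight as the opponent's rank gets poorer (hypothetical).
    band_weights = {range(1, 11): 10.0, range(11, 26): 4.0, range(26, 51): 1.0}
    weight = next(w for band, w in band_weights.items() if opponent_rank in band)
    result_value = {"win": 1.0, "tie": 0.5, "loss": 0.0}[result]
    return weight * result_value

def top50_results_score(results):
    """Sum a team's credit over its games against Top 50 opponents.
    `results` is a list of (result, opponent_rank) tuples, e.g. ("win", 7)."""
    return sum(top50_result_points(r, rank) for r, rank in results)
```

The key design point is simply that one good result against a top-10 opponent can be worth more than several good results against opponents ranked in the 30s and 40s.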

In addition to the 13 individual factors, I use paired factors.  A paired factor puts two of the individual factors together using a formula that weights the two factors equally.  I do this because of how I imagine a Committee member might think:  Yes, Team A’s RPI Rank is poorer than Team B’s, but when you look at their Results Against Top 50 Opponents, Team A’s are better, and when you look at the two factors together, Team A looks better overall.

I pair each individual factor with each other individual factor.  After doing this, I end up with 78 paired factors, which, when added to the 13 individual factors, gives me a total of 91 factors.
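To illustrate the mechanics, here is a minimal sketch in Python of the pairing step.  The normalization and the equal-weight formula are illustrative assumptions rather than my exact spreadsheet arithmetic, but the factor count works out the same way: 13 individual factors yield 78 pairs, for 91 factors in all.

```python
from itertools import combinations

# The 13 individual factors, using the shorthand names from this post.
FACTORS = [
    "RPI Rating", "RPI Rank", "Non-Conference RPI Rating", "Non-Conference RPI Rank",
    "Top 50 Results Score", "Top 50 Results Rank", "Head to Head Results",
    "Common Opponents Results", "Common Opponents Rank", "Conference Standing (Combined)",
    "Conference Average RPI", "Conference Average RPI Rank", "Poor Results",
]

# Pairing every factor with every other factor: C(13, 2) = 78 pairs.
PAIRED_FACTORS = list(combinations(FACTORS, 2))
print(len(PAIRED_FACTORS))  # 78

def normalize(values):
    """Rescale one factor's scores to a 0-1 range so two factors can be weighted
    equally (illustrative; ranks would first be flipped so higher = better)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.5 for v in values]

def paired_score(score_a, score_b):
    """Equal-weight combination of two already-normalized factor scores."""
    return 0.5 * score_a + 0.5 * score_b
```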

Team Profiles and the Committee’s At Large Selections.  Using data from the last 13 years -- covering each team’s factor scores and the Committee’s at large selections -- there are two key questions for each factor:

1.  Is there a factor score (a "yes" standard) where, for a team with that score or better, the team always has gotten an at large selection?

2.  Is there a factor score (a "no" standard) where, for a team with that score or poorer, the team never has gotten an at large selection?

Using RPI Rank as an example factor, teams with RPI Ranks of #30 or better always have gotten at large selections.  And, teams with RPI Ranks of #58 or poorer never have gotten at large selections.  Thus the RPI Rank "yes" standard is 30 and the "no" standard is 58.

Asking these two questions for each of the 91 factors produces a "yes" and a "no" at large selection standard for most of them.
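Conceptually, finding the standards is a simple search over the historical profiles.  Here is a minimal sketch, assuming each factor’s scores are oriented so that a higher score is better (an RPI Rank of 30 would be stored as -30, for example); the data layout and the toy numbers are hypothetical.

```python
def yes_standard(history):
    """Most generous "yes" threshold: the lowest score such that every team at or
    above it got an at large selection.  `history` is a list of (score, selected)
    pairs across all seasons; returns None if no such threshold exists."""
    best = None
    for score, _ in history:
        if all(selected for s, selected in history if s >= score):
            if best is None or score < best:
                best = score
    return best

def no_standard(history):
    """Most generous "no" threshold: the highest score such that no team at or
    below it got an at large selection; returns None if none exists."""
    worst = None
    for score, _ in history:
        if not any(selected for s, selected in history if s <= score):
            if worst is None or score > worst:
                worst = score
    return worst

# Toy data (made up), with RPI Rank stored as a negative number so higher = better.
history = [(-10, True), (-25, True), (-30, True), (-31, False), (-40, True),
           (-45, False), (-50, True), (-57, True), (-58, False), (-70, False)]
print(-yes_standard(history), -no_standard(history))  # 30 58, matching the example above
```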

At the end of a season, I then can apply the standards to the end-of-season data and, based on which teams meet "yes" and "no" at large selection standards, project what the Committee will decide if it follows its historic patterns.  (And, after the season is over and if the Committee has made decisions that do not match the standards, I can revise the standards to be sure they are consistent with all past Committee decisions.)

When I apply this process for a season that just has ended, I end up with some teams that meet only "yes" at large standards, some that meet only "no" standards, a few that meet some "yes" and some "no" standards (which means they have profiles the Committee has not seen in the past), and some that meet no "yes" or "no" standards.  So far, there never have been enough teams that meet only "yes" standards to fill all the at large slots, so there always have been some open slots -- ranging from 2 to 8 since 2007.  In my system, the teams that meet only "no" standards are out of the picture.  Thus the remaining open slots are to be filled by the teams that meet no "yes" or "no" standards or that meet some of each.
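In code terms, that classification might look like the sketch below, with `standards` holding each factor’s "yes" and "no" thresholds (None where no standard exists) and scores again oriented so that higher is better; the labels are my own shorthand.

```python
def classify(team_scores, standards):
    """Classify one team's profile against the factor standards.
    team_scores: {factor_name: score}; standards: {factor_name: (yes_thr, no_thr)},
    where a threshold is None if no standard exists for that factor."""
    meets_yes = any(
        yes is not None and factor in team_scores and team_scores[factor] >= yes
        for factor, (yes, no) in standards.items()
    )
    meets_no = any(
        no is not None and factor in team_scores and team_scores[factor] <= no
        for factor, (yes, no) in standards.items()
    )
    if meets_yes and not meets_no:
        return "yes only"   # in, if the Committee follows its historic patterns
    if meets_no and not meets_yes:
        return "no only"    # out of the picture
    if meets_yes and meets_no:
        return "mixed"      # a profile the Committee has not seen in the past
    return "neither"        # candidate for the remaining open slots
```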

How well does this process work?  I run tests to answer this question, applying the current standards retroactively to each of the 13 years in my database.  This tells me: If I had had the standards at the beginning of the 2007 season and had applied them to each year’s end-of-season data, how many of the teams actually getting at large selections would the standards have picked?  Here is the answer:

Since 2007, there have been 435 at large positions to fill.  My standards, if applied to the Top 60 teams in each of those seasons (that were not Automatic Qualifiers), would have filled 374 of the 435 positions.  This would have left 61 positions to fill, with 90 candidate teams to fill them (each of which met no "yes" and no "no" standards).  This amounts to a little under 5 positions per year that the standards by themselves cannot fill, with a pool of roughly 7 candidate teams per year from which to fill them.
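The back-test is just bookkeeping on top of that classification.  Here is a per-season sketch, again with a hypothetical data layout:

```python
def backtest_season(teams, standards):
    """Tally one season's Top 60 non-Automatic-Qualifier teams.
    `teams` is a list of (team_scores, got_at_large) pairs (hypothetical layout).
    Returns (at large picks identified by "yes only" profiles,
             candidate teams left over for the slots the standards cannot fill)."""
    identified, candidates = 0, 0
    for team_scores, got_at_large in teams:
        status = classify(team_scores, standards)
        if status == "yes only" and got_at_large:
            identified += 1
        elif status == "neither":
            candidates += 1
    return identified, candidates
```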

The next question is: Which factors were the most powerful at correctly identifying at large selections from among the Top 60 teams that were not Automatic Qualifiers?  As it turns out, these are the most powerful, in order:

RPI Rank and Top 50 Results Score (factor pair): correctly identified 313 at large selections

RPI Rank and Conference Rank: 301

RPI Rank and Top 50 Results Rank: 274

RPI Rank: 270

RPI Rating and Conference Rank: 255

RPI Rank and Common Opponents Rating: 253

RPI Rank and Conference ARPI: 251

RPI Rank and Poor Results: 249 

RPI Rating and Common Opponents Rating: 248

RPI Rating: 245

RPI Rank and Common Opponents Rank: 244

RPI Rating and Common Opponents Rank: 236

RPI Rating and Top 50 Results Rank: 212

RPI Rating and Top 50 Results Score: 207

After this, there is a big drop in the power of the factors.

As stated above, after applying all of the factor standards to the profiles of each year’s Top 60 teams that were not Automatic Qualifiers in order to fill at large positions, I was left with 61 at large positions to fill over the 13 years for which I have data and 90 candidate teams from which to fill them.  I then took each of the above factors, as well as some others that are powerful for seeds but not so much for at large selections, and asked: From the candidate teams each year, what if I gave the remaining at large selections to the ones scoring the best on this factor?  How many correct selections would this factor make?  Here are the results:

RPI Rank and Top 50 Results Rank: 47 correct selections (for 61 positions)

RPI Rating and Top 50 Results Rank: 46

RPI Rank and Conference Rank: 46

RPI Rating and Conference Rank: 46

RPI Rank and Conference RPI Rating: 45

RPI Rating and Common Opponents Rating: 45

RPI Rating and Common Opponents Rank: 45

RPI Rank and Common Opponents Rating: 45

RPI Rank and Common Opponents Rank: 45

Head to Head Results: 45
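Here is a sketch of that fill-in test for a single factor, assuming each season’s candidate pool and number of open slots are already known; the function and variable names are my own.

```python
def fill_open_slots(candidates, open_slots, factor_score):
    """Give one season's remaining at large slots to the candidate teams that
    score best on a single chosen factor.  `candidates` is a list of
    (team, got_at_large) pairs and factor_score(team) returns that team's score
    on the factor (higher = better).  Returns how many picks match the
    Committee's actual selections."""
    picks = sorted(candidates, key=lambda c: factor_score(c[0]),
                   reverse=True)[:open_slots]
    return sum(1 for team, got_at_large in picks if got_at_large)
```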

Finally, I have asked one more question: What if I use only one factor to pick all of the at large teams (rather than using the factor standards system)?  How would that compare to the Committee’s actual selections?  When I run that test, here are the results:

RPI Rank and Top 50 Results Rank: correctly picks 408 of the Committee’s 435 at large selections

RPI Rating and Top 50 Results Score: 406

RPI Rank and Top 50 Results Score: 405

RPI Rank and Conference Rank: 405

RPI Rank and Conference RPI: 403

RPI Rating: 402

RPI Rank: 401

RPI Rating and Top 50 Results Rank: 399

RPI Rank and Poor Results: 397 

RPI Rank and Common Opponents Results: 394 

In other words, the RPI Rank and Top 50 Results Rank factor correctly matches 408 out of the 435 at large selections the Committee has made over the last 13 years, or all but roughly 2 per year.
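A sketch of that last test, with the same hypothetical data structures: for each season, rank all of the Top 60 non-Automatic-Qualifier teams on the one factor, take as many as there are at large slots, and compare to the Committee’s actual picks.

```python
def single_factor_test(seasons, factor_score):
    """Pick every season's entire at large field using one factor (no standards
    system) and count the picks that match the Committee's actual selections.
    Each season is a (teams, at_large_slots) pair, where `teams` lists
    (team, got_at_large) for the Top 60 teams that were not Automatic Qualifiers."""
    total = 0
    for teams, at_large_slots in seasons:
        picks = sorted(teams, key=lambda t: factor_score(t[0]),
                       reverse=True)[:at_large_slots]
        total += sum(1 for team, got_at_large in picks if got_at_large)
    return total  # e.g., 408 of 435 for the RPI Rank and Top 50 Results Rank pair
```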

A way to think about this is that if a team is in the Top 60 and scores well on the RPI Rank and Top 50 Results Rank factor, then the rest of its profile necessarily is going to be very good.  Thus whatever the Committee members think about and discuss, they are highly likely to make at large selections as if they consider teams’ RPIs and their Top 50 Results, paired together, as the most important -- indeed almost decisive -- aspect of their profiles.
