Tuesday, September 5, 2023

2023 REPORT 8: AN ILLUSTRATION SHOWING WHY THE CURRENT NCAA RPI IS DEFECTIVE AND THE EFFECT OF THE DEFECT

"[T]he basic premise of the [RPI] formula remains true for any sport -- rankings are based on whom a team plays and whom it beats.  ... [A]lmost any discussion by schools, media or fans about why a team was or was not selected [as an at large participant in the NCAA Tournament] usually involve that team’s record or its strength of schedule."  (Emphasis added.)  From Frequently Asked Questions About the Women’s Soccer Rating Percentage Index, by Jim Wright (NCAA Director of Statistics at the time of the statement, who helped create the original RPI and worked for the NCAA Statistics Service for 30 years) and Rick Campbell (NCAA Assistant Director of Statistics at the time of the statement, who was on the NCAA statistics staff for 18 years and managed the women’s soccer RPI for six years) 

Consistent with the above statement, the current NCAA RPI measures two things:  a team’s Winning Percentage (its record) and its Strength of Schedule (who its opponents are).  It then combines the two with each having a 50% weight.

The way in which the NCAA measures a team’s Winning Percentage is straightforward:

(Wins + 0.5 x Ties)/(Wins + Ties + Losses)

The way in which it measures Strength of Schedule is less straightforward:

2 x Average of Opponents’ Winning Percentages + 1 x Average of Opponents’ Opponents’ Winning Percentages

Under this formula, the Opponents’ Winning Percentages element comprises roughly 80% of the effective weight of Strength of Schedule and the Opponents’ Opponents’ Winning Percentages element comprises roughly 20%.  For an explanation of why the ratio is 80-20, see RPI: Formula at the RPI for Division I Women’s Soccer website.
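As a minimal sketch of the two formulas above, the following Python computes a Winning Percentage and a Strength of Schedule value.  The function names, the sample record, and the sample opponent values are hypothetical, chosen only for illustration.  The sketch takes the opponents' values as already-computed inputs and ignores any adjustments the NCAA may apply when it calculates them.

```python
def winning_percentage(wins, ties, losses):
    """(Wins + 0.5 x Ties) / (Wins + Ties + Losses)."""
    games = wins + ties + losses
    if games == 0:
        raise ValueError("team has played no games")
    return (wins + 0.5 * ties) / games


def strength_of_schedule(opponents_wp, opponents_opponents_wp):
    """2 x average of Opponents' Winning Percentages
    + 1 x average of Opponents' Opponents' Winning Percentages.

    Both arguments are lists with one precomputed value per opponent
    (an assumption made to keep the sketch short).
    """
    avg_owp = sum(opponents_wp) / len(opponents_wp)
    avg_oowp = sum(opponents_opponents_wp) / len(opponents_opponents_wp)
    return 2 * avg_owp + 1 * avg_oowp


# Hypothetical team: 10 wins, 4 ties, 6 losses.
wp = winning_percentage(10, 4, 6)

# Hypothetical values for three opponents.
sos = strength_of_schedule([0.5, 0.7, 0.6], [0.55, 0.45, 0.5])

print(f"Winning Percentage: {wp:.3f}")
print(f"Strength of Schedule: {sos:.3f}")
```

The 2-to-1 multipliers are the nominal weights from the formula.  The roughly 80-20 effective weights described above are a separate matter that this sketch does not attempt to reproduce.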

With Strength of Schedule comprising 50% of the effective weight of the overall RPI, the current NCAA RPI rating consists of 50% Winning Percentage, 40% Opponents’ Winning Percentages, and 10% Opponents’ Opponents’ Winning Percentages.  Teams are ranked accordingly.  On the other hand, a team’s rating as a contributor to other teams’ Strengths of Schedule consists of 80% Winning Percentage and 20% Opponents’ Winning Percentages.  Teams are ranked as Strength of Schedule contributors accordingly.

In other words, a team has two different rankings:  (1) its RPI rank and (2) its rank as a Strength of Schedule contributor.  Because they are determined by quite different formulas, these two ranks can be very different.  This is an RPI defect that has significant effects.  (NOTE: So far as I know, the NCAA does not publish teams’ ranks as Strength of Schedule contributors.)

To illustrate the effects of how the NCAA RPI calculates Strength of Schedule, I will use a real-life case from the 2022 season involving Quinnipiac and Nebraska and the NCAA Tournament at large selections.

As a starting point, since 2007, the poorest-ranked team to get an at large selection has been #57.  Thus, effectively, the candidate pool for at large selections is teams ranked #57 or better.  With that as context, here is a table for Quinnipiac and Nebraska:


In the table, ARPI 2015 BPs is the current NCAA RPI.  As you can see from the first highlighted column, Quinnipiac ended the season ranked #43 and Nebraska #58.  Thus Quinnipiac, by being rated in the Top 57, effectively pushed Nebraska just outside the candidate pool for at large selections.

The second highlighted column shows the average current NCAA RPI ranks of the teams’ opponents: 84 for Nebraska and 207 for Quinnipiac.  The third highlighted column shows the average ranks under the current NCAA RPI of the teams’ opponents as Strength of Schedule contributors: 117 for Nebraska and 196 for Quinnipiac.  As I stated above, current NCAA RPI ranks can be very different from ranks as Strength of Schedule contributors, and that is the case here.  For Nebraska, the RPI formula underranks its opponents’ strength by 33 positions on average.  For Quinnipiac, it overranks its opponents’ strength by 11 positions on average.
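The arithmetic behind the 33 and 11 position figures can be checked with a short Python sketch.  The four averages are the ones quoted above.  The dictionary layout and the function name are my own, for illustration only.

```python
# Average opponent ranks quoted above (2022 season, current NCAA RPI).
opponent_ranks = {
    "Nebraska": {"avg_rpi_rank": 84, "avg_sos_contributor_rank": 117},
    "Quinnipiac": {"avg_rpi_rank": 207, "avg_sos_contributor_rank": 196},
}


def rank_gap(ranks):
    """Strength of Schedule contributor rank minus RPI rank.

    Positive: the formula treats the opponents as weaker than their
    RPI ranks say (opponents underranked).
    Negative: the formula treats them as stronger (opponents overranked).
    """
    return ranks["avg_sos_contributor_rank"] - ranks["avg_rpi_rank"]


gaps = {team: rank_gap(ranks) for team, ranks in opponent_ranks.items()}

for team, gap in gaps.items():
    direction = "underranked" if gap > 0 else "overranked"
    print(f"{team}: opponents {direction} by {abs(gap)} positions on average")
```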

Here are the teams’ schedules, with the opponents’ current NCAA RPI ranks and their current NCAA RPI formula Strength of Schedule contributor ranks, to show this in more detail:

Nebraska


Quinnipiac


The NCAA requires the Women’s Soccer Committee to use the current NCAA RPI ranks in its at large selection process, thus inherently taking the position that those rankings are correct as compared to teams’ ranks as Strength of Schedule contributors.  This being the case, a necessary implication is that the RPI formula significantly underrates the strength of Nebraska’s opponents within its calculations and overrates the strength of Quinnipiac’s opponents.  In other words, the formula is biased against Nebraska and in favor of Quinnipiac.

Where would the two teams be in the rankings without those biases?  This is the question the Balanced RPI answers.  (The Balanced RPI is designed so that teams’ RPI ranks and their ranks as Strength of Schedule contributors match.)  You can find a full explanation of the Balanced RPI at the RPI: Modified RPI? page of the RPI for Division I Women’s Soccer website.

The following table matches the first table in this article, but using the Balanced RPI:


In the table, the URPI 50 50 SoS Iteration 15 is the Balanced RPI.  As you can see, when the differences between teams’ RPI ranks and their ranks as Strength of Schedule contributors are eliminated, we end up with very different rankings from the current NCAA RPI.  Now, Nebraska is #46 and a candidate for at large selection and Quinnipiac is #82 and not a candidate.

In fact, this does not matter for Quinnipiac, as it is an Automatic Qualifier.  But it does matter for Nebraska.  (It is important to note that the process of equalizing RPI rank and Strength of Schedule contributor ranks has moved a lot of other teams around too, some moving up in the rankings and some moving down.  When all the movement is over, this is where Nebraska and Quinnipiac end up.)

How much does it matter for Nebraska?  I have developed a scoring system for good results (wins or ties) against Top 50 opponents.  When I combine teams’ ranks under that scoring system with their RPI ranks, each weighted at 50%, and then rank teams based on those combined factors, the resulting rankings match the NCAA Tournament at large selections since 2007 for all but an average of 2 selections per year.  In other words, those rankings are a very good indicator of the Committee’s likely at large selections.  In the case of Nebraska and using the Balanced RPI, those rankings indicate it likely would have been an at large selection.
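The combining step can be sketched in Python as follows.  This is only an illustration under stated assumptions: the team names and rank values are hypothetical, the scoring system itself is not reproduced, and breaking ties by RPI rank is my assumption rather than part of the method described above.

```python
def combined_ranking(rpi_rank, results_rank):
    """Order teams by 50% RPI rank + 50% good-results rank (lower is better).

    rpi_rank and results_rank map team name -> rank.
    Ties are broken by RPI rank (an assumption of this sketch).
    """
    combined = {
        team: 0.5 * rpi_rank[team] + 0.5 * results_rank[team]
        for team in rpi_rank
    }
    return sorted(combined, key=lambda team: (combined[team], rpi_rank[team]))


# Hypothetical ranks for three hypothetical teams.
rpi_rank = {"Team A": 40, "Team B": 46, "Team C": 52}
results_rank = {"Team A": 60, "Team B": 30, "Team C": 50}

order = combined_ranking(rpi_rank, results_rank)
print(order)
```

In this made-up case, Team B moves ahead of Team A despite a poorer RPI rank, because its rank for good results against Top 50 opponents is much better.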

The bottom line is that, because of the defects in the current NCAA RPI, teams like Quinnipiac that should be outside the Top 57 at large candidate pool are in the pool and teams like Nebraska that should be in the pool are outside it.  And some of the teams that should be in the pool, such as Nebraska, likely would get at large positions.

Finally, as shown on the RPI: Modified RPI? linked page, in most cases the teams that should be outside the candidate pool but are in it due to the defects of the current NCAA RPI are from weaker conferences and/or regions, and the teams that should be inside the pool but are outside it under the current NCAA RPI are from stronger conferences and/or regions.  Thus Quinnipiac and Nebraska provide an excellent illustration of the current NCAA RPI’s defect and of the effects of the defect.
