Tuesday, November 25, 2025

2025 ARTICLE 31: WHAT IF THE COMMITTEE USED THE BALANCED RPI RATHER THAN THE NCAA RPI? DIGGING DEEP

In this article, I'll show what the Women's Soccer Committee's seeding and at large selection decisions likely would have looked like if the Committee had used the Balanced RPI and will compare them to the Committee's actual decisions.  After doing that, I will discuss in detail why there are differences.

Decisions with the Balanced RPI as Compared to the Committee's Actual Decisions

The table below shows the Committee's actual seeding and at large selections as compared to what they likely would have been using the Balanced RPI.  The Committee does not always do what is "likely," but it comes close, so this should give a good picture of the differences between the two rating systems.  At the top of the table is a key to the table's two right-hand columns.  In the table's left-hand column, green highlighting is for teams that would get at large selections using the Balanced RPI but that did not actually get them with the Committee using the NCAA RPI.  Orange highlighting is for teams that would not get at large selections using the Balanced RPI but that actually got them.  Lime highlighting is for teams that would have been candidates (Top 57) for at large selections using the Balanced RPI, though not selected, but that were not even candidates using the NCAA RPI.  Salmon highlighting is for teams that were not candidates for at large selections using the Balanced RPI but that were unselected candidates using the NCAA RPI.




Why Are There Differences?  Digging Down a Level

Think of a rating system as a tree.  Its ratings and rankings of teams are what you see above ground.  Where the ratings and ranks come from is the tree's underground root structure.  As you dig down and expose the root structure, you get a better and better understanding of where what you see above ground comes from.  I'll use this tree metaphor to show what the differences are between the NCAA RPI and the Balanced RPI and why they are different.

The first underground level is best described by comparing actual results with "expected" results.  Specifically, which teams, conferences, and regions have done better (actual results) than their ratings say they should have done (expected results), and which have done more poorly?

Expected results for a particular rating system come from a history-based result probability table for that system.  The table shows teams' win, loss, and tie probabilities for different rating differences between teams and their opponents (adjusted for home field advantage).  When applied to large numbers of games, the result probability tables are very accurate.  For example, applying the NCAA RPI result probability table to all games from 2010 through 2024, here is how the higher-rated teams' expected results compare to their actual results:


When applying the result probability table to a single team for a single season, thus dealing with relatively few games, one would not expect the level of equivalence between actual results and expected results that is shown in the above table.  For the 2025 season, a look at individual teams' expected results based on their NCAA RPI ratings compared to their actual results yields the following table:


This table shows that the team whose actual winning percentage was most above its expected winning percentage was 9.0% above.  The team whose actual winning percentage was most below its expected winning percentage was 11.2% below.  The sum of these two numbers, 20.2%, is an indicator of how well the NCAA RPI measured teams' performance: the smaller the spread, the better the measurement.
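For readers who want the mechanics, here is a minimal sketch in Python of how a result probability table might be applied to get a team's expected winning percentage and the actual-versus-expected spread.  All numbers, bucket boundaries, and data structures below are invented for illustration; the real table is built from the 2010-2024 game histories and is far more finely grained.

```python
# Minimal sketch, not the author's actual code. The probability table
# below is invented; the real one is derived from historical results.

# (win, loss, tie) probabilities for the higher-rated team, keyed by
# rating-difference bucket (already adjusted for home field advantage).
RESULT_PROBS = {
    0.00: (0.45, 0.30, 0.25),
    0.02: (0.58, 0.22, 0.20),
    0.04: (0.70, 0.15, 0.15),
    0.06: (0.82, 0.08, 0.10),
}

def expected_points(rating_diff):
    """Expected result for one game, with a tie counted as half a win
    (a simplification chosen here purely for illustration)."""
    bucket = min(RESULT_PROBS, key=lambda b: abs(b - abs(rating_diff)))
    win, loss, tie = RESULT_PROBS[bucket]
    if rating_diff < 0:          # team is the lower-rated side: flip
        win, loss = loss, win
    return win + 0.5 * tie

def actual_minus_expected(games):
    """games: list of (rating_diff, actual) pairs for one team, where
    actual is 1.0 for a win, 0.5 for a tie, 0.0 for a loss."""
    actual = sum(a for _, a in games) / len(games)
    expected = sum(expected_points(d) for d, _ in games) / len(games)
    return actual - expected

# The article's spread indicator: the most-overperforming team's
# difference plus the magnitude of the most-underperforming team's.
def spread_indicator(team_games):
    diffs = [actual_minus_expected(g) for g in team_games.values()]
    return max(diffs) - min(diffs)
```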

Here is a similar table, but looking at conferences, in non-conference games:


Teams in each state play a majority, or at least a plurality, of their games within one of four geographic regions.  The next table is similar to the teams and conferences tables, but looks at the four regions, in non-region games:


Here are similar tables for the Balanced RPI:







As you can see, at each level -- teams, conferences, and regions -- the Balanced RPI's expected results are closer to actual results than the NCAA RPI's.  Using the tree metaphor, this underground difference between the two systems accounts for part of the difference you see between the Committee's NCAA RPI-based seeding and at large selections and what the seeds and selections likely would have been using the Balanced RPI.

Why Are There Differences?  Digging Down a Second Level

In sports, and perhaps especially for soccer, no system can produce ratings that are 100% consistent with results.  Given that there will be inconsistencies between results and ratings, a critical question is whether the inconsistencies are random (desirable) or whether they follow patterns that discriminate against and/or in favor of identifiable groups of teams (not desirable).

A good way to answer the "random or discriminatory" question is to look at the teams that are candidates for at large selections.  Historically, all at large teams have come from the Top 57 in the RPI rankings, so that is the candidate group.  In the 2025 shift from the NCAA RPI to the Balanced RPI, there is a change of 10 teams in the Top 57:

The teams dropping out of the Top 57 as a result of a shift to the Balanced RPI are, in order of NCAA RPI rank: Fairfield (AQ), Samford (AQ), Rhode Island, Charlotte, James Madison, Old Dominion, Army (AQ), Lipscomb (AQ), UNC Wilmington, and Texas State (AQ).

The teams moving into the Top 57 as a result of the shift are, again in order of NCAA RPI rank: Cal State Fullerton, Pepperdine (AQ), Kansas State, Seattle, Arizona State, Southern California, Portland, Houston, Santa Clara, and Nebraska.

The following table provides data related to why these changes occur:


For each team that is "in" or "out" of the Top 57 in a shift to the Balanced RPI, the table shows, in the five columns on the right, how teams' actual winning percentages compare with their expected winning percentages.  It shows this both for the NCAA RPI ratings and for the Balanced RPI ratings.

In the table, the 7th and 8th columns show the teams' actual winning percentages as compared to their NCAA RPI ratings' expected winning percentages.  The 9th column shows the actual versus expected differences for the teams.  A positive difference means a team's actual results have been better than its expected results.  A negative difference means actual results have been poorer than expected results. At the bottom of the 9th column, you can see the average differences for the "in" teams and for the "out" teams.  The "in" teams' actual results averaged 2.3% better than their expected results; and the "out" teams' actual results averaged 3.0% poorer than their expected results.  In other words, on average the NCAA RPI underrated the "in" teams and overrated the "out" teams, with a cumulative 5.3% (2.3% + 3.0%) discriminatory effect against the "in" teams relative to the "out" teams.

On the other hand, moving to the 11th column, for the Balanced RPI, the "in" teams' actual results averaged 0.2% better than their expected results; and the "out" teams' actual results averaged 1.0% better.  This amounts to a slight discriminatory effect against the "out" teams of 0.9% (1.0% - 0.2%, rounded off) relative to the "in" teams.
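The group arithmetic behind these comparisons is simple.  Here is a sketch, with hypothetical stand-in values where the table's real per-team numbers would go (the stand-ins are chosen so the averages match the article's 2.3% and -3.0%):

```python
# Hypothetical stand-ins for the table's actual-minus-expected winning
# percentage differences; the real values come from the table above.
in_diffs  = [0.041, -0.012, 0.035, 0.028, 0.023]   # "in" teams
out_diffs = [-0.052, -0.011, -0.040, -0.017]       # "out" teams

avg_in  = sum(in_diffs) / len(in_diffs)      # +2.3%
avg_out = sum(out_diffs) / len(out_diffs)    # -3.0%

# A positive gap means the rating system underrated the "in" group
# relative to the "out" group, i.e., discriminated against it.
gap = avg_in - avg_out                       # +5.3%
print(f"in {avg_in:+.1%}, out {avg_out:+.1%}, gap {gap:+.1%}")
```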

Continuing with the tree metaphor, this underground difference between the NCAA RPI and the Balanced RPI -- the NCAA RPI's discrimination against the "in" teams and in favor of the "out" teams and the Balanced RPI's near elimination of discrimination between the "in" and "out" teams -- accounts for another part of the difference you see between the Committee's NCAA RPI-based at large selections and what the selections likely would have been using the Balanced RPI.

This tells us that the Balanced RPI mostly eliminates the NCAA RPI's discriminatory effects.  But it does not tell us why.

Why Are There Differences?  Digging Down Another Level

The NCAA RPI and the Balanced RPI have two key components:  a team's Winning Percentage (WP) and the team's Strength of Schedule (SoS).  Within each system's formula, each component has a 50% effective weight.

A team's SoS is intended to measure its opponents' strengths.  The two systems measure opponents' strengths differently.

For each rating system, it is possible to calculate a team's rating as an SoS contributor to its opponents -- as distinguished from its actual RPI rating.  It then is possible to determine a team's rank as an SoS contributor to its opponents.

A team's rank as an SoS contributor should be the same as its RPI rank.  For the NCAA RPI, however, it isn't.  In fact, for the NCAA RPI, the average difference between a team's RPI rank and its rank as an SoS contributor is 31.3 rank positions, with the median difference 24 positions.  I designed the Balanced RPI, using the RPI as a starting point, to eliminate this disparity between RPI ranks and SoS contributor ranks.  As a result, for the Balanced RPI, the average difference between a team's RPI rank and its rank as an SoS contributor is 0.8 rank positions, with the median difference 0 positions.  In simple terms, for the NCAA RPI there are significant differences between a team's NCAA RPI rank and its rank as an SoS contributor to its opponents; but for the Balanced RPI the two essentially are the same.
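Here is a hedged sketch of how that comparison can be computed, assuming you already have each team's rating and its separately calculated rating as an SoS contributor.  The function and variable names are illustrative, not the author's:

```python
from statistics import mean, median

def rank_map(ratings):
    """Map each team to its rank, 1 = highest rating."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    return {team: i + 1 for i, team in enumerate(ordered)}

def rank_disparity(ratings, sos_contrib_ratings):
    """Average and median absolute gap between each team's rating rank
    and its rank as an SoS contributor to its opponents."""
    r = rank_map(ratings)
    s = rank_map(sos_contrib_ratings)
    gaps = [abs(r[t] - s[t]) for t in ratings]
    return mean(gaps), median(gaps)

# Per the article: for the NCAA RPI this yields an average gap of 31.3
# positions (median 24); for the Balanced RPI, 0.8 (median 0).
```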

For the "in" and "out" teams, the following table shows what I described in the preceding paragraph:


In the table, start by looking at the three columns Opponents NCAA RPI Average Rank, Opponents NCAA RPI Average Rank as SoS Contributor, and NCAA RPI Difference.  In the Difference column, a negative number means that the team's opponents' average rank as SoS contributors is poorer than its opponents' average actual NCAA RPI rank.  In other words, the NCAA RPI formula understates the team's SoS -- it discriminates against the team.  A positive number means that the team's opponents' average rank as SoS contributors is better than its opponents' average actual NCAA RPI rank.  In other words, the NCAA RPI formula overstates the team's SoS -- it discriminates in favor of the team.

At the bottom of the table, in the NCAA RPI Difference column, you can see the average differences for the "in" and "out" teams.  The average difference for the "in" teams is -27 and for the "out" teams is -7.  This means the NCAA RPI discriminates against the "out" teams somewhat, but it discriminates against the "in" teams almost four times as much.

The last three columns on the right are similar, but for the Balanced RPI.  There, for the "in" teams, the average difference is -1 and for the "out" teams it is 1.  In other words, using the Balanced RPI there is virtually no discrimination between the "in" and "out" teams.

For the NCAA RPI, this high level of discrimination against the "in" teams explains why the "in" teams' average performance is better than their ratings say it should be, including in relation to the "out" teams even though the "out" teams experience some discrimination.  And for the Balanced RPI, the lack of discrimination explains why the "in" and "out" teams' performance is close to what their ratings say it should be.
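The Difference column itself is a simple per-team computation over the team's opponents.  A sketch under the same illustrative assumptions as above (the data structures are invented for the example):

```python
def opponent_rank_difference(team, schedule, rpi_ranks, sos_contrib_ranks):
    """Opponents' average NCAA RPI rank minus their average rank as SoS
    contributors. Because rank 1 is best, a negative result means the
    contributor ranks are poorer (numerically higher), i.e., the formula
    understates the team's SoS and discriminates against it."""
    opps = schedule[team]                     # list of opponent ids
    avg_rpi = sum(rpi_ranks[o] for o in opps) / len(opps)
    avg_sos = sum(sos_contrib_ranks[o] for o in opps) / len(opps)
    return avg_rpi - avg_sos
```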

Why Are There Differences?  Digging Down One More Level

There is more to see, however, in the preceding table.  If you focus on the conferences and regions of the "in" and "out" teams, you will see patterns.  Most of the "in" teams are from the West region and those not from the West are from the Power 4 conferences.  All of the "out" teams are from mid-major conferences and from the North and South regions.  Why are we seeing these patterns?

The following table shows conferences' actual winning percentages in non-conference games as compared to their NCAA RPI expected winning percentages, with the conferences ordered from most discriminated against at the top to most discriminated in favor of at the bottom:


As you can see, stronger conferences and conferences from the West tend to be in the upper part of the table -- the most discriminated against conferences.  Compare this with the following table for the Balanced RPI:


In the Balanced RPI table, there still are differences between conferences' actual performance and their expected performance.  But the differences are smaller overall and less tied to conference strength and geographic region than in the NCAA RPI table.

What underlies the above tables for the NCAA RPI and the Balanced RPI?  The following table shows, for conferences, conference teams' opponents' average NCAA RPI ranks, conference teams' opponents' average NCAA RPI ranks as strength of schedule contributors, and the difference between the two.  As above, a negative difference means the NCAA RPI on average discriminates against the conference's teams, and a positive difference means it discriminates in favor of them.  In the table, the most discriminated against teams are at the top and the most discriminated in favor of are at the bottom.


In this table, the key columns are the Conferences NCAA RPI Rank and the Conference Teams Opponents NCAA RPI Ranks Less NCAA RPI SoS Contributor Ranks Difference columns.  As you can see, the NCAA's way of calculating SoS discriminates heavily against teams from stronger conferences and in favor of teams from weaker conferences.

Compare this to the similar table for the Balanced RPI:


Here, the conferences are in the same order as in the preceding table.  You can see that for the Balanced RPI, conference teams' opponents' ranks and their ranks as SoS contributors are essentially the same for all conferences.  This is one of the underlying causes of the "in" and "out" changes when shifting from the NCAA RPI to the Balanced RPI.

What about for regions?

Here is the NCAA RPI's actual versus expected performance table for regions, in non-region games:


As you can see, the NCAA RPI discriminates significantly against the West region (and in favor of the North).  Compare this to the table for the Balanced RPI:


As you can see, the Balanced RPI minimizes discrimination in relation to regions.

Here is the underlying table for the NCAA RPI, showing regions' teams' average RPI ranks as compared to their average ranks as SoS contributors:



As you can see, the numbers in the Difference column are in order of region strength.  The NCAA RPI's discrimination in how it values region teams' strengths of schedule exactly tracks region strength.  The stronger the region, the greater the discrimination.

Here is the table for the Balanced RPI:


Here, the regions are in the same order as in the preceding table.  You can see that for the Balanced RPI, region teams' opponents' ranks and their ranks as SoS contributors are essentially the same.  This is another of the underlying causes of the "in" and "out" changes when shifting from the NCAA RPI to the Balanced RPI.

Summarizing all of the above information, the reasons for the changes in "in" and "out" teams when shifting from the NCAA RPI to the Balanced RPI are that (1) the Balanced RPI's ratings correspond better with teams' actual performance and (2) the Balanced RPI eliminates the NCAA RPI's discrimination among conferences and regions.

Why Are There Differences?  Digging Down to the Deepest Level

Why does the NCAA RPI have large differences between RPI ranks and ranks as SoS contributors?  Continuing with the tree metaphor, it is due to the NCAA RPI's DNA, the RPI formula itself.

A team's RPI rating is a combination of the team's Winning Percentage (WP), its Opponents' Winning Percentages (OWP), and its Opponents' Opponents' Winning Percentages (OOWP).  In the way the formula combines the three, WP has an effective weight of 50%, OWP an effective weight of 40%, and OOWP an effective weight of 10%.

A team's opponents' contributions to its RPI rating are their winning percentages (OWP) and their opponents' winning percentages (OOWP), which as just stated account for 40% and 10% respectively of the team's RPI rating.  Thus an opponent's contribution, if isolated, is 80% the opponent's WP and 20% the opponent's OWP.

Since a team's NCAA RPI rating is 50% its WP, 40% its OWP, and 10% its OOWP, but the team's SoS contribution to an opponent is 80% the team's WP and 20% the team's OWP, it is no wonder there are significant differences between teams' NCAA RPI ranks and their ranks as SoS contributors.
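That weight arithmetic can be checked directly.  In the sketch below, the team's inputs are invented numbers; the weights are the effective weights just described:

```python
# Effective-weight view of the NCAA RPI per the article: a team's own
# rating versus what it contributes to an opponent's SoS.

def effective_rpi(wp, owp, oowp):
    """A team's rating: 50% its WP, 40% its OWP, 10% its OOWP."""
    return 0.50 * wp + 0.40 * owp + 0.10 * oowp

def sos_contribution(wp, owp):
    """The same team as seen inside an opponent's SoS. The SoS half of
    a rating is 40% OWP + 10% OOWP, so within that half the opponent's
    WP counts 40/50 = 80% and its OWP counts 10/50 = 20%."""
    return 0.80 * wp + 0.20 * owp

# Invented example: a team that piles up wins against a weak schedule
# (high WP, low OWP) feeds opponents' SoS far more than its own rating
# would suggest -- exactly the rank disparity described above.
wp, owp, oowp = 0.90, 0.40, 0.50
print(effective_rpi(wp, owp, oowp))   # 0.66
print(sos_contribution(wp, owp))      # 0.80
```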

These differences between a team's NCAA RPI rating and its SoS contribution to an opponent are the DNA that is the source of the NCAA RPI patterns described above.

The Balanced RPI, on the other hand, starts with a structure similar to the NCAA RPI's, although with effective weights of 50% WP, 25% OWP, and 25% OOWP and, within WP, with a tie counting as half of a win rather than a third of a win as in the NCAA RPI.  The Balanced RPI formula then goes through a series of additional calculations whose effect is to make each team's RPI rank and its rank as an SoS contributor the same.  This more complex formula is the source of the Balanced RPI patterns described above.
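The article does not publish those additional calculations, so the sketch below is purely conceptual and is not the Balanced RPI itself.  It shows one generic way a rating can be made to satisfy the stated property: iterate, each round recomputing every team's rating as 50% its WP and 50% the average of its opponents' current ratings.  After convergence, a team's contribution to an opponent's SoS is simply its own rating, so rating rank and SoS-contributor rank coincide by construction.

```python
# Conceptual sketch only -- NOT the Balanced RPI's actual formula, which
# this article does not spell out. It illustrates the general idea of
# iterating until rating and SoS contribution agree.

def iterate_ratings(schedule, wp, rounds=100):
    """schedule: {team: [opponent ids]}; wp: {team: winning percentage,
    with a tie counted as half a win}. Returns iterated ratings."""
    rating = dict(wp)                  # seed with winning percentage
    for _ in range(rounds):
        rating = {
            t: 0.5 * wp[t] + 0.5 * sum(rating[o] for o in opps) / len(opps)
            for t, opps in schedule.items()
        }
    return rating
```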

CONCLUSION 

The differences between the Committee's actual NCAA Tournament seeding and at large decisions and what those decisions likely would have been using the Balanced RPI are not simply a matter of differences between two equal rating systems.

(1) The Balanced RPI's ratings are more consistent with actual game results than the NCAA RPI's; and (2) The Balanced RPI has minimal to no discrimination among conferences and regions whereas the NCAA RPI has significant discrimination.  These differences between the NCAA RPI and the Balanced RPI account for the bracket differences at the top of this article.

