Thursday, July 31, 2025

2025 ARTICLE 11: 2025 PRE-SEASON PREDICTIONS AND INFORMATION, PART 4B, TEAMS' SCHEDULES IN RELATION TO OPPONENTS' NCAA RPI RANKS AND STRENGTH OF SCHEDULE RANKS

In Part 4, I discussed and showed the differences between teams' NCAA RPI ranks and their ranks as Strength of Schedule contributors under the NCAA RPI formula.  In this article, I will show predictions for how these differences will affect teams by the end of the 2025 season.

For each team, the following table shows its predicted:

Opponents' average NCAA RPI rank

Conference opponents' average NCAA RPI rank

Non-conference opponents' average NCAA RPI rank

Opponents' average rank as Strength of Schedule contributors under the NCAA RPI formula

Conference opponents' average rank as Strength of Schedule contributors under the NCAA RPI formula

Non-conference opponents' average rank as Strength of Schedule contributors under the NCAA RPI formula

These numbers allow you to see how the NCAA RPI rank versus Strength of Schedule contributor rank differences relate to:

1.  Teams' in-conference schedules, which teams basically can't control;

2.  Teams' non-conference schedules, which teams can control at least to some extent; and

3.  Teams' overall schedules.

If you review the table's numbers with a view to the strength of the teams' conferences, you will see that generally speaking the NCAA RPI formula understates the strengths of schedule of top tier conferences' teams, gets the strengths of schedule of middle tier conferences' teams about right, and overstates the strengths of schedule of bottom tier conferences' teams.  I've arranged the teams by conference so you can better see how this NCAA RPI defect affects teams by conference.  Scroll to the right, if necessary, to see the entire table.

NOTE: The differences in the Conference Opponents Average Rank column for teams from the same conference are primarily due to conference teams not playing full round robins.  The differences in the Non-Conference Opponents Average Rank column for teams from the same conference are due to the different teams' non-conference scheduling strategies.

I'll use Baylor, from the Big 12, with a predicted NCAA RPI rank of #66, and Lamar, from the Southland, with a predicted NCAA RPI rank of #55, as examples.  I've chosen these teams because no team ranked poorer than #57 ever has gotten an at large position in the NCAA Tournament.  Thus Baylor is outside the historic at large candidate group and Lamar is within the candidate group.

Baylor (Big 12): 

Conference opponents' average NCAA RPI rank is 81 and conference opponents' average Strength of Schedule contributor rank under the NCAA RPI formula is 119.

Non-conference opponents' average NCAA RPI rank is 110 and non-conference opponents' average Strength of Schedule contributor rank under the NCAA RPI formula is 108.

Overall, opponents' average NCAA RPI rank is 93 and opponents' average Strength of Schedule contributor rank under the NCAA RPI formula is 115.

Thus Baylor's Strength of Schedule component of the NCAA RPI significantly discriminates against Baylor in relation to its conference schedule and only barely offsets that discrimination in relation to its non-conference schedule.  The overall result is that the Strength of Schedule component significantly discriminates against Baylor.

Lamar (Southland):

Conference opponents' average NCAA RPI rank is 212 and conference opponents' average Strength of Schedule contributor rank under the NCAA RPI formula is 185.

Non-conference opponents' average NCAA RPI rank is 153 and non-conference opponents' average Strength of Schedule contributor rank under the NCAA RPI formula is 184.

Overall, opponents' average NCAA RPI rank is 193 and opponents' average Strength of Schedule contributor rank under the NCAA RPI formula is 185.

Thus the Strength of Schedule component of the NCAA RPI significantly discriminates in favor of Lamar in relation to its conference schedule and offsets that discrimination some in relation to its non-conference schedule.  The overall effect, however, is that the Strength of Schedule component still discriminates in favor of Lamar.
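The Baylor and Lamar comparisons above come down to simple subtraction, which the following Python sketch reproduces.  The rank numbers are the ones quoted in this article; the names SCHEDULES, credit_gap, and schedule_gaps are illustrative and are not part of any NCAA formula.  A positive gap means the formula credits the opponents as weaker than their NCAA RPI ranks say they are (which hurts the team); a negative gap means the opposite.

```python
# Average opponent ranks quoted in the article, stored as
# (average NCAA RPI rank, average Strength of Schedule contributor rank).
SCHEDULES = {
    "Baylor": {
        "conference": (81, 119),
        "non-conference": (110, 108),
        "overall": (93, 115),
    },
    "Lamar": {
        "conference": (212, 185),
        "non-conference": (153, 184),
        "overall": (193, 185),
    },
}


def credit_gap(rpi_rank, contributor_rank):
    """Contributor rank minus RPI rank.

    Positive: opponents are credited as weaker than their RPI ranks.
    Negative: opponents are credited as stronger than their RPI ranks.
    """
    return contributor_rank - rpi_rank


def schedule_gaps(team):
    """Return the gap for each part of the named team's schedule."""
    return {part: credit_gap(*ranks) for part, ranks in SCHEDULES[team].items()}


for name in SCHEDULES:
    print(name, schedule_gaps(name))
```

For Baylor the conference gap is large and positive while the non-conference gap is slightly negative; for Lamar the signs are reversed, with the overall gap still negative.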

Given that Baylor is outside but in the vicinity of the ranking area of teams that historically are candidates for NCAA Tournament at large selections and Lamar is only a little inside that ranking area, this demonstrates the importance of this NCAA RPI defect.  History suggests that Lamar, if not an Automatic Qualifier, would not get an at large selection.  Baylor, however, is outside the historic candidate area, which raises the question whether, if it were inside the candidate area and considered by the Committee, it might displace one of the "last in" at large teams.  In other words, this NCAA RPI defect may have negative NCAA Tournament at large selection consequences.  (And, by a similar analysis of seeding candidate groups, may have negative seeding consequences.)

The significance of this kind of example is reinforced if you consider Lamar's and Baylor's ranks using my Balanced RPI.  The Balanced RPI is a rating system that builds on the RPI, with modifications that fix the NCAA RPI's defective discrepancy between teams' NCAA RPI ranks and their ranks as Strength of Schedule contributors under the NCAA RPI formula.  The Balanced RPI's predicted rank for Lamar is #110, well outside the NCAA Tournament at large selection candidate range.  For Baylor, its predicted rank is #57, in other words a candidate for at large selection.





2025 ARTICLE 10: 2025 PRE-SEASON PREDICTIONS AND INFORMATION, PART 4, TEAMS' NCAA RPI RANKS COMPARED TO THEIR RANKS AS STRENGTH OF SCHEDULE CONTRIBUTORS

The NCAA RPI has a major defect, which is the way in which it computes a team's strength of schedule.

As discussed on the RPI: Formula page at the RPI for Division I Women's Soccer website, the NCAA RPI has two main components:  a team's Winning Percentage and its Strength of Schedule.  Within the overall NCAA RPI formula, the effective weights of the two components are approximately 50% Winning Percentage and 50% Strength of Schedule.

Within the NCAA RPI formula, in turn, Strength of Schedule consists of two elements: the average of a team's opponents' winning percentages (OWP) and the average of a team's opponents' opponents' winning percentages (OOWP).  And, within Strength of Schedule, the effective weights of these two elements are 80% opponents' winning percentage and 20% opponents' opponents' winning percentage.  Thus for the NCAA RPI's Strength of Schedule component, a team's opponents' winning percentages matter a lot and against whom they achieved those winning percentages matters little.  This is a major defect.

In this and the next two parts of my Pre-Season Predictions and Information, using end-of-season predictions for the 2025 season, I will show how the NCAA RPI's strength of schedule defect plays out for teams (this Part 4), for conferences (Part 5), and for geographic regions (Part 6).

The following table shows, for each team, its predicted end-of-season NCAA RPI rank and its predicted rank as a strength of schedule contributor under the NCAA RPI formula.  In a good rating system, these ranks should be the same or, at least, very close to the same.  As the table shows, however, for the NCAA RPI formula, for many teams, the ranks are not close to the same.

Using some of the top teams in the alphabetical list as examples:

If Team A plays Air Force as an opponent, Team A will have played the NCAA RPI #232 ranked team.  When computing Team A's rating and rank, however, the NCAA RPI formula will give Team A credit only for playing the #274 team.

On the other hand, if Team A plays Alabama State, Team A will have played the #340 team.  But when computing Team A's rating and rank, the NCAA RPI formula will give Team A credit for playing the #277 team.

Thus although the NCAA RPI ranks Air Force and Alabama State 108 rank positions apart, when considering each of their strengths for purposes of Team A's strength of schedule computation, the NCAA RPI treats Air Force and Alabama State as roughly equal.
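To check the arithmetic in this example, here is a small Python sketch.  The ranks are the predicted ones quoted above; the names TEAMS and rank_gaps are illustrative only.

```python
# (predicted NCAA RPI rank, predicted rank as Strength of Schedule contributor)
TEAMS = {
    "Air Force": (232, 274),
    "Alabama State": (340, 277),
}


def rank_gaps(team_a, team_b):
    """Return how far apart two teams are by RPI rank and by contributor rank."""
    rpi_a, contributor_a = TEAMS[team_a]
    rpi_b, contributor_b = TEAMS[team_b]
    return abs(rpi_a - rpi_b), abs(contributor_a - contributor_b)


rpi_gap, contributor_gap = rank_gaps("Air Force", "Alabama State")
print(f"RPI rank gap: {rpi_gap}, contributor rank gap: {contributor_gap}")
```

The two teams are 108 positions apart as ranked teams but only 3 positions apart as strength of schedule contributors.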

You can scroll down the table and see how this NCAA RPI formula defect plays out for teams you are interested in.  I suggest you look, in particular, at teams in the middle to lower levels of top tier conferences and at teams in the upper levels of middle and bottom tier conferences.  For example:

Look at Alabama:  Its predicted NCAA RPI rank is #37.  But its predicted rank as a strength of schedule contributor is only #89.

Then look at Bowling Green:  Its predicted NCAA RPI rank is #86 but its predicted rank as a strength of schedule contributor is #26.

These kinds of differences have significant practical implications related to scheduling and the NCAA Tournament.  Teams' NCAA RPI ranks are a key factor in the Women's Soccer Committee's decisions on Tournament seeds and at large selections.  So, if a coach has NCAA Tournament aspirations, from strictly an NCAA RPI perspective, Bowling Green would be a significantly better opponent to play than Alabama.  This would be true for two reasons: (1)  Bowling Green probably is weaker than Alabama, so an easier game in which to get a good result; and (2) Bowling Green, as an opponent, will give the coach's team's NCAA RPI a better strength of schedule contribution than Alabama.

Thus when doing non-conference scheduling, coaches with NCAA Tournament aspirations or with other concerns about where their teams will finish in the NCAA RPI rankings must take this NCAA RPI formula defect into account.  In essence, they are in the position of having to learn how to "trick" the NCAA RPI through smart scheduling -- in the example, choosing Bowling Green rather than Alabama as an opponent. 
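As a rough illustration of that scheduling logic, the Python sketch below orders candidate opponents by their predicted rank as strength of schedule contributors rather than by their NCAA RPI rank.  Only the Alabama and Bowling Green numbers come from this article; the function name and data layout are hypothetical, and a real scheduling decision involves much more than this one number.

```python
# (predicted NCAA RPI rank, predicted rank as Strength of Schedule contributor)
CANDIDATES = {
    "Alabama": (37, 89),
    "Bowling Green": (86, 26),
}


def order_by_contribution(candidates):
    """List opponents from best to worst strength of schedule contributor.

    A lower contributor rank means the opponent adds more to the
    scheduling team's NCAA RPI strength of schedule.
    """
    return sorted(candidates, key=lambda name: candidates[name][1])


for name in order_by_contribution(CANDIDATES):
    rpi_rank, contributor_rank = CANDIDATES[name]
    print(f"{name}: RPI #{rpi_rank}, contributor #{contributor_rank}")
```

Bowling Green sorts ahead of Alabama even though Alabama has the much better NCAA RPI rank, which is the "trick" described above.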




Wednesday, July 30, 2025

2025 ARTICLE 9: 2025 PRE-SEASON PREDICTIONS AND INFORMATION, PART 3, "PREDICTED" CONFERENCE REGULAR SEASON AND TOURNAMENT CHAMPIONS

Continuing with "predictions," using the "results likelihood" method described in 2025 Article 8, my system uses the same "3 points for a win and 1 for a tie" scoring that conferences use for their standings to create team standings within each conference.  It is worth noting that the results likelihoods take game locations into account and that a good number of conferences do not play full round robins.
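One way to picture the standings calculation is as expected points: each conference game adds 3 times the win likelihood plus 1 times the tie likelihood.  The Python sketch below assumes that reading of the method.  The team names are placeholders and the likelihood values are illustrative only.

```python
def expected_points(win_likelihood, tie_likelihood):
    """Expected standings points from one game: 3 for a win, 1 for a tie."""
    return 3.0 * win_likelihood + 1.0 * tie_likelihood


def predicted_standings(games):
    """Build standings from per-game likelihoods.

    games: iterable of (team, win_likelihood, tie_likelihood), one entry
    per team per conference game.  Returns (team, points) pairs sorted
    from most to fewest expected points.
    """
    totals = {}
    for team, win_likelihood, tie_likelihood in games:
        points = expected_points(win_likelihood, tie_likelihood)
        totals[team] = totals.get(team, 0.0) + points
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Illustrative likelihoods for a single game between two placeholder teams.
example = [("Team X", 0.536, 0.264), ("Team Y", 0.199, 0.264)]
for team, points in predicted_standings(example):
    print(f"{team}: {points:.2f}")
```

Because the points are sums of fractions, two teams can show the same rounded total while one is slightly ahead.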

Using the ACC as an example, here is what its "predicted" end-of-season standings look like:


Although Florida State's and North Carolina's points look the same, that is due to rounding.  Florida State's are slightly higher.  My interpretation of these standings is that it will be very close at the top of the conference among Florida State, North Carolina, Stanford, and Duke, with Virginia also in the mix.

Using the conference standings and the conference tournament formats (as published to date), my system next creates conference tournament brackets.  Then, since it is necessary to have winners and losers to fill out the entire tournament brackets, the system assigns as a game winner any team that has a win likelihood above 50%.  Where neither team has a win likelihood above 50%, the system treats the game as a tie.  For the tiebreaker, the advancing team is the one with the higher win likelihood.
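The rule for deciding each bracket game can be sketched in Python as follows.  The function name is hypothetical, and the handling of exactly equal win likelihoods (the first team listed advances) is an assumption of this sketch, since the text does not cover that case.

```python
def advancing_team(team_a, win_a, team_b, win_b):
    """Decide who advances from one bracket game.

    win_a and win_b are the teams' win likelihoods as fractions (0 to 1).
    Returns (team, how), where how is "win" or "tiebreaker".
    """
    if win_a > 0.5:
        return team_a, "win"
    if win_b > 0.5:
        return team_b, "win"
    # Neither team is above 50%, so the game is treated as a tie and the
    # team with the higher win likelihood advances.
    # Assumption: on exactly equal likelihoods, team_a advances.
    if win_a >= win_b:
        return team_a, "tiebreaker"
    return team_b, "tiebreaker"


print(advancing_team("Seed 1", 0.62, "Seed 8", 0.15))
print(advancing_team("Seed 4", 0.38, "Seed 5", 0.41))
```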

This process results in the following conference regular season and conference tournament champions.  In most cases they are the same, but in two cases they are different.




Monday, July 28, 2025

2025 ARTICLE 8: 2025 PRE-SEASON PREDICTIONS AND INFORMATION, PART 2, "PREDICTED" END-OF-SEASON RANKS AND RATINGS

Once I have assigned pre-season strength ranks and ratings to teams, I combine those with teams' schedules to "predict" where teams will end up at the end of the season following completion of the conference tournaments.  The process to do this requires background work:

1.  For each game, I calculate the game-location-adjusted rating difference between the teams, using their assigned pre-season strength ratings as the base.  Since the assigned strength ratings are based on what the average historic NCAA RPI ratings are for those teams' ranks, my game location adjustments increase the home team's rating by 0.0085 and decrease the away team's rating by 0.0085, for an overall adjustment of 0.0170.  This is the value of home field advantage for the current version of the NCAA RPI with the "no overtime" rule in effect.

2.  For the location-adjusted rating difference between the teams, I calculate each team's expected win, loss, and tie likelihoods.  These likelihoods are based on a study of the location-adjusted rating differences and results of all games played since 2010 (excluding 2020).

[NOTE: For a detailed explanation of how I determine the game location adjustment and the win, loss, and tie likelihoods, go to the RPI for Division I Women's Soccer website's page RPI: Measuring the Correlation Between Teams' Ratings and Their Performance.]

3.  Rather than assigning the opponents in a game either a win, a loss, or a tie result, I assign each team its win, loss, and tie likelihoods since these will give a better picture of what a team's overall record will be given its entire schedule.  As an example, Colorado and Michigan State will play on August 14 at Colorado.  Their location-adjusted rating difference is 0.0333 in favor of Colorado.  For that rating difference, referring to a result probability table for the current NCAA RPI in a "no overtime" world, Colorado's result likelihoods are 53.6% win, 19.9% loss, 26.4% tie (which don't quite add up to 100% due to rounding).  Those numbers go into Colorado's win-loss-tie columns for NCAA RPI computation purposes.  Michigan State's win-loss percentages are the converse.  I assign these likelihoods for all teams' games, add up each team's percentages, and convert them from percentages to numbers.  Thus Colorado ends the season with 12.2 wins, 4.9 losses, and 4.9 ties.  These are the numbers I use for Colorado as its wins, losses, and ties when computing its NCAA RPI.

4.  I then compute all teams' NCAA RPI ratings and ranks.  It is important to understand that these are different than the teams' assigned pre-season strength ratings and ranks.  This is because the NCAA RPI does not measure team strength.  Rather, it measures team performance based on a combination of teams' winning percentages and their strengths of schedule (as measured by the NCAA RPI formula).  Thus two teams with identical strength ratings and ranks will end up with different NCAA RPI ratings and ranks if they have different winning percentages and/or different strengths of schedule.  Below are my computed end-of-season (including predicted conference tournaments) ratings and ranks for teams.  You can compare the ranks to the ones in the preceding post to see the differences between teams' assigned pre-season strength ranks and teams' predicted end-of-season NCAA RPI ranks.  (NOTE: I have corrected these rankings since their initial publication to fix a programming error.  The changes are relatively minor.)
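Steps 1 and 3 of the process can be sketched in Python as shown here.  The 0.0085 adjustment and the Colorado likelihoods come from the text above; the two team ratings are hypothetical values chosen only so that the adjusted difference matches the 0.0333 in the example, and the second game's likelihoods are made up.  The lookup from rating difference to likelihoods (step 2) relies on the results table from my study and is not reproduced here.

```python
HOME_FIELD_ADJUSTMENT = 0.0085  # added to the home team, subtracted from the away team


def location_adjusted_difference(home_rating, away_rating):
    """Rating difference from the home team's point of view (step 1)."""
    adjusted_home = home_rating + HOME_FIELD_ADJUSTMENT
    adjusted_away = away_rating - HOME_FIELD_ADJUSTMENT
    return adjusted_home - adjusted_away


def mirror(likelihoods):
    """The opponent's likelihoods: win and loss swap, the tie value is shared."""
    win, loss, tie = likelihoods
    return (loss, win, tie)


def expected_record(game_likelihoods):
    """Add up per-game (win, loss, tie) likelihoods into a fractional record (step 3)."""
    wins = sum(game[0] for game in game_likelihoods)
    losses = sum(game[1] for game in game_likelihoods)
    ties = sum(game[2] for game in game_likelihoods)
    return wins, losses, ties


# Hypothetical ratings: Colorado 0.5663 at home, Michigan State 0.5500 away.
print(round(location_adjusted_difference(0.5663, 0.5500), 4))

colorado = (0.536, 0.199, 0.264)  # win, loss, tie likelihoods from the text
print(mirror(colorado))  # Michigan State's likelihoods for the same game

# One real example game plus one made-up game, just to show the summing.
print(expected_record([colorado, (0.500, 0.250, 0.250)]))
```

Summing these fractions over a full schedule is what produces records such as 12.2 wins, 4.9 losses, and 4.9 ties.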





2025 ARTICLE 7: 2025 PRE-SEASON PREDICTIONS AND INFORMATION, PART 1, ASSIGNED PRE-SEASON RANKS AND RATINGS

INTRODUCTION

All the teams have published their schedules -- almost.  Akron has yet to publish its non-conference schedule and Delaware State has yet to publish its schedule but, unless those teams play each other, we can determine their schedules from what others have published.  So for practical purposes we have all the schedules.  This makes it possible to do pre-season predictions of where teams will end up at the end of the regular season, including conference tournaments.

Pre-season predictions involve a lot of assumptions that may or may not prove correct.  Also, my prediction method depends entirely on teams' rank histories.  So don't take the pre-season predictions too seriously.  On the other hand, this series of articles will have educational value, particularly about how the NCAA RPI works and about the importance of scheduling.  So I recommend not getting preoccupied with the details of the predictions but instead watching for what you can learn about the NCAA RPI and its interaction with teams' schedules.

ASSIGNED PRE-SEASON RANKS AND RATINGS

The first step in my process, which is the subject of this article, is to assign pre-season ratings and ranks to teams.  In essence, this is predicting teams' strength.

There are a lot of ways to predict team strength, some complex and some simple.  I predict strength using only teams' rank histories, without reference to changing player and coaching personnel.  There are others who do predictions using those kinds of detailed information about this year's teams -- conference coaches for their own conferences and in the past, but not currently, stats superstar Chris Henderson.  In the past, their predictions have been better than mine, but only slightly better.  So you can consider my predictions as somewhat crude but close to as good as you can get when making data-based predictions for the entire cast of teams.

When using only teams' rank histories, my analyses show that the best predictor of where teams will end up next year is the average of their last 7 years' ranks using my Balanced RPI, which is a modified version of the NCAA RPI that fixes its major defects.  There is a problem, however, with that predictor.  If a team has a major outlier year -- a much higher or lower rank than is typical -- using a 7-year average can significantly mis-rate that team.  On the other hand, if I use the median rank over the last 7 years, it avoids that problem.  It is a little less accurate as a predictor for where teams will end up next year, but not by much.  So for my predictor, I use teams' median Balanced RPI ranks over the last 7 years.
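To see why the median handles an outlier year better than the average, consider this Python sketch.  The seven ranks are invented for illustration and do not belong to any real team; the function name is also hypothetical.

```python
from statistics import mean, median


def assigned_rank(last_seven_ranks):
    """Return the median of a team's last 7 end-of-season ranks."""
    if len(last_seven_ranks) != 7:
        raise ValueError("expected exactly 7 seasons of ranks")
    return median(last_seven_ranks)


# Hypothetical team: usually around #40, with one outlier season at #150.
history = [40, 45, 38, 42, 150, 41, 44]

print("7-year average:", round(mean(history), 1))
print("7-year median: ", assigned_rank(history))
```

The single bad season pulls the average to about 57, while the median stays at 42, close to where the team typically finishes.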

Once I have assigned teams' ranks, I then assign NCAA RPI ratings to the teams.  To do this, I have determined the historic average rating of teams at each rank level.  When I do this, however, I have to take into account the recent NCAA "no overtime" rule change and the 2024 NCAA RPI formula changes.  So when I determine historic average ratings, I use what past ratings would have been if the "no overtime" rule and 2024 NCAA RPI formula had been in effect, using the years 2010 to the present (but excluding Covid-affected 2020).
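The rank-to-rating step can be pictured as a lookup table built by averaging, across past seasons, the rating held by the team at each rank.  The Python sketch below assumes that structure.  The season data are made-up numbers, and the sketch assumes the past ratings have already been recomputed under the "no overtime" rule and the 2024 formula.

```python
from collections import defaultdict


def average_rating_by_rank(seasons):
    """Average the rating found at each rank across past seasons.

    seasons: list of {rank: rating} dictionaries, one per season, with
    ratings assumed to be already recomputed under current rules.
    """
    ratings_at_rank = defaultdict(list)
    for season in seasons:
        for rank, rating in season.items():
            ratings_at_rank[rank].append(rating)
    return {rank: sum(values) / len(values) for rank, values in ratings_at_rank.items()}


# Two hypothetical seasons covering only the top two ranks.
past_seasons = [
    {1: 0.70, 2: 0.68},
    {1: 0.72, 2: 0.66},
]
lookup = average_rating_by_rank(past_seasons)
print({rank: round(rating, 2) for rank, rating in lookup.items()})
```

A team assigned a given rank then receives the averaged rating stored for that rank.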

This produces the following "strength" ranks and ratings for teams.  You will note that no team has a #1 assigned rank and some teams have the same assigned rank.  This is because I am using teams' 7-year median ranks.  (Scroll to the right to see additional teams.)