In the preceding post, I described two tests I created for evaluating whether the RPI will be usable for the 2020-21 NCAA Tournament at-large selections. In this post, I will show how those tests apply to simulated end-of-season RPI rankings based on the full season schedule as of February 19.
I will not go into a full discussion here of how I produce simulated rankings, but here is a brief outline of the process:
1. For each team, I do a statistical analysis of its rank trend over the period since 2007 and also since a year before the coach arrived, if the coach arrived after 2007. Based on this, I assign the team a simulated rank and rating for this year.
2. Using the full season calendar for this year, for each game I use the opponents’ simulated ratings, as adjusted for home field advantage, to determine a simulated game result of win, loss, or tie. When I do this, if a team’s location-adjusted rating advantage over its opponent is big enough that its win likelihood statistically is over 50 percent, I treat the better-rated team as winning. (This is different from real life, where a team that statistically should win sometimes ties and sometimes loses.)
3. After determining all of the simulated game results, I apply the RPI formula to the results, to calculate simulated RPI ratings and ranks for all teams.
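The three steps above can be sketched in code. This is only an illustrative outline, not the author's actual program: the home-field adjustment value and the tie cutoff are assumed placeholders, and `simulate_result` stands in for the statistical win-likelihood rule described in step 2. The RPI weighting in `rpi` is the standard NCAA formula (25 percent winning percentage, 50 percent opponents' winning percentage, 25 percent opponents' opponents' winning percentage).

```python
# Hypothetical sketch of simulation steps 2 and 3.
# HOME_FIELD_BONUS and tie_margin are illustrative assumptions,
# not values taken from the article.

HOME_FIELD_BONUS = 0.0148  # assumed rating adjustment for the home team

def simulate_result(home_rating, away_rating, tie_margin=0.005):
    """Return 'home', 'away', or 'tie' from location-adjusted ratings.

    Per step 2, the better-rated team always wins when its adjusted
    advantage is large enough; tie_margin is an assumed cutoff standing
    in for the 50-percent win-likelihood threshold.
    """
    diff = (home_rating + HOME_FIELD_BONUS) - away_rating
    if diff > tie_margin:
        return "home"
    if diff < -tie_margin:
        return "away"
    return "tie"

def rpi(wp, owp, oowp):
    """Standard NCAA RPI weighting: 25% winning percentage (WP),
    50% opponents' WP, 25% opponents' opponents' WP."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp
```

In step 3, `simulate_result` would be applied to every game on the full season calendar, the resulting wins, losses, and ties would be tallied into each team's WP, OWP, and OOWP, and `rpi` would then produce the simulated ratings to be ranked.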
For this year, here are the Top 60 teams in the simulated rankings I developed in Step 1 above. In a normal year, I do not expect the final actual rankings to match these. In most cases, teams’ simulated rankings will be in the rough vicinity of their final actual rankings. There will be some teams, however, whose final actual rankings are significantly different from their simulated rankings.
This table shows that, for the numbers of teams individual conferences placed in the Top 60 and Top 30, there was some variation between the actual end-of-season RPI rankings and my pre-season simulated rankings. For the highlighted and not-highlighted groups, however, the numbers of teams each group placed in the final actual Top 60 and Top 30 are almost identical. Thus, so far as the highlighted and not-highlighted groups are concerned, my pre-season full-season simulation is a good predictor of how many teams each group will have in the Top 60 and Top 30 of the final actual rankings.
This means it is fair to use my pre-season simulation for 2020-21 as a reasonable indicator of how many teams the highlighted and not-highlighted groups are likely to have in the Top 60 and Top 30 of the actual final rankings.
Here are two tables. The first table shows my 2020 pre-season simulation Top 60, after going through the three simulation process steps described near the top of this article. The second table shows what this means in terms of conferences and the highlighted and not-highlighted groups, comparing the simulated 2020 numbers to the average numbers since 2013.
Top 60 Test: The RPI Top 60 should include roughly 49 teams from the highlighted conferences and 11 teams from the not-highlighted conferences. The actual numbers can range on either side of these test numbers, but 45 teams should be the minimum from the highlighted group and 15 the maximum from the not-highlighted group.
Rather than the average 49-11 split called for by the test, or even the historically most extreme 45-15 split, the simulated split between the highlighted and not-highlighted conferences is 29-31.
Top 30 Test: The RPI Top 30 should include roughly 28 teams from the highlighted conferences and 2 teams from the not-highlighted conferences. The actual numbers can range on either side of these test numbers, but 27 teams should be the minimum from the highlighted group and 3 the maximum from the not-highlighted group.
Rather than the average 28-2 split called for by the test, or even the historically most extreme 27-3 split, the simulated split between the highlighted and not-highlighted conferences is 17-13.
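The two tests amount to a simple range check on the simulated splits. The threshold numbers below come directly from the tests stated above; the checking function itself is a hypothetical illustration, not part of the author's method.

```python
# Illustrative range check for the Top 60 and Top 30 tests.
# Thresholds (45/15 and 27/3) and simulated splits (29-31 and 17-13)
# are the numbers given in the article.

def passes_test(highlighted, not_highlighted,
                min_highlighted, max_not_highlighted):
    """True if the split falls within the test's historical range."""
    return (highlighted >= min_highlighted
            and not_highlighted <= max_not_highlighted)

# Top 60 test: at least 45 highlighted, at most 15 not-highlighted.
top60_ok = passes_test(29, 31, 45, 15)   # the simulated 29-31 split fails

# Top 30 test: at least 27 highlighted, at most 3 not-highlighted.
top30_ok = passes_test(17, 13, 27, 3)    # the simulated 17-13 split fails
```

Both simulated splits fall far outside the historical ranges, which is the point of the two tables above.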