Thursday, August 10, 2017

Pre-Season Simulated RPI Ranks and NCAA Tournament Bracket

Now that all the teams' schedules are available, it's possible to provide simulated full-season data, Adjusted RPI (ARPI) ranks, and a simulated NCAA Tournament bracket for the 2017 season.  I'll provide those in the two posts following this one.

I have a big caution about the simulations.  Don't get happy if they show your team doing well and don't get depressed if they show your team doing poorly.  They're pretty far from where the season will end up, and my weekly updated simulations will stay that way until we get well into the season.

Here's how the system works:

First, I assign each team a pre-season ARPI rating.  The pre-season rating is the team's average ARPI rating (using the current ARPI formula) over a number of prior years.  How many years go into the average depends on how long the team's current head coach has been in place: the longer the tenure, the more prior years' ratings go into the average.  To be exact, for teams whose coaches have been in place for 9 or more years as of the 2016 season, the pre-season rating is the team's average ARPI over the last 8 years; for coaches in place for 4 to 8 years, it is the average over the last 6 years; and for coaches in place for 3 or fewer years, it is the average over the last 3 years.  I use this method because a study I did last year indicated that these ratings, as compared to other possible ratings, come the closest to matching actual game results.  That doesn't mean these ratings do a great job, because they don't.  It just means they do a better job than the other rating systems I considered.
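For readers who like to see rules written out, here is a rough Python sketch of the tenure bands described above.  The function names, the oldest-to-newest ordering of the ratings list, and the choice to average whatever seasons exist when a team has fewer than the full window are all assumptions made for illustration, not a description of the actual spreadsheet or program behind these posts.

```python
from typing import Sequence


def averaging_window(coach_tenure_years: int) -> int:
    """Number of prior seasons averaged, following the tenure bands in the post."""
    if coach_tenure_years >= 9:
        return 8
    if coach_tenure_years >= 4:
        return 6
    return 3


def preseason_rating(past_ratings: Sequence[float], coach_tenure_years: int) -> float:
    """Average the most recent seasons' ARPI ratings.

    past_ratings is assumed (hypothetically) to be ordered oldest to newest.
    If fewer seasons exist than the window calls for, this sketch averages
    the seasons that are available; that fallback is an assumption.
    """
    if not past_ratings:
        raise ValueError("need at least one prior season's rating")
    window = averaging_window(coach_tenure_years)
    recent = past_ratings[-window:]
    return sum(recent) / len(recent)
```

For example, a team with a second-year coach would have only its three most recent seasons averaged, no matter how many earlier seasons are in the list.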

Once I have teams' pre-season ratings, for each game I compare the two opponents' pre-season ratings, adjusted for game location, to produce the simulated game result (win, tie, or loss).  I do this for the entire schedule of games to produce simulated game results for the entire season.  The game location adjustment adds 0.0060 to the home team's pre-season ARPI rating and subtracts 0.0060 from the away team's.  (I've done a study that shows that, for the current version of the ARPI, this is the average value of home field advantage.)  Based on the location-adjusted ratings, my system treats the higher rated team as the winner if its rating margin over the opponent is greater than 0.0052.  If the margin is 0.0052 or less, the system treats the game as a tie.  I use 0.0052 as the win/tie cutoff point because when the margin is 0.0052 or less, the higher rated team wins significantly fewer than 50% of the games.  To be precise, the higher rated team wins 42.9% of the time, ties 17.6% of the time, and loses 39.5% of the time, so the games essentially are toss ups.
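The win/tie rule above can be sketched in a few lines of Python.  The 0.0060 and 0.0052 values come straight from the description; the function name, the constant names, and the result labels are invented for this illustration, and the sketch only covers the home/away case described here.

```python
HOME_ADVANTAGE = 0.0060  # added to the home team, subtracted from the away team
TIE_CUTOFF = 0.0052      # location-adjusted margins at or below this count as ties


def simulate_game(home_rating: float, away_rating: float) -> str:
    """Return 'home win', 'away win', or 'tie' from the two teams' ratings."""
    home_adjusted = home_rating + HOME_ADVANTAGE
    away_adjusted = away_rating - HOME_ADVANTAGE
    margin = home_adjusted - away_adjusted
    if margin > TIE_CUTOFF:
        return "home win"
    if margin < -TIE_CUTOFF:
        return "away win"
    return "tie"
```

One consequence worth noticing: the two location adjustments together are worth 0.0120, which is more than the 0.0052 cutoff.  So two teams with identical ratings come out as a home win, and under this rule the away team needs a rating more than 0.0172 higher than the home team's to be treated as the winner.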

Once I have all the game results, I then compute teams' ARPI ratings to obtain the full season's simulated ARPI ratings, ranks, and so on.

As we go through the season, each week I replace my simulated game results with the actual results for that week's games.  Then, starting with the fourth week of the season, I stop using the pre-season ARPI ratings to produce the simulated results of future games and instead use teams' then-current actual ARPI ratings.  I do this beginning in the fourth week because the study I did indicates that is the point at which teams' actual ratings are more likely than the pre-season ratings to match game results.
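As a rough sketch of the weekly update just described, the Python below keeps actual results where they exist and simulates the rest, switching the ratings source at week four.  The data layout (tuples of home team, away team, and an actual result or None), the function names, and the idea of passing in the game-simulation rule as a parameter are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

SWITCH_WEEK = 4  # first week in which actual ratings drive the simulation

# Hypothetical layout: (home team, away team, actual result or None if unplayed)
Game = Tuple[str, str, Optional[str]]


def ratings_for_week(week: int,
                     preseason: Dict[str, float],
                     current: Dict[str, float]) -> Dict[str, float]:
    """Pre-season ratings before week four, then-current actual ratings after."""
    return current if week >= SWITCH_WEEK else preseason


def project_results(games: List[Game],
                    week: int,
                    preseason: Dict[str, float],
                    current: Dict[str, float],
                    simulate: Callable[[float, float], str]) -> List[str]:
    """Keep actual results for played games; simulate the unplayed ones."""
    ratings = ratings_for_week(week, preseason, current)
    projected = []
    for home, away, actual in games:
        if actual is not None:
            projected.append(actual)  # a played game is never re-simulated
        else:
            projected.append(simulate(ratings[home], ratings[away]))
    return projected
```

The `simulate` argument stands in for whatever rule turns two ratings into a result, such as the location-adjusted win/tie comparison described earlier in this post.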

Through the balance of the season, I continue the weekly process of replacing simulated game results with actual results and using teams' updated ARPI ratings as the basis for simulating future game results.  What this means is that each week's simulated end-of-season ARPI ratings will come closer to what teams' actual end-of-season ratings will be.  And, when the regular season including conference tournaments is over, the simulated ratings will be the same as teams' then-current actual ratings.

For the end-of-season conference tournaments, each week I do simulated conference tournament brackets and results, with the seeding in each conference's bracket based on the simulated results of conference regular season games.

Regarding the NCAA Tournament bracket, I take each week's simulated end-of-season results and other data and plug them into a program I've developed that matches certain data with the Women's Soccer Committee's decisions over the last 10 years.  Based on how teams' simulated results match up with those data, I then determine who would get #1, #2, #3, and #4 seeds and at large selections for the NCAA Tournament if the Committee were using the simulated results and following its last 10 years' decision patterns.  I also indicate who are other potential seed and at large selections based on the week's simulation.  As with the ARPI ratings, with each passing week, the bracket simulation will come closer to what will happen if the Committee follows its historic patterns.

Again, don't take the initial simulations seriously.  They have a logical basis, but will produce some results that will look crazy.  I'll explain some of the craziness as we go through the season.  In addition, feel free to post a question or comment if you see something that doesn't look right and I'll see if I can explain why things came out that way.
