Saturday, December 10, 2016

When to Switch Simulation from Using Pre-Season Assigned Ratings to Using Actual Current Ratings?

As those who followed my 2016 season simulations know, prior to the season I assigned ARPI ratings to all the teams, using Chris Henderson's in-conference rankings of teams together with the ratings that teams so ranked within their conferences had earned over the preceding two years.  I used those assigned ratings as the basis for simulating the entire season's results and ending ratings.  As the season progressed, I substituted actual results for simulated results weekly, while retaining the simulated results for games not yet played.  Thus my weekly simulation reports were hybrids of actual and simulated results.
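In rough code form, the weekly hybrid described above might look like the sketch below. This is a minimal illustration under assumed data structures (the game keys, the `predict_result` helper, and the tie margin are all hypothetical), not the actual procedure used for the simulations.

```python
# Sketch (hypothetical structure): played games keep their actual result;
# unplayed games get a result predicted from the assigned ARPI ratings.

def predict_result(rating_a, rating_b, tie_margin=0.005):
    """Predict W/L/T for team A from an ARPI rating difference.

    The tie margin is an assumption for illustration: ratings within
    it of each other are treated as predicting a tie.
    """
    diff = rating_a - rating_b
    if abs(diff) < tie_margin:
        return "T"
    return "W" if diff > 0 else "L"

def hybrid_results(schedule, actual_results, ratings):
    """Build the week's hybrid of actual and simulated results.

    schedule: list of (team_a, team_b) games for the season.
    actual_results: dict of results for games already played.
    ratings: dict of each team's ARPI rating.
    """
    results = {}
    for game in schedule:
        if game in actual_results:      # game already played: use reality
            results[game] = actual_results[game]
        else:                           # not yet played: simulate it
            a, b = game
            results[game] = predict_result(ratings[a], ratings[b])
    return results
```

As the season progresses, `actual_results` grows and the simulated share of the hybrid shrinks, which matches the weekly substitution process described above.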

At a point during the season, I changed how I simulated results for games not yet played.  Instead of using my assigned pre-season ARPI ratings to simulate future results, I began using each week's then current actual ARPI ratings.

As part of my recent study, discussed in detail in my preceding post, I looked at when it is best to switch from using assigned pre-season ARPI ratings for the weekly simulations to using current actual ARPI ratings.  The following table shows the results of this portion of the study:

Rating System           Correct   Within Half   Correct or Within Half
End of Season            68.9%       18.7%             87.7%
Week 11                  69.2%       18.7%             87.9%
Week 10                  68.8%       18.9%             87.8%
Week 9                   68.6%       18.9%             87.5%
Week 8                   68.1%       18.9%             87.0%
Week 7                   66.8%       18.6%             85.4%
Week 6                   65.4%       19.4%             84.8%
Week 5                   63.8%       19.7%             83.5%
Week 4                   62.5%       18.1%             80.5%
Model for 2017           57.4%       21.9%             79.2%
Prior Year's ARPI        58.3%       19.8%             78.1%
CH Model Using CH        57.9%       20.0%             77.8%
Week 3                   59.7%       16.7%             76.3%

This table shows how well each set of ratings matches up with actual game results over the course of the season -- in other words, how accurate each system is.
  • The Week 3 through Week 11 and End of Season ratings are teams' actual ARPI ratings as of each week throughout the 12-week season.  The Model for 2017 ratings are the ones I've settled on as initial ratings for my 2017 season simulations, as described in my preceding post.  The Prior Year's ARPI ratings, in my study, are the 2015 end-of-season actual ARPI ratings.  The CH Model Using CH ratings are the ones I used for the 2016 season simulation.
  • The "Correct" column shows the % of simulated results that match the actual results, for the entire season.
  • The "Within Half" column shows (1) the % of games simulated as ties that actually were wins/losses, plus (2) the % of games simulated as wins/losses that actually were ties.  In these cases, I consider that the simulated results were within a "half" of the actual result.
  • The "Correct or Within Half" column is the total of the two preceding columns.  The difference between this percentage and 100% is the percentage of games that the rating system got flat out wrong -- a simulated win in fact was a loss, or vice versa.
I arranged the bottom part of the table so that the rating methods are in order based on the percentages of "Correct or Within Half."  I consider this the best measure of how "good" a method is for simulation purposes.
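The three columns can be computed mechanically from paired simulated and actual results. The sketch below is an illustration of the scoring scheme as defined above, not the actual code behind the study; the W/L/T encoding is an assumption.

```python
# Score a set of simulated results against actual results using the three
# categories in the table.  Each result is "W", "L", or "T" for one game,
# from the same team's perspective in both lists.

def score_simulation(simulated, actual):
    """Return (correct, within_half, correct_or_within_half) as fractions."""
    correct = within_half = 0
    for sim, act in zip(simulated, actual):
        if sim == act:
            correct += 1                 # simulated result matched exactly
        elif sim == "T" or act == "T":
            within_half += 1             # tie vs. win/loss: off by a "half"
        # else: simulated win was actually a loss (or vice versa) -- wrong
    n = len(simulated)
    return correct / n, within_half / n, (correct + within_half) / n

sims   = ["W", "T", "L", "W", "W"]
actual = ["W", "W", "L", "T", "L"]
print(score_simulation(sims, actual))  # (0.4, 0.4, 0.8)
```

The gap between "Correct or Within Half" and 100% is exactly the flat-out-wrong share described in the last bullet.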

As the table shows, teams' actual ARPIs are the best ratings to use for simulating future game results starting with Week 4 of the season.  Prior to that, the Model for 2017 ratings are the best -- those being the ones I selected based on the work I described in the preceding post.
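The finding reduces to a simple decision rule for which rating set to feed the simulation each week. The sketch below states that rule; the function and variable names are hypothetical.

```python
# Decision rule suggested by the study: before Week 4, the pre-season
# "Model for 2017" ratings predict best; from Week 4 on, the current
# actual ARPI ratings do.

SWITCH_WEEK = 4

def ratings_for_simulation(week, preseason_ratings, current_arpi):
    """Pick the rating set to use when simulating games not yet played."""
    if week < SWITCH_WEEK:
        return preseason_ratings   # assigned pre-season model ratings
    return current_arpi            # that week's actual ARPI ratings
```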

An interesting piece of "bonus" information from this work is its indication that the accuracy of actual ARPI ratings improves only slightly from Week 8 until the end of the season and, for practical purposes, doesn't improve at all from Week 10 through the end of the season (Week 12).  Thus while teams' ratings may change over this period, any one of those weeks' rating sets is just as good as another in terms of the relationship between ratings and actual results.  To me, this is a surprise.

For visualization purposes, here's a chart of the results in the above table.  The chart's a little fuzzy, so here's a link to a clearer version.  The lower red line is the % of simulated results that are within a "half" of being correct based on actual results; the blue line is the % that are correct; and the upper grey line is the sum of those two percentages.
