Wednesday, August 31, 2016

2016 Season Simulation: Important Limitation to Keep in Mind

When you're looking at the 2016 Season Simulation, there's an important limitation to keep in mind.

An underlying assumption in the simulation is that, if Team A's simulated rating is more than 0.0134 higher than Team B's rating, then Team A will win when they play each other. In real life, however, it doesn't work that way. There are significant numbers of upsets. As a result, teams the simulation shows with really good records are likely, on average, to finish with records that aren't that good: they'll end up losing some of the games the simulation says they will win. Likewise, teams the simulation shows with really poor records likely won't have records that poor: they'll likely win some of the games the simulation says they will lose. Similarly, some of the simulated ties will turn out to be win/loss games, and some of the simulated win/loss games will turn out to be ties. On the other hand, teams in the 50-50 win/loss area may well end up right about there, on average: they likely will lose some games they should win, but they also likely will win about the same number of games they should lose.
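To make the rule concrete, here is a minimal sketch of the decision rule as described above. The 0.0134 threshold comes from the simulation; the function name and the sample ratings are made up for illustration, and treating any rating gap inside the threshold as a tie is my assumption about how the simulated ties arise.

```python
# Rating gap above which the simulation awards a win (from the post).
THRESHOLD = 0.0134


def simulated_result(rating_a: float, rating_b: float) -> str:
    """Return the simulated outcome of Team A vs. Team B.

    Assumption: a gap inside the threshold is scored as a tie.
    """
    diff = rating_a - rating_b
    if diff > THRESHOLD:
        return "A wins"
    if diff < -THRESHOLD:
        return "B wins"
    return "tie"


# Hypothetical ratings, for illustration only.
print(simulated_result(0.60, 0.58))    # gap of 0.02, beyond the threshold
print(simulated_result(0.590, 0.585))  # gap of 0.005, inside the threshold
```

Notice there is no probability anywhere in this rule: a team that clears the threshold wins every time, which is exactly why the simulation can't produce upsets.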

On average, this won't affect teams from the power conferences and the stronger teams from the mid-major conferences the same way. Looking, say, at the top 100 teams in the Simulation, the teams from mid-major conferences tend to rely on their own high winning percentages, rather than their strengths of schedule, in achieving their high rankings. Teams from the power conferences, on the other hand, tend to rely on their high strengths of schedule rather than their winning percentages. In other words, in the Simulation, top 100 mid-major teams are going to have a lot more wins relative to losses than top 100 teams from the power conferences. Since the mid-major teams have significantly more Simulation wins than losses, real-life upsets will erode their winning percentages significantly. Since the power conference teams do not have such one-sided Simulation win/loss records, upsets will not erode their winning percentages as much, and their rankings aren't as dependent on their win/loss records anyway. Here's an example of what I mean, from the Week 2 Update, using an assumed upset rate of 25%:

Kent State, #44 in the simulation, has a simulated final record of 20-1-0.  In actuality, their record is likely to be 15.25-5.75-0, which rounds off to 15-6-0.

Kentucky, #45 in the simulation, has a simulated final record of 8-7-5.  In actuality, depending on how I treat simulated ties, Kentucky's actual record is likely to round off to either 8-7-5 or 8-8-4.

Kentucky's strength of schedule is much better than Kent State's.  Thus once the games actually are played, Kent State is likely to drop well below Kentucky in the rankings, instead of being above them.
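The arithmetic behind those adjusted records can be sketched as follows. This is only an illustration: the function name is made up, and it assumes that every simulated win or loss flips with the same 25% upset rate while simulated ties are left alone. A different treatment of ties, which isn't spelled out here, is what would give the 8-8-4 alternative for Kentucky.

```python
def expected_record(wins, losses, ties, upset_rate=0.25):
    """Adjust a simulated W-L-T record for a flat upset rate.

    Assumptions: each simulated win becomes a loss with probability
    upset_rate, each simulated loss becomes a win with the same
    probability, and simulated ties are left unchanged.
    """
    exp_wins = wins * (1 - upset_rate) + losses * upset_rate
    exp_losses = losses * (1 - upset_rate) + wins * upset_rate
    return exp_wins, exp_losses, float(ties)


kent_state = expected_record(20, 1, 0)  # simulated 20-1-0
kentucky = expected_record(8, 7, 5)     # simulated 8-7-5

print(kent_state)  # (15.25, 5.75, 0.0)
print(kentucky)    # (7.75, 7.25, 5.0)
```

Kent State gives up almost five wins because nearly all of its simulated games are wins that can only flip one way. Kentucky's wins and losses are close to balanced, so the flips nearly cancel out and its record barely moves.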

What this means, especially as you look at the early updates of the simulation, is that the simulation rates the stronger mid-major teams significantly higher than they are likely to be rated once they've actually played their games.  So, you're likely to see them dropping down in the rankings, as I do new simulation updates from week to week.  On the other hand, for the power conference teams, especially those in the 30+ ranking area, you're likely to see them stay more in place or even rise in the rankings to fill spaces previously occupied by mid-majors.

There's another simple illustration of this that uses past seasons' experience: The 2016 Season Simulation: Week 2 Update shows 22 Automatic Qualifiers (conference champions) among the top 60 teams in the rankings. Over the past nine years, on the other hand, the average number of Automatic Qualifiers among the top 60 teams has been 15.44, with the highest number in any year being 18. Thus, as actual results occur, some of the currently simulated Automatic Qualifiers in the top 60 (at least four, if this year's count stays within that nine-year high) are going to drop out of the top 60.

