Friday, August 17, 2018

2018 PRE-SEASON FULL SEASON SIMULATION: HOW I DO IT

With all of the teams' schedules now in my database, I can do a simulation of the entire upcoming regular season (including conference tournaments).

Below, I'll explain how the simulation works, including the assumptions it uses.

In the next post, I'll show the simulated records, ratings, and rankings for all teams.  In the post after that, I'll show the simulated regular season standings within each conference and the conference tournament results.  And, in the post after that, I'll show the simulated NCAA Tournament seeds and other participants, both automatic qualifiers and at large selections.  I'll also show teams ranked in the Top 60 but not getting at large selections.

Here's how the simulation works:

1.  I assign a pre-season Adjusted RPI to each team.  The computer then determines the result of each game, based on these ratings as adjusted for home field advantage.  Using the entire season's results, the computer then generates simulated end-of-season ARPI ratings and ranks for the teams, based on the NCAA's ARPI formula.

2.  At the end of each conference's regular season, the computer determines the conference ranks of the teams and seeds them into the conference tournament bracket.  It then simulates the conference tournaments and includes those games in the end-of-regular-season ARPI ratings and rankings, so the final ratings cover both regular season and conference tournament games.

3.  For a team's pre-season ARPI, I use a combination of the team's long-term rating trend and its average ratings over a period of years.  I've tried a great number of formulas involving different rating trend periods and average ratings over different periods, and the formula I use produces ratings that, on average, are closest to what the actual ratings turn out to be.  The elements of the formula I use are:

Predicted rating for this year based on a straight line trend of the team's ratings over the last eight years (Trended Rating)

Average rating over the last two years (Average Rating)

The specific formula is:

(Trended Rating + Average Rating)/2

In this formula, the Trended Rating is more of a "Where has the program been headed over the long term?" factor.  The Average Rating is more of a "How has the team done with the current players?" factor, with more consideration given to this year's seniors and juniors and less to this year's sophomores.

The formula gives no consideration to the specific players that graduated or transferred out; and no consideration to the specific incoming freshmen or players transferring in.  This can be a problem for teams with major changes in these areas, but since the simulation is strictly based on past ratings, there's nothing I can do about that.
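
As a rough illustration, the pre-season rating formula above can be sketched in Python.  The function names are mine, and I am assuming here that the "straight line trend" is an ordinary least-squares fit extrapolated one year forward; the formula itself only says the trend is a straight line over the last eight years.

```python
def trended_rating(past_ratings):
    """Fit a least-squares line to past ratings and extrapolate one year ahead.

    ASSUMPTION: a least-squares fit is used for the straight-line trend.
    past_ratings is ordered oldest first.
    """
    n = len(past_ratings)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(past_ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, past_ratings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # x = n is the upcoming season


def preseason_rating(past_ratings):
    """Combine the eight-year trend with the two-year average.

    past_ratings: end-of-season ratings for the last eight years, oldest first.
    """
    if len(past_ratings) != 8:
        raise ValueError("expected exactly eight years of ratings")
    trended = trended_rating(past_ratings)
    average = sum(past_ratings[-2:]) / 2  # last two years only
    return (trended + average) / 2
```

For a hypothetical team whose rating rose steadily from 0.50 to 0.57 over eight years, the trend predicts 0.58, the two-year average is 0.565, and the pre-season rating is the midpoint of those two values.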

4.  For each game, I start with each team's pre-season rating and, for each home/away game, adjust the home team's rating up by 0.0070 and the away team's rating down by 0.0070.  (In other words, home field advantage is worth 0.0140.)  This is the average value of home field for the NCAA's current ARPI formula.  In real life, the value of home field advantage varies from team to team and game to game, but the simulation is not able to take that into account.
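
A minimal sketch of the home field adjustment in Python follows.  The function name and the neutral_site flag are my own illustration; the only values taken from the step above are the 0.0070 adjustments, which apply to home/away games.

```python
HOME_FIELD_ADJUSTMENT = 0.0070  # applied up for the home team, down for the away team


def location_adjusted(home_rating, away_rating, neutral_site=False):
    """Return (home, away) ratings adjusted for game location.

    The neutral_site flag is a hypothetical parameter: the adjustment
    is described only for home/away games, so neutral-site games are
    left unchanged here.
    """
    if neutral_site:
        return home_rating, away_rating
    return (home_rating + HOME_FIELD_ADJUSTMENT,
            away_rating - HOME_FIELD_ADJUSTMENT)
```

With this adjustment, two teams with identical pre-season ratings end up 0.0140 apart when one of them hosts the game.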

5.  Based on the location-adjusted ratings, the computer then determines a win-loss-tie outcome for each game.  If the rating difference between two opponents is greater than 0.0214, then the team with the better rating wins.  If the rating difference is less than or equal to 0.0214, then the game is a tie.  The tie rating difference covers the closest 20% of all games.  I selected the closest 20% because that cutoff point produces simulated results that are closer to the actual results than other cutoff points do.
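
The win-loss-tie rule can be sketched as a small Python function.  The function name and the "a"/"b"/"tie" return labels are just my illustration; the 0.0214 threshold is the one described above, and the inputs are assumed to be ratings that have already been adjusted for location.

```python
TIE_THRESHOLD = 0.0214  # rating differences at or below this are simulated as ties


def simulate_outcome(rating_a, rating_b):
    """Return "a" or "b" for the winner, or "tie", from location-adjusted ratings."""
    diff = rating_a - rating_b
    if abs(diff) <= TIE_THRESHOLD:
        return "tie"
    return "a" if diff > 0 else "b"
```

For example, adjusted ratings of 0.6000 and 0.5700 differ by 0.0300, so the first team wins; adjusted ratings of 0.5800 and 0.5700 differ by only 0.0100, so the game is a tie.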

6.  For conference tournament games in which the simulated results are ties, so that the games go to Kicks from the Mark (KFTM), I look at the then-current ratings of the teams that the simulation has produced, based on all the simulated game results to that point in time.  (NOTE:  This is different from the pre-season simulated rating.)  The team with the better rating advances on KFTM.
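
Here is a rough Python sketch of the KFTM rule.  The function name and the dictionary of current simulated ratings are hypothetical.  The rule above does not cover two teams with exactly equal ratings, so this sketch simply raises an error in that case rather than guessing.

```python
def kftm_advancer(team_a, team_b, current_ratings):
    """Pick the team that advances from a tied conference tournament game.

    current_ratings maps team name to its then-current simulated rating
    (not the pre-season rating).
    """
    rating_a = current_ratings[team_a]
    rating_b = current_ratings[team_b]
    if rating_a == rating_b:
        # ASSUMPTION: the rule does not say how identical ratings are broken.
        raise ValueError("identical ratings: no tiebreak rule specified")
    return team_a if rating_a > rating_b else team_b
```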

7.  As the season progresses, each week I substitute that week's actual game results for the pre-season simulated results.  This means that week-by-week, the simulation comes closer to how the season actually will end up.  After teams have played their week 6 games, I change the ratings I assign to teams, for simulating future results, to their current actual ARPI ratings.  I do this because up until week 6, teams' pre-season simulated ratings do a better job than the actual current ratings at matching up with actual game results.  After teams have played their week 6 games, the current actual ratings do a better job.
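
The weekly update can be sketched in Python along these lines.  The function names, the game-keyed dictionaries, and the weeks_completed counter are all hypothetical; the only rules taken from the step above are that actual results replace simulated ones and that the rating source switches once teams have played their week 6 games.

```python
SWITCH_WEEK = 6  # after week 6 games, actual ARPI replaces the pre-season rating


def rating_for_simulation(team, weeks_completed, preseason_ratings, actual_arpi):
    """Choose the rating used to simulate a team's remaining games."""
    if weeks_completed >= SWITCH_WEEK:
        return actual_arpi[team]
    return preseason_ratings[team]


def merged_results(simulated_results, actual_results):
    """Return season results with actual outcomes overriding simulated ones.

    Both arguments map a game identifier to an outcome; games that have
    not been played yet keep their simulated outcome.
    """
    return {**simulated_results, **actual_results}
```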

Obviously, this simulation system starts out with significant limitations.  These limitations diminish over the course of the season.  So, take it with a very big grain of salt, especially early in the season.
