RPI and Bracketology for D1 Women's Soccer Blogspace: SO YOUR CONFERENCE WANTS MORE OF ITS TEAMS IN THE NCAA TOURNAMENT: AS A GROUP, HOW MUCH ATTENTION SHOULD ITS TEAMS PAY TO THE RPI FORMULA? ANSWER: A LOT

"What is the best way to improve a team or conference RPI? The simple, but correct answer to this common question is schedule and beat non-conference teams ranked higher in the RPI."

Jim Wright, NCAA Director of Statistics

Rick Campbell, NCAA Assistant Director of Statistics

In March 2008, the NCAA's Frequently-Asked Questions About the Women's Soccer Rating Percentage Index included the above question and answer. When I look at how teams do their non-conference scheduling today, it seems like this is the advice many follow. But there's a problem: For an individual team scheduling in isolation, the advice seems good. But for a conference whose teams are scheduling as a group, the advice is wrong. This article will explain why.

In my previous two articles, I showed that if a team has NCAA Tournament aspirations, it's very important that the team pay attention to the RPI formula when doing its non-conference scheduling.

In this article, I'll show that if a conference that is currently second tier aspires to have more than one team in the NCAA Tournament, it's very important that the conference's teams, as a group, pay attention to the RPI formula when doing their non-conference scheduling. By "second tier" I mean the top non-Power 5 conferences such as the Big East, American, Ivy, West Coast, Big West, Conference USA, and Colonial conferences.

I'll start with some points about the RPI formula and go from there to some tests I've done.

The RPI Formula.

Element 1. Element 1 of the RPI formula is a team's winning percentage. The effective weight of Element 1, in relation to a team's ultimate rating, is 50%. So, teams' winning percentages are very important.

From a conference perspective, winning percentage breaks into two sets of games: conference games and non-conference games. For conference games, a conference's overall winning percentage always is 0.500 -- for every conference team win there is a matching conference team loss and for every tie by one conference team there is a matching tie by another conference team. Although this seems elementary, it's important: If conferences played only conference games, then so far as Element 1 is concerned, all the conferences would be equal since they all would have a winning percentage of 0.500.
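The 0.500 point can be checked with a short sketch. It assumes, as the paragraph above implies, that a tie counts as half a win in a winning percentage; the team names, the random outcomes, and the function name are made up for illustration.

```python
from itertools import combinations
import random


def conference_winning_pct(teams, seed=0):
    """Play one round-robin with random outcomes and return the
    winning percentage of the conference as a whole."""
    rng = random.Random(seed)
    wins = losses = ties = 0
    for _team_a, _team_b in combinations(teams, 2):
        outcome = rng.choice(["a_wins", "b_wins", "tie"])
        if outcome == "tie":
            ties += 2      # both conference teams record a tie
        else:
            wins += 1      # one conference team's win ...
            losses += 1    # ... is another conference team's loss
    games = wins + losses + ties
    # Assumption: a tie counts as half a win.
    return (wins + 0.5 * ties) / games


teams = [f"Team{i}" for i in range(10)]  # hypothetical 10-team conference
print(conference_winning_pct(teams, seed=1))
print(conference_winning_pct(teams, seed=2))
```

Whatever the seed, the result is 0.5, because each conference game adds the same amount to the wins-plus-half-ties total as it does to half the games-played total.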

Where the conferences distinguish themselves from each other, for Element 1, is in their non-conference games. Looking at the 2018 season, here are the conferences' winning percentages in non-conference games, in order from best to worst:

As you can see, there is a clear separation of the "Power 5" conferences from the other conferences. Why? You might say, "Well, their teams, on average, are better." Although that might (or might not) be true, it isn't the right answer. The right answer is, "They scheduled smarter, in relation to Element 1, than the other conferences."

Take the Ivy, the American, the Big East, and the West Coast conferences, which are the next group in the table. They could have scheduled non-conference opponents so as to have had better non-conference winning percentages, simply by scheduling weaker opponents than they actually scheduled. They just didn't do it. Put bluntly, they scheduled less smartly, in relation to Element 1, than the Power 5 conferences.

Element 2. Element 2 of the RPI is the average of my opponents' winning percentages (against teams other than me -- which is an RPI detail you can ignore). The effective weight of Element 2, in relation to a team's ultimate rating, is 40%. So, my opponents' winning percentages also are very important.

Here again, I need to consider my conference opponents as one group and my non-conference opponents as another.

Suppose I'm a Big Twelve team. Then looking at the above table, on average for each of my 9 conference games my Big Twelve opponent is going to contribute a 0.6944 non-conference winning percentage, merged together with its conference winning percentage, to my Element 2. On the other hand, suppose I'm a West Coast Conference team. Then on average for each of my 9 conference games my opponent is going to contribute only a 0.5550 non-conference winning percentage, merged together with its conference winning percentage, to my Element 2. Simply put, my West Coast Conference compatriots didn't schedule as smartly, in relation to my Element 2, as my alter ego's Big Twelve conference compatriots. And, let's be clear, those games' contributions to my Element 2 are going to be about half of all the Element 2 contributions I'll receive, so they matter a lot.
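To put rough numbers on "merged together," here is a small sketch of the arithmetic. The 0.6944 and 0.5550 figures and the 9 conference games come from the text above. The 9 non-conference games per opponent and the 0.500 conference record for an average opponent are assumptions made only for illustration, and the sketch ignores the rule that games against me are excluded.

```python
def overall_winning_pct(nonconf_pct, nonconf_games, conf_pct, conf_games):
    """Blend a non-conference and a conference winning percentage,
    weighting each by the number of games played."""
    total_games = nonconf_games + conf_games
    return (nonconf_pct * nonconf_games + conf_pct * conf_games) / total_games


# Assumed: 9 non-conference games and an average 0.500 conference record.
big_twelve_opponent = overall_winning_pct(0.6944, 9, 0.500, 9)
west_coast_opponent = overall_winning_pct(0.5550, 9, 0.500, 9)

print(round(big_twelve_opponent, 4))
print(round(west_coast_opponent, 4))
print(round(big_twelve_opponent - west_coast_opponent, 4))
```

Under these assumptions the average Big Twelve conference opponent brings roughly 0.597 to my Element 2 and the average West Coast opponent roughly 0.528, a gap of about 0.07 on every conference game.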

So, I want my fellow conference teams to schedule smartly, in relation to my Element 2. Then I have to be willing to do the same for them. In other words, I have to schedule so as to maximize my non-conference winning percentage.

If I want to schedule to maximize my non-conference winning percentage, then I need to identify a list of opponents I'm confident I can beat. But, it doesn't end there. I also want my non-conference opponents to make decent contributions to my Element 2. As I showed in the preceding two articles, teams in the same ranking area can make quite different contributions to their opponents' Element 2s. So, from my list of opponents I'm confident I can beat, I need to pick the ones that will make the best contributions to my Element 2 (rather than the ones that are the best ranked). I then will have (1) helped my conference compatriots by maximizing my non-conference winning percentage and (2) helped myself by playing opponents from my list that are the best contributors to my own Element 2.

Element 3. Element 3 of the RPI is the average of my opponents' opponents' winning percentages. The effective weight of Element 3, in relation to a team's ultimate rating, is 10%. So, Element 3 matters some, but not a lot.

There is one feature of Element 3, however, that is good to consider. If I'm Texas, and I play Oklahoma, then all of Oklahoma's opponents contribute to my Element 3. And, Oklahoma's opponents include all of my other Big Twelve compatriots. So, just as I want my Big Twelve compatriots to make good contributions to my Element 2 when I play them, I want them to make good contributions to my Element 3 via their games with the other Big Twelve teams that I will play.
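The three elements as described above can be sketched in a few lines. This is a minimal illustration, not the NCAA's code. It assumes a tie counts as half a win, that an opponent met twice counts twice in the averages, and that Element 3 is the average of my opponents' Element 2 values. The four teams and five results are invented.

```python
def winning_pct(team, games, exclude=None):
    """Winning percentage of `team`, optionally ignoring games against `exclude`."""
    points = 0.0
    played = 0
    for team_a, team_b, result in games:
        if team not in (team_a, team_b):
            continue
        opponent = team_b if team == team_a else team_a
        if opponent == exclude:
            continue
        played += 1
        if result == "tie":
            points += 0.5  # assumption: a tie is half a win
        elif (result == "a") == (team == team_a):
            points += 1.0  # the named winner is this team
    return points / played if played else 0.0


def opponents(team, games):
    """Opponents of `team`, one entry per game played."""
    return [b if a == team else a for a, b, _ in games if team in (a, b)]


def element1(team, games):
    return winning_pct(team, games)


def element2(team, games):
    opps = opponents(team, games)
    if not opps:
        return 0.0
    # Opponents' winning percentages against teams other than `team`.
    return sum(winning_pct(o, games, exclude=team) for o in opps) / len(opps)


def element3(team, games):
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(element2(o, games) for o in opps) / len(opps)


# Invented results: (team_a, team_b, "a" | "b" | "tie")
games = [("A", "B", "a"), ("B", "C", "tie"), ("C", "A", "b"),
         ("A", "D", "a"), ("B", "D", "a")]
print(element1("A", games), element2("A", games), element3("A", games))
```

In this toy schedule, team A's Element 3 depends on how B, C, and D did against one another, which is the same mechanism by which Oklahoma's games against other Big Twelve teams feed Texas's Element 3.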

The Tests.

OK, so what I've said above is that, from an RPI perspective, I want all the teams in my conference to schedule so that they're highly likely to win each of their non-conference games. And, for their non-conference opponents, from the list of teams they're highly likely to beat, I want them to pick the teams that will make the best contributions to their Element 2s.

It sounds good to me in theory, but would this scheduling system really work? I conducted two tests, with the Big East as my guinea pig conference. The tests show that it would improve the Big East teams' RPI rankings. In fact, the amount of improvement is stunning.

Test 1

I used the entire Division I 2018 games and results data base for the test, with one exception. The exception is that I changed the opponents for all of the Big East teams. Here's how I changed them:

1. For each team, I started with its actual 2018 ARPI rating. I know from studies I've done that if my team's ARPI rating is roughly 0.0604 better than my opponent's rating (with the ratings adjusted for home field advantage), then my team will win roughly 75% of the time and as the rating difference increases, my winning percentage will increase. So, I subtracted 0.0604 from each team's ARPI rating and identified the pool of all teams with ratings poorer than that number as the team's potential non-conference opponents. I did this to ensure that each team's chance of winning each non-conference game would be 75% or better.

2. Thinking of myself as the scheduler, I then went through a process for my team, to see which teams from my pool would be the best to play from an RPI strength of schedule contribution perspective. I know the rankings of all teams as contributors to their opponents' strengths of schedule, so for my team's pool of potential opponents I put the teams in order from the best strength of schedule contributor to the worst. Big East teams ordinarily play 9 non-conference games, so ideally I would pick the 9 best strength of schedule contributors as my team's non-conference opponents. I wanted the test to be realistic, however, and simply picking the 9 best contributors wouldn't have been geographically realistic, so instead of the 9 best contributors I picked the 9 best that looked geographically reasonable. I did this for each Big East team. It was easier for some teams than others, due to their geographic locations. At the end of this step, I had a full set of 9 non-conference opponents for each team.

3. With my list of 9 non-conference opponents for each Big East team, I then compared each Big East team's ARPI rating to each of its opponents' ratings to determine the Big East team's likelihood of winning each of its non-conference games. Remember, for the opponents I selected, the win likelihood always is 75% or better. With these likelihoods I then determined the average Big East win/loss/tie likelihood for all 90 of the non-conference games. The result was that I could expect that each team, from its non-conference schedule, would win 80%, lose 10%, and tie 10%. This, of course, didn't fit well with 9 non-conference games, so I decided to be conservative: I would set the game results so that each Big East team won 7 of its non-conference games, lost 1, and tied 1.

4. With each team's non-conference opponents and game results in hand, I then deleted all of the teams' actual 2018 non-conference games from the 2018 schedule and replaced them with my test non-conference games and results.

5. Now came the fun part: I gave Excel a Calculate command.
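Steps 1 through 3 can be sketched roughly as follows. This is an illustration of the selection logic, not the spreadsheet I used. The `Candidate` fields, the function names, and the yes/no flag standing in for my geographic judgment are hypothetical. Ratings are assumed to be already adjusted for home field advantage.

```python
from dataclasses import dataclass

RATING_MARGIN = 0.0604  # rating gap at which the better team wins roughly 75%
GAMES_NEEDED = 9        # non-conference games per Big East team


@dataclass(frozen=True)
class Candidate:
    name: str
    arpi_rating: float          # assumed adjusted for home field
    sos_contributor_rank: int   # 1 = best strength of schedule contributor
    geographically_reasonable: bool  # hypothetical stand-in for judgment


def pick_opponents(my_rating, candidates,
                   games_needed=GAMES_NEEDED, margin=RATING_MARGIN):
    """Step 1: keep teams rated poorer than my rating minus the margin.
    Step 2: order them by contributor rank, then keep the best ones
    that are geographically reasonable."""
    pool = [c for c in candidates if c.arpi_rating < my_rating - margin]
    pool.sort(key=lambda c: c.sos_contributor_rank)
    return [c for c in pool if c.geographically_reasonable][:games_needed]


def expected_counts(games, win_p, loss_p, tie_p):
    """Step 3: expected wins, losses, and ties over a number of games."""
    return games * win_p, games * loss_p, games * tie_p


# Made-up candidates for a team rated 0.6200.
candidates = [
    Candidate("X", 0.5000, 3, True),
    Candidate("Y", 0.5800, 1, True),    # too strong: inside the margin
    Candidate("Z", 0.4500, 2, False),   # good contributor, too far away
    Candidate("W", 0.4000, 5, True),
]
print([c.name for c in pick_opponents(0.6200, candidates, games_needed=2)])
print(expected_counts(9, 0.80, 0.10, 0.10))
```

The expected counts for 9 games come out near 7.2 wins, 0.9 losses, and 0.9 ties, which is why a 7-1-1 record is the conservative whole-number choice.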

With these Big East schedule changes, here are the top nine conferences' winning percentages in non-conference games, in order from best to worst:

The Big East now has the best conference winning percentage. So, what does this do for its teams' rankings?

For the actual 2018 season, here is what the Big East teams' ARPI rankings and their rankings as strength of schedule contributors were:

With their substituted schedules, here are what the teams' numbers would be:

As I said, the results of this changed approach to non-conference scheduling are stunning. The Big East's average ARPI rank jumped 45 positions and its average rank as a strength of schedule contributor jumped 73 positions. It went from three Top 60 teams to seven. It went from one sure bet team for an NCAA Tournament berth plus two bubble teams to a likely six in the Tournament plus one bubble team.

Test 2 Method

Test 1 is a "looking backwards and asking 'What if we had done this?'" kind of test. Test 2 is a "looking forward and asking 'What if we do this?'" test.

In this test, I went through the same process as I did for Clemson as described in the article "So You Want an At Large Position in the NCAA Tournament: How Much Attention Should You Pay to the RPI Formula, In Your Non-Conference Scheduling? Answer: A Lot," which is two articles above this one. I won't repeat the explanation of the method here, other than to say:

It uses the 2018 games data base (but not the 2018 results).
It uses my assigned 2019 simulated ratings to determine game outcomes.
I replaced all of the Big East non-conference opponents with the same opponents I used in Test 1.

The way this test works, the system treats all games where the location-adjusted rating difference between teams is less than 0.0150 as ties. It treats all other games as wins by the better rated team. It then applies the ARPI formula to all of the game results.
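The outcome rule just described is simple enough to sketch directly. The function name and the "a"/"b"/"tie" labels are hypothetical, and the ratings passed in are assumed to be already adjusted for game location.

```python
TIE_THRESHOLD = 0.0150  # rating differences smaller than this are treated as ties


def simulated_result(rating_a, rating_b, threshold=TIE_THRESHOLD):
    """Return "tie" when the location-adjusted ratings are within the
    threshold, otherwise the label of the better rated team."""
    difference = rating_a - rating_b
    if abs(difference) < threshold:
        return "tie"
    return "a" if difference > 0 else "b"


# Made-up ratings for illustration.
print(simulated_result(0.6000, 0.5900))  # gap of 0.0100
print(simulated_result(0.6000, 0.5500))  # gap of 0.0500
print(simulated_result(0.5000, 0.5800))  # gap of 0.0800 the other way
```

Because every game outside the tie band goes to the better rated team, this rule has no upsets, which is one reason the Test 2 results should be read as a projection rather than a prediction.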

Here are the Big East teams' ARPI rankings under this test:

These results are in the same ballpark as in Test 1, but show the Big East's average ARPI rank as 10 positions better than in Test 1.

Conclusion.

If a conference's teams, as a group, are determined to maximize their ARPI ranks, then they need to base their non-conference scheduling on how the RPI formula works. Each conference team would schedule opponents it is highly likely to beat and, of the potential opponents that meet that criterion, it would pick those likely to be the best strength of schedule contributors under the RPI formula.

As probably is obvious, this means the conference would schedule non-conference opponents in a way that is very different from what it's done in the past. Whether a conference would want to do this, once it sees what its list of non-conference opponents looks like, is something I can't answer. There might be criticisms that it "isn't right" for a conference's teams to systematically schedule only non-conference opponents they can beat.

On the other hand, if the name of the game is getting your conference's teams into the NCAA Tournament, the NCAA, in adopting the RPI as the rating system the Women's Soccer Committee must use, has defined the rules of the game. Those rules say this is how you should schedule. If critics don't like it, the correct response is to change the rules of the game, which would mean replacing the RPI with something better.

Saturday, February 23, 2019
