Wednesday, July 26, 2017

How Many Top 60 Opponents Should We Schedule, for NCAA Tournament Purposes?

While most attention is on the 2017 season, I know that coaches are thinking about schedules throughout the year.  So, this is a little diversion from next month's first-of-the-season games to provide some information about how many Top 60 opponents a coach should try to schedule in order to achieve his or her NCAA Tournament seeding or at large selection goal.

To develop this information, I looked at all of the teams that received #1, #2, #3, and #4 seeds over the last 10 years, at all the teams that were not seeded but received at large selections, and at all the teams from the Top 60 that did not receive at large selections.  For each group, I determined the number of Top 60 teams they had played, using the end-of-regular-season (including conference tournaments) ARPI rankings (based on the current ARPI formula).  Here's what the numbers show:

Unseeded Teams That Received At Large Selections

3 teams @ 14 Top 60 Opponents (includes opponents played in conference tournaments)
11 @ 13
19 @ 12
32 @ 11
37 @ 10
39 @ 9
27 @ 8
38 @ 7
12 @ 6
12 @ 5
5 @ 4
1 @ 3
1 @ 2

The average number of Top 60 opponents, for the entire group, is 8.97, or essentially 9 Top 60 opponents.  For the teams in the Top 60 that did not get at large selections, the average was 6.41, or essentially 6.5 Top 60 opponents.  A point worth noting is that teams with fewer than 5 Top 60 opponents have only a very small chance of getting an at large selection.  And, teams with fewer than 7 Top 60 opponents have only about a 10% chance of getting an at large selection.  So, perhaps a reasonable rule of thumb would be to try to schedule so as to have at least the average that got at large selections -- 9 Top 60 opponents.
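For anyone who wants to check the arithmetic, the average can be reproduced from the distribution listed above.  What follows is only a minimal Python sketch: the dictionary and function names are illustrative labels, and the counts are copied straight from the list.

```python
# Counts copied from the list above: {Top 60 opponents played: number of teams}.
# The names used here are illustrative, not part of any official tool.
AT_LARGE_UNSEEDED = {
    14: 3, 13: 11, 12: 19, 11: 32, 10: 37, 9: 39, 8: 27,
    7: 38, 6: 12, 5: 12, 4: 5, 3: 1, 2: 1,
}


def weighted_average(distribution):
    """Return (average opponents per team, total teams) for a count table."""
    total_teams = sum(distribution.values())
    total_opponents = sum(opps * teams for opps, teams in distribution.items())
    return total_opponents / total_teams, total_teams


average, teams = weighted_average(AT_LARGE_UNSEEDED)
print(f"{teams} teams, average {average:.2f} Top 60 opponents")
```

Run as written, this reports 237 teams and an average of 8.97.  The same function can be applied to any of the other count lists in this post by swapping in a different dictionary.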

Also, from an historical perspective, teams with fewer than 2 Top 60 opponents do not get any at large selections.  And, one Top 60 team that played 14 Top 60 opponents did not get an at large selection, but all Top 60 teams with 15 or more Top 60 opponents not only got at large selections, they also got at least a #4 seed.

Teams That Received #1 Seeds

2 @ 18
3 @ 16
6 @ 15
2 @ 14
9 @ 13
4 @ 12
9 @ 11
2 @ 10
2 @ 9
1 @ 8

The average for #1 seeds is 12.80 or, essentially, 13 Top 60 opponents.  Interestingly, over the last 10 years all teams in the Top 60 that played 18 Top 60 opponents ("all" being 2 teams) received #1 seeds.  And, playing 7 or fewer Top 60 opponents has not been enough to get a #1 seed.

Teams That Received #2 Seeds

2 @ 16
1 @ 15
3 @ 14
7 @ 13
3 @ 12
6 @ 11
6 @ 10
6 @ 9
2 @ 8
3 @ 7
1 @ 6

The average for #2 seeds is 11.89 or, essentially, 12 Top 60 opponents.  All teams that played at least 16 Top 60 opponents received either a #1 or a #2 seed.  And, playing 5 or fewer Top 60 opponents has not been enough to get a #2 seed.

Teams That Received #3 Seeds

1 @ 15
1 @ 14
4 @ 13
10 @ 12
10 @ 11
7 @ 10
3 @ 9
3 @ 8
1 @ 5

The average for #3 seeds is 10.93 or, essentially, 11 Top 60 opponents.  Playing 4 or fewer Top 60 opponents has not been enough to get a #3 seed.

Teams That Received #4 Seeds

2 @ 15
1 @ 14
3 @ 13
5 @ 12
7 @ 11
8 @ 10
5 @ 9
5 @ 8
2 @ 7
2 @ 4

The average for #4 seeds is 10.18 or, essentially, 10 Top 60 opponents.  Playing 3 or fewer Top 60 opponents has not been enough to get any seed.

Thursday, April 27, 2017

Comparing the NCAA's 2015 ARPI and the 5 Iteration ARPI, 2009 BPs to the Committee's Actual Decisions

In the preceding post, I provided information about changes to the RPI that I've recommended to the Women's Soccer Committee.  The change is from the NCAA's current 2015 ARPI to what I call the 5 Iteration ARPI with the 2009 Bonus and Penalty regime.  As I discussed, the 5 Iteration ARPI, 2009 BPs system performs far better than the 2015 ARPI.

To add more information, I decided to compare the Committee's decisions on at large selections and seeds over the last 10 years to what the rankings would have been over those 10 years for the 2015 ARPI and for the 5 Iteration ARPI, 2009 BPs systems.  The purpose of this is to see which system's ratings match better with the Committee's actual decisions.

To do the comparison, I determined, for each rating system, the average rank of the teams to which the Committee gave at large selections, the average rank of the teams in the Top 60 (using the 2015 ARPI's ratings to determine the Top 60) to which the Committee denied at large selections, the average ranks of the teams to which the Committee gave #1, #2, #3, and #4 seeds respectively, and the average rank of the 16 seeded teams as a group.  The following table shows the results of the comparison:


Starting with the at large selections, the 5 Iteration ARPI, 2009 BPs ranked the teams to which the Committee gave selections 1.46 positions better than the 2015 ARPI.  What this means is that the Committee's at large selections matched better with the 5 Iteration version than with the NCAA's current version.

Moving on to the teams to which the Committee denied at large selections, the 5 Iteration version ranked those teams 3.73 positions more poorly than the 2015 ARPI.  Again, this means the Committee's decisions -- this time at large rejections -- matched better with the 5 Iteration version than with the NCAA's current version.

For seeds, the Committee's decisions on the #1 and #4 seeds matched slightly better with the 5 Iteration version than with the NCAA's current version.  For the #2 and #3 seeds, on the other hand, the Committee's decisions matched slightly better with the NCAA's current version.  When looking at seeds as a whole group, the Committee's decisions match very slightly better with the 5 Iteration version.

Looking particularly at the Committee's at large decisions, and assuming that those decisions in most cases were the right ones, the above numbers show that the 5 Iteration version's rankings come closer to being the right rankings than the rankings of the NCAA's current version.  They certainly come closer to the decisions the Committee believes are the right decisions.
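The comparison method described in this post can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the team labels, the ranks, and the selected/denied groups below are all made up, and the function names are hypothetical.  A real comparison would substitute each system's actual end-of-season ranks and the Committee's actual decisions.

```python
from statistics import mean

# Hypothetical ranks for five made-up teams under two rating systems.
ranks_current = {"A": 20, "B": 35, "C": 41, "D": 48, "E": 52}
ranks_proposed = {"A": 18, "B": 33, "C": 40, "D": 53, "E": 55}

selected = ["A", "B", "C"]  # teams given at large selections (made up)
denied = ["D", "E"]         # Top 60 teams denied selections (made up)


def average_rank(ranks, teams):
    """Average rank a rating system assigns to a group of teams."""
    return mean(ranks[team] for team in teams)


def compare(group):
    """Proposed-system average rank minus current-system average rank."""
    return average_rank(ranks_proposed, group) - average_rank(ranks_current, group)


# Negative: the proposed system ranks the group better (closer to #1).
print("selected:", compare(selected))
# Positive: the proposed system ranks the group more poorly.
print("denied:", compare(denied))
```

A system matches the Committee's decisions better when it gives the selected teams a better (lower) average rank and the denied teams a poorer (higher) average rank, which is the pattern in this made-up data.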

Tuesday, April 25, 2017

Proposal to Women's Soccer Committee for Changed RPI Formula

Recently, I gave the Women's Soccer Committee a proposal to change the RPI formula to address some of the current formula's major problems.  The proposal is to change to what I call the 5 Iteration Adjusted RPI using the 2009 Bonus and Penalty Points regime.  I'm not going to go into a detailed discussion of the proposed new formula here.  For details on the proposed new formula, you can go to the "RPI: Modified RPI?" page of the RPI for Division I Women's Soccer website.

As I show in detail at the "RPI: Modified RPI?" webpage, the 5 Iteration ARPI formula provides ratings that are at least as consistent with game results as the NCAA's ARPI versions.  More important:
  • The 5 Iteration ARPI rates conferences more accurately in relation to each other (1) in terms of general fairness and (2) in relation to conference strength.  In fact, the 5 Iteration ARPI eliminates the NCAA ARPI's biases in relation to conference strength.
  • The 5 Iteration ARPI rates the regional playing pools more accurately in relation to each other (1) in terms of general fairness and (2) in relation to region strength.  It doesn't eliminate the biases in relation to region strength of the NCAA's ARPI, but it significantly reduces the biases.
  • The 5 Iteration ARPI, for practical purposes, eliminates the disconnect that the NCAA's ARPI versions have between teams' ARPI ranks and their ranks as contributors to opponents' strengths of schedule.  Thus the 5 Iteration ARPI will eliminate the incentive and need for coaches of potential bubble teams to try to "game" the system in their scheduling of non-conference opponents.  Under the 5 Iteration ARPI, an opponent's value in terms of contribution to your strength of schedule will be roughly the same as its actual rank.  This is in contrast to the NCAA's RPI versions, where an opponent's value in terms of contribution to your strength of schedule can be very different from its actual rank.
Again, I cover all of this in detail at the "RPI: Modified RPI?" page.

For those of you who are coaches who have followed my work on the RPI, if you agree that a change to the proposed new formula would be a good idea, it will be very helpful if you let any contact you have on the Women's Soccer Committee know you think the Committee should take a careful look at my proposed change.  This could be particularly helpful as, driven by basketball, the NCAA is in the process of taking a careful look at the RPI as well as at other rating formulas.  Thus there is an opening for changes now that has not been there in the past.

Here are the current Committee members, with email addresses:

Karen Hancock, Oklahoma State: karen.hancock@okstate.edu

Janet Oberle, Saint Louis: oberlejl@slu.edu

Janet Rayfield, Illinois: rayfield@illinois.edu

Mick D'Arcy, Central Connecticut: darcym@ccsu.edu

Foti Mellis, California: fmellis@berkeley.edu

John McElwain, Sun Belt Conf: mcelwain@sunbeltsports.org

Tony da Luz, Wake Forest: daluz@wfu.edu

Shawn Farrell, Seattle: farrells@seattleu.edu

Chad Miller, Western Carolina: millerc@email.wcu.edu

Stephanie Ransom, Georgia: sransom@sports.uga.edu



Saturday, January 21, 2017

NCAA Tournament Bracket Formation: Personal Thoughts on the Factors Most Important to Decision-Making

In the previous group of posts, I've provided data on, and some observations about, what appear to be the most important factors influencing the Women's Soccer Committee's decisions on NCAA Tournament at large selections and seeds.  I always like to provide the data so that others who are interested can review the data and reach their own conclusions on what the data mean.  I also have my own thoughts on what the data mean, so here they are:

1.  Not surprisingly and certainly not a new observation, teams' ARPI Ratings and Ranks are key factors in the decision-making process.  For one thing, they create a superstructure within which all decisions are made:
  • Teams with ARPI rankings of #58 or poorer do not get at large selections;
  • Teams ranked #30 or better appear secure in getting at large selections, although over the last few years it has become less clear where the "inner boundary" for protected teams is;
  • Teams must be ranked #26 or better to get a seed;
  • Teams must be ranked #19 or better to get a #3 seed;
  • Teams must be ranked #13 or better to get a #2 seed;
  • Teams must be ranked #7 or better to get a #1 seed.
In addition, many of the other factors that appear powerful in the Committee's decision-making are paired factors, with one element of the pair being a team's ARPI Rating or its ARPI Rank.
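The rank boundaries listed above can be written as a simple lookup.  The following Python sketch is only an illustration: the names are hypothetical, and the boundaries describe what the Committee has done historically, not official rules.

```python
# Historical ARPI rank boundaries copied from the list above:
# (poorest rank that has received the outcome, outcome).
BOUNDARIES = [
    (7, "#1 seed"),
    (13, "#2 seed"),
    (19, "#3 seed"),
    (26, "any seed"),
    (57, "at large selection"),  # teams ranked #58 or poorer have not been selected
]


def outcomes_in_reach(arpi_rank):
    """List the outcomes that teams at this ARPI rank have historically received."""
    return [label for poorest_rank, label in BOUNDARIES if arpi_rank <= poorest_rank]


def appears_secure(arpi_rank):
    """Teams ranked #30 or better have appeared secure for an at large selection."""
    return arpi_rank <= 30


print(outcomes_in_reach(20))
print(appears_secure(20))
```

For a team ranked #20, for example, this lists "any seed" and "at large selection" as historically in reach and reports the team as appearing secure for an at large selection.  Being inside a boundary only means the outcome has been possible at that rank; the other factors discussed below determine what a team actually gets.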

2.  For at large selections, teams' Top 50 Results Scores and/or Ranks, using a scoring system highly weighted towards good results against very good teams, are the most important factor when paired with the ARPI.  This is not new information, but my current update work confirms it.  This factor is important not only for identifying the teams that do get at large selections but also for identifying the teams that don't.

3.  For at large selections, a second important factor is teams' Top 60 Common Opponent Results Scores and/or Ranks.  This year's update is the first time it's become clear to me how important this factor is in determining which teams do get at large selections.  It's not as important for identifying which teams don't get selections.

4.  For at large selections, the Non-Conference ARPI (ANCRPI) Ratings and Ranks also are important.  This is the first time I've seen this so clearly.

5.  For at large selections, although a team's Conference Rating and/or Conference Rank do not appear to be highly important for identifying teams that do get at large selections, when paired with the ARPI they are an important factor for identifying teams that don't get at large selections.  There are a couple of possible explanations for this.  The more cynical explanation, which I doubt is the case, is that the Committee is biased in favor of the strongest conferences.  The other explanation, which I believe is more likely, is that the Conference Rating and/or Rank factor pattern reflects a difficult reality for teams from all but the strongest conferences.  Since their conference schedules are weaker than the conference schedules of teams from the strongest conferences, the teams from the weaker conferences tend to play weaker schedules.  With good results against Top 50 teams, represented by the Top 50 Results Score and Rank factors, as a major part of the decision-making, followed by Top 60 Common Opponent Results Scores and/or Ranks, most teams from weaker conferences have fewer opportunities to score well on those factors.  Thus their Conference ARPI/Rank probably are indicators of that problem.  This is why, for a long time, I have emphasized how important it is for teams from mid-majors that want to be successful in the competition for at large selections to schedule a lot of very strong opponents for the non-conference parts of their schedules.  The West Coast Conference is an example of a mid-major that does this and has been successful in getting at large selections.  The Ivy League, unfortunately, is an example of a mid-major that doesn't do it and has been pretty unsuccessful in getting at large selections.

6.  For #1, #2, and #4 seeds, teams' Top 60 Common Opponent Scores/Ranks paired with teams' ARPI Ratings/Ranks are the most important factors in determining which teams do and don't get those seed positions.  This is not that surprising, as the teams in contention for seeds typically have significant numbers of games against each other, so a good measure for seeding purposes is how a team has done against the whole pool of seed competitors.

7.  For #3 seeds, teams' Conference ARPI/Rank is important.  It appears difficult for the Committee to decide, for teams it thinks should be seeded, between #3 and #4 seeds.  The importance of Conference ARPI/Rank suggests that the Committee, in deciding which teams should receive #3 seeds, has a tendency to default to teams from the strongest conferences.

8.  For the #4 seeds, teams' ANCRPI Ratings/Ranks also are important.

9.  Although teams' Top 50 Opponents Scores/Ranks are not particularly important in deciding which teams get seeds, they are important in deciding which teams do not get #2, #3, and #4 seeds.

Overall, my conclusion is that fans and coaches, if they want to know their teams' NCAA Tournament seed and at large selection prospects, should be paying particular attention to (1) their ARPI Ratings and Ranks, (2) their Top 50 Results Scores and Ranks, and (3) their Top 60 Common Opponent Scores and Ranks.  Secondarily, they should pay attention to teams' Conference ARPIs and Ranks and their ANCRPI Ratings and Ranks.

On the other hand, the data suggest that teams' Head to Head Results and Last 8 Games Results are not as important.  It is possible that the relative unimportance of Head to Head Results represents a judgment that one game's result is not a reliable basis for decision-making.

Friday, January 20, 2017

NCAA Tournament Bracket Formation: Most Important Factors for #4 Seeds

Continuing with my reports on the most important factors in the Women's Soccer Committee's decision-making on NCAA Tournament at large selections and seeds, here is a table showing the most important factors in deciding that "yes," a team gets a #4 seed:

Factor Yes 4 Seed
ARPI Rating and Top 60 CO 9
ANCRPI Rating and CO Score 8
ARPI Rating and Top 60 CO Rank 7
ARPI Rank and ANCRPI Rank 7
ANCRPI Rating and Last 8 Games (Poor Results) 5
CO Score and Last 8 Games (Poor Results) 5
ARPI Rank and Top 60 CO Score 5
ARPI 4
ARPI Rating and ANCRPI Rating 4
ARPI Rank and Top 60 CO Rank 4
ANCRPI Rank and CO Rank 4
ANCRPI Rating and CO Rank 4
CO Rank and Last 8 Games (Poor Results) 4
Conference Rank and CO Score 3
ANCRPI Rank and CO Score 3
ARPI Rank and ANCRPI Rating 3
ARPI Rating and ANCRPI Rank 2
ARPI Rank and Conference Rank 2
ARPI Rating and Last 8 Games (Poor Results) 2
Conference ARPI and HTH Score 2
CO Score and CO Rank 2
ANCRPI Rating and Conference Standing 2
Top 60 CO Score 2
Top 60 CO Rank 2
Conference Standing and Top 60 CO Rank 2
ANCRPI Rank and Conference Standing 2

This table shows the 26 most important factors in determining which teams get #4 seeds.  Here, various pairings of the ARPI, Top 60 Common Opponent results, and the Adjusted Non-Conference ARPI are the most important factors.  Looking through the list, this is the area in which the ANCRPI is at its most relevant.  Since the NCAA considers the ANCRPI to be more indicative of conference strength than the ARPI (if nevertheless less accurate), it may be that the ANCRPI comes into play here as one way of including conference strength in the consideration of which teams should receive at least some seed.

Also of note here, the factor patterns are more able to identify teams to receive #4 seeds than they are #3s.  As mentioned in my post on the #3 seeds, this is consistent with my personal observations that the Committee appears to have trouble distinguishing which teams should receive #3 seeds as compared to #4s.

The following table shows the 25 most important factors in determining which teams do not get #4 seeds:

Factor No 4 Seed
ARPI Rating and Top 60 CO Rank 367
ARPI Rating and Top 50 Results Rank 357
ARPI Rating 339
ARPI Rank and Rating 339
ARPI Rank 337
ARPI Rating and Top 50 Results Score 324
ARPI Rating and ANCRPI Rank 316
ARPI Rating and Top 60 CO 311
ARPI Rank and Top 50 Results Rank 273
ARPI Rating and Top 60 HTH 267
ARPI Rank and Top 60 CO Rank 260
ARPI Rating and Last 8 Games (Poor Results) 260
Top 50 Results Rank and CO Rank 258
HTH Score and Last 8 Games (Poor Results) 247
CO Score and Last 8 Games (Poor Results) 232
ARPI Rating and Conference Rank 231
ARPI Rank and ANCRPI Rank 223
ARPI Rank and Conference Standing 223
ANCRPI Rating and HTH Score 217
ANCRPI Rank and CO Score 216
Top 50 Results Score and Last 8 Games (Poor Results) 211
ANCRPI Rank and CO Rank 204
Top 50 Results Score and CO Rank 202
ANCRPI Rating and CO Score 201
ARPI Rating and Conference Standing 184

Here, ARPI Rating paired with Top 60 Common Opponents Rank is the most powerful factor, with ARPI Rating paired with Top 50 Results Rank being the next most powerful factor.  ARPI Rating and Rank alone as primary factors are near the top of the list.  By itself, ARPI Rank excludes all teams ranked #27 or poorer from receiving #4 seeds.  And clearly, the ARPI is the most important factor in determining #4 seeds, when paired with other factors.

NCAA Tournament Bracket Formation: Most Important Factors for #3 Seeds

Continuing with my reports on the most important factors in the Women's Soccer Committee's decision-making on NCAA Tournament at large selections and seeds, here is a table showing the most important factors in deciding that "yes," a team gets a #3 seed:

Factor Yes 3 Seed
ARPI Rank and Conference Rank 8
Conference Rank and CO Rank 6
ARPI Rank 4
Conference ARPI and HTH Score 4
Conference Rank and HTH Score 4
ANCRPI Rank and CO Score 3
ANCRPI Rating and CO Score 3
ARPI Rating and Conference ARPI 3
Top 50 Results Rank and CO Rank 3
ANCRPI Rating and CO Rank 3
ARPI Rating and Top 50 Results Score 2
ARPI Rating and ANCRPI Rank 2
ARPI Rank and Conference ARPI 2
ANCRPI Rank and Top 50 Results Score 2
Conference ARPI and CO Rank 2


This table shows the 15 most important factors in determining which teams get #3 seeds.  Here, the Conference Rank/ARPI Rank paired factor is the most powerful, followed by the Conference Rank/Common Opponents Rank paired factor.  What this suggests is that when the Committee gets to the #3 seeds, the conference a team is in has grown in importance.  Also, here Head to Head Results becomes an influential factor, paired with Conference ARPI and Conference Rank.

Also of note here, the factor patterns are less able to identify teams to receive #3 seeds than they are for #1 and #2 seeds.  This suggests that the Committee has a harder time identifying teams for #3 seeds and needs to do more "guessing" about which teams should receive #3s.  This is consistent with my analyses of the Committee's specific #3 seed decisions.

The following table shows the 25 most important factors in determining which teams do not get #3 seeds:

Factor No 3 Seed
ARPI Rating and Top 50 Results Rank 423
ARPI Rank 416
ARPI Rank and Rating 404
ARPI 397
ARPI Rating and Top 60 CO Rank 389
ARPI Rating and Top 50 Results Score 382
ARPI Rank and ANCRPI Rank 377
ARPI Rank and Top 50 Results Rank 373
ARPI Rating and ANCRPI Rank 368
ARPI Rank and Conference Rank 353
ARPI Rank and Top 50 Results Score 344
ARPI Rating and ANCRPI Rating 316
ARPI Rating and Top 60 CO 311
ARPI Rating and Top 60 HTH 310
Conference Rank and CO Score 310
ANCRPI Rating and CO Score 306
Top 50 Results Score and CO Rank 297
HTH Score and Last 8 Games (Poor Results) 277
Conference ARPI and CO Score 275
ANCRPI Rating and HTH Score 273
Top 50 Results Score and Last 8 Games (Poor Results) 263
ANCRPI Rank and Top 50 Results Score 261
ANCRPI Rating and Last 8 Games (Poor Results) 261
ARPI Rank and Top 60 CO Rank 260
ARPI Rating and Last 8 Games (Poor Results) 260

Here, ARPI Rating paired with Top 50 Results Rank is the most powerful factor, with ARPI Rating paired with Top 60 Common Opponents Rank also being a powerful paired factor.  ARPI Rank alone as a primary factor is just next to the top of the list.  By itself, it excludes all teams ranked #20 or poorer from receiving #3 seeds.  Indeed, the ARPI clearly is the most important factor in determining #3 seeds, when paired with other factors.  I suspect that, in trying to distinguish among potential #3 and #4 seeds, the Committee has a hard time finding persuasive data one way or the other.  In that circumstance, perhaps it tends to default to the ARPI for purposes of eliminating teams from consideration.  I believe there's support for this possibility in the factors' being pretty good at identifying which teams will receive some seed, but not being so good at identifying which seed they will receive when it comes to the #3s and #4s.

NCAA Tournament Bracket Formation: Most Important Factors for #2 Seeds

Continuing with my reports on the most important factors in the Women's Soccer Committee's decision-making on NCAA Tournament at large selections and seeds, here is a table showing the most important factors in deciding that "yes," a team gets a #2 seed:

Factor Yes 2 Seed
ARPI Rating and Top 60 CO 12
ARPI Rank and Top 60 CO Score 10
Conference ARPI and CO Score 9
Top 60 CO Score 9
ANCRPI Rating and CO Score 8
Conference Standing and Top 60 CO Score 8
ARPI Rank and Top 60 CO Rank 8
ARPI Rating and Last 8 Games (Poor Results) 8
CO Score and Last 8 Games (Poor Results) 8
ARPI Rating 7
ARPI Rating and Top 60 CO Rank 7
Conference Standing and Top 60 CO Rank 7
Top 50 Results Rank and CO Score 7
Conference Rank and CO Rank 7
ARPI Rank and Top 50 Results Rank 6
ARPI Rating and Top 50 Results Rank 6
ARPI Rating and Conference Standing 6
ARPI Rank and ANCRPI Rating 6
ANCRPI Rank and Top 50 Results Rank 5
ARPI Rank and Conference Rank 5
ARPI Rank and Rating 5
ANCRPI Rank and Top 50 Results Score 5
ARPI Rank and ANCRPI Rank 5
Top 50 Results Rank and CO Rank 5
ANCRPI Rating and Top 50 Results Rank 5
ARPI Rank and Last 8 Games (Poor Results) 5

This table shows the 26 most important factors in determining which teams get #2 seeds.  As with the #1 seeds, teams' ARPI Ratings and Ranks, paired with their Top 60 Common Opponent Results, are the most powerful factors.  (And, once again, Head to Head Results do not appear on the list of the most powerful factors.)

The following table shows the 25 most important factors in determining which teams do not get #2 seeds:

Factor No 2 Seed
ARPI Rank and Top 60 CO Rank 477
ARPI Rating and Top 50 Results Rank 472
ARPI Rank and Top 60 CO Score 467
ARPI Rank and Conference Rank 460
ARPI Rank 457
ANCRPI Rank and CO Rank 455
ANCRPI Rank and CO Score 452
ARPI Rating and Top 50 Results Score 450
ANCRPI Rating and CO Score 443
ARPI Rating and Top 60 CO Rank 437
ARPI Rating and Conference ARPI 436
ARPI Rating and Last 8 Games (Poor Results) 434
ARPI Rank and Rating 434
ARPI Rank and ANCRPI Rank 431
Conference Rank and CO Rank 423
Top 50 Results Rank and CO Score 422
ARPI Rating 421
HTH Score and Last 8 Games (Poor Results) 411
Top 50 Results Rank and CO Rank 409
ARPI Rank and Top 50 Results Rank 408
CO Score and Last 8 Games (Poor Results) 407
Top 60 CO Rank 407
ARPI Rating and Conference Rank 406
ARPI Rating and Top 60 HTH 405
ARPI Rating and ANCRPI Rank 403

Again, ARPI Rank paired with Top 60 Common Opponents Rank is the most powerful factor, with ARPI paired with Top 50 Results Rank and paired with Conference Rank also being powerful factors.

ARPI Rank, by itself, is the most powerful single factor.  By itself, this factor excludes all but the top 14 ARPI teams each year from getting a #2 seed, just as it excludes all but the top 7 ARPI teams from getting a #1 seed.