Thursday, December 30, 2021

NCAA TOURNAMENT: ANALYZING THE COMMITTEE SEED AND AT LARGE DECISIONS

Each year, I analyze the Committee seed and at large selection decisions in relation to the Committee’s historic decision patterns.  This year, I have set up a series of tables for the analysis.  I will start with the #1 seed table, giving a detailed explanation of its data.  Then, I will show the tables for each of the #2, #3, and #4 seeds and for the at large selections, with a few comments.

For each table, you will need to scroll to the right to see the entire table.

#1 Seeds

Here is the #1 seed table:


First, some general background to help with the table.

For at large selections, the NCAA has set specific data factors from the season that the Committee must use and is limited to using.  The NCAA leaves it up to each Committee member to decide how much weight to assign to each factor.  The Committee also uses the factors in the seeding process, but for seeding they are not mandatory and the Committee is not limited to the factors in evaluating teams.

I break the NCAA factors down into 13 individual factors:
  • RPI (adjusted)
  • RPI Rank
  • Non-Conference RPI (adjusted)
  • NCRPI Rank
  • Top 50 Results (my modification of an NCAA factor)
  • Top 50 Results Rank
  • Conference Standing (I use the average of regular season standing and conference tournament finishing position)
  • Conference RPI
  • Conference RPI Rank
  • Head to Head Results Against Top 60 Opponents (my modification of an NCAA factor)
  • Results Against Common Opponents with Other Top 60 Teams (my modification of an NCAA factor)
  • Common Opponent Results Rank (my modification of an NCAA factor)
  • Poor Results (my modification of an NCAA factor)

There are NCAA data available that allow an evaluation of a team for each factor.  For some of the factors, the NCAA has a scoring system.  For example, an NCAA formula assigns a value for the RPI.  Where the NCAA does not have a scoring system for a factor, I have created my own scoring system.

In addition, to mimic how a Committee member might think, I pair each factor with each other factor.  For each factor pair, I have a scoring system that gives each factor a 50% weight.

In addition, I have one other factor, Number of Games Against Top 60 Opponents.  This is not an NCAA mandated factor but rather is one I use as an aid to teams wishing to do their non-conference scheduling with a view towards the NCAA Tournament.

Altogether this produces a total of 92 individual and paired factors.
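
As a check on the count of 92, here is a minimal Python sketch of the arithmetic.  It assumes the extra Top 60 games factor is counted on its own and not paired with the other factors, which is the reading that produces 92.  The shortened factor names are just labels for this illustration.

```python
from itertools import combinations

# The 13 individual NCAA-based factors listed above (names shortened).
ncaa_factors = [
    "RPI (adjusted)", "RPI Rank", "Non-Conference RPI (adjusted)",
    "NCRPI Rank", "Top 50 Results", "Top 50 Results Rank",
    "Conference Standing", "Conference RPI", "Conference RPI Rank",
    "Head to Head Results", "Common Opponent Results",
    "Common Opponent Results Rank", "Poor Results",
]

# Every unordered pairing of two different factors.
factor_pairs = list(combinations(ncaa_factors, 2))

# The one factor that is not NCAA mandated (assumed here to be unpaired).
extra_factors = ["Number of Games Against Top 60 Opponents"]

total_factors = len(ncaa_factors) + len(factor_pairs) + len(extra_factors)
print(len(ncaa_factors), len(factor_pairs), len(extra_factors), total_factors)
# prints: 13 78 1 92
```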

For each Top 60 team for each year since 2007 (the first year I began collecting data), my computer program has computed a score for each of the 92 factors.  (Hereafter, I will refer to my computer program and process as my "system.")  My system then compares the scores of all of the Top 60 teams since 2007 to the Committee seeding and at large selection decisions.  From this comparison, the system identifies two scores for each factor:

A "Yes" score, which means that for a particular Committee decision, if a team has had the Yes score or better for that factor, the team always has gotten a favorable decision from the Committee.  For example, if a team has had an RPI Rank of 1, it always has gotten a #1 seed.  I call such a Yes score the Yes standard for that factor.  Thus <=1 is the RPI Rank Yes standard for a #1 seed.

A "No" score, which means that for a particular Committee decision, if a team has had the No score or poorer for that factor, the team never has gotten a favorable decision from the Committee.  For example, if a team has had an RPI Rank of 8 or poorer, it never has gotten a #1 seed.  Thus >=8 is the RPI Rank No standard for a #1 seed.

Note:  As the data turn out, there are some factors, for some Committee decisions, that do not have a Yes standard or that do not have a No standard.  For example, a team’s Conference Standing does not have a Yes standard for any Committee decision.  In other words, your conference standing, all by itself and without reference to what conference you are in, will not assure you of any seed position or of an at large selection.
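
To make the Yes and No standards concrete, here is a minimal Python sketch of how they could be derived for a rank-type factor, where a lower number is better.  This is only my illustration of the idea, not the actual program.  The function names and the toy history are invented for the example, and the toy data are made up so that the result matches the <=1 and >=8 standards described above.

```python
def yes_standard(history):
    """Poorest rank such that every team at that rank or better got a Yes.

    `history` is a list of (rank, got_yes) tuples; lower rank is better.
    Returns None when the data support no Yes standard.
    """
    yes_ranks = [rank for rank, got_yes in history if got_yes]
    no_ranks = [rank for rank, got_yes in history if not got_yes]
    if not no_ranks:
        return max(yes_ranks) if yes_ranks else None
    best_no = min(no_ranks)
    qualifying = [rank for rank in yes_ranks if rank < best_no]
    return max(qualifying) if qualifying else None


def no_standard(history):
    """Best rank such that every team at that rank or poorer got a No.

    Returns None when the data support no No standard.
    """
    yes_ranks = [rank for rank, got_yes in history if got_yes]
    no_ranks = [rank for rank, got_yes in history if not got_yes]
    if not yes_ranks:
        return min(no_ranks) if no_ranks else None
    poorest_yes = max(yes_ranks)
    qualifying = [rank for rank in no_ranks if rank > poorest_yes]
    return min(qualifying) if qualifying else None


# Invented toy history of (RPI Rank, got a #1 seed); NOT real Committee data.
toy_history = [
    (1, True), (1, True), (2, True), (2, False), (3, True),
    (5, False), (7, True), (8, False), (9, False),
]

print(yes_standard(toy_history))  # 1, i.e. a Yes standard of <=1
print(no_standard(toy_history))   # 8, i.e. a No standard of >=8
```

A factor with no Yes standard, like Conference Standing in the note above, would come back as None from the first function.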

On completion of the regular season including conference tournaments, my system tallies up all of the Yes and No standards a team has met, for each Committee decision -- #1, 2, 3, and 4 seeds and at large selections.  For each team, for each Committee decision, there are four possible outcomes:

  • The team meets one or more Yes standards and no No standards.  This means that if the Committee follows its historic pattern, the team will get a Yes decision from the Committee.
  • The team meets no Yes standards and one or more No standards.  This means that if the Committee follows its historic pattern, the team will get a No decision from the Committee.
  • The team meets no Yes standards and no No standards.  This means that either a Yes or a No decision from the Committee will be consistent with its historic pattern.
  • The team meets one or more Yes standards and one or more No standards.  This means that the team has a profile the Committee has not seen historically.  Whatever decision the Committee makes cannot be fully consistent with its historic pattern.
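
The four outcomes above can be written as a small decision function.  This is a sketch in Python; the function name and the wording of the labels are mine, chosen only for illustration.

```python
def classify(yes_met, no_met):
    """Map counts of Yes and No standards met to one of the four outcomes."""
    if yes_met > 0 and no_met == 0:
        return "Yes expected"
    if yes_met == 0 and no_met > 0:
        return "No expected"
    if yes_met == 0 and no_met == 0:
        return "Either decision is consistent with history"
    return "Profile not seen before"


print(classify(3, 0))  # Yes expected
print(classify(0, 5))  # No expected
print(classify(0, 0))  # Either decision is consistent with history
print(classify(2, 1))  # Profile not seen before
```
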

With that background, the above table for #1 seeds shows data related to each of the teams with RPI ranks #1 through #7.  This is the candidate group for #1 seeds, since the RPI Rank No standard for a #1 seed is >=8.

Committee Decision: Green means the Committee gave the team a #1 seed.  Red means it did not.

RPI Rank 

Top 50 Results Rank:  I have included this in the table because historically the factor pair of RPI Rank and Top 50 Results Rank has proved to be a good predictor of Committee decisions, especially for at large selections.

Yes Standards Met and No Standards Met:  This is the number of Yes and No standards for #1 seeds that the team has met.

Yes Standard: If a team has met one or more Yes standards but does not get a Yes decision from the Committee, it is useful to know what the Yes standard is, so I list it here.  For example, Florida State had an RPI Rank of #1.  Suppose the Committee had not given it a #1 seed.  Then in this column you would have seen 2 RPI Rank (the 2 preceding RPI Rank simply is a number I have assigned to that standard).

Yes Value:  If I have listed a Yes Standard, I will state the standard score here.  For example, for a #1 seed, the RPI Rank Yes standard is <=1, which means that teams with RPI ranks of #1 always have gotten #1 seeds.  If the Committee had not given Florida State a #1 seed, you would have seen <=1 in this column.

Yes Actual:  If I have listed a Yes standard, I also state the team’s actual score for the standard.  So, if the Committee had not given Florida State a #1 seed, you would have seen 1 in this column representing its RPI rank.

No Standard:  If a team has met one or more No standards but gets a Yes decision from the Committee, it is useful to know what the No standard is, so I list it here.  For example, the RPI Rank No standard for a #1 seed is >=8.  If the Committee had given a #1 seed to the #8 RPI team, I would have listed 2 RPI Rank here.

No Value:  If I have listed a No standard, I will state the standard score here.  If the Committee had given the RPI #8 team a #1 seed, you would have seen >=8 in this column.

No Actual:  If I have listed a No standard, I also state the team’s actual score for the standard.  So, if the Committee had given the #8 team a #1 seed, you would have seen 8 in this column.

Teams Affected:  If the Committee has given a No to a team that has met a Yes standard or a Yes to a team that has met a No standard, this column indicates how significant a change the Committee has made from its historic pattern for that factor.  For example, from 2007 through 2019 there were 13 teams with #1 RPI ranks and they all received #1 seeds.  If the Committee had not given Florida State a #1 seed this year, then you would have seen 13 in the Teams Affected column.  This would mean that there are 13 teams that, based on past history, we would have thought assured of #1 seeds but that, based on the Committee decision this year, no longer could be considered as having been assured of #1 seeds.  The lower the Teams Affected number, the smaller the Committee change from its historic pattern.  A Teams Affected number of 0 means the team’s score for that factor is just next to the historic standard and is one the Committee has not seen before, so the Committee decision simply represents a refinement of the previous standard.
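
Here is a rough Python sketch of one way to read the Teams Affected count for a rank-type factor: count the past teams whose rank falls between the historic standard and the rank of the team that broke it.  This is my reading of the description above, not the actual program, and the function name is invented.  The example uses only the figures already given: 13 years, 7 candidate ranks per year, and a Yes standard of <=1.

```python
def teams_affected(historical_ranks, standard, actual):
    """Count past teams whose rank lies between the historic standard and
    the rank of the team that broke the standard (both ends inclusive)."""
    low, high = sorted((standard, actual))
    return sum(1 for rank in historical_ranks if low <= rank <= high)


# One team at each of RPI ranks 1 through 7 for each year 2007 through 2019.
candidate_ranks = [rank for year in range(2007, 2020) for rank in range(1, 8)]

# Hypothetical case: the #1 RPI team is denied a #1 seed.
print(len(candidate_ranks))                    # 91 candidates
print(teams_affected(candidate_ranks, 1, 1))   # 13 teams affected
```

Under this reading, a result of 0 can only occur when no past team has a score between the standard and the new team’s score, which fits the "refinement" case.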

As a further note about Teams Affected, to give the numbers some context:

  • The #1 seed candidate range is teams with RPI Ranks of #7 or better, so the number of candidates for #1 seeds over the 13 years from 2007 through 2019 has been 13 x 7 = 91.  So when you are looking at a Teams Affected number for #1 seeds, it is that number of teams out of a total of 91.

  • The #2 seed candidate range is RPI Ranks of #14 or better, so the number of #2 seed candidates has been 13 x 14 = 182, less the 52 teams that got #1 seeds, which amounts to a #2 seed candidate pool of 130 teams.

  • The #3 seed candidate range is RPI Ranks of #23 or better, so the number of #3 seed candidates has been 13 x 23 = 299, less the 104 teams that got #1 and #2 seeds, which amounts to a #3 seed candidate pool of 195 teams.

  • The #4 seed candidate range is RPI Ranks of #26 or better, so the number of #4 seed candidates has been 13 x 26 = 338, less the 156 teams that got #1 through #3 seeds, which amounts to a #4 seed candidate pool of 182 teams.

  • The At Large candidate range is RPI Ranks of #57 or better, less the 208 seeded teams, the Automatic Qualifiers in the Top 57, and teams in the Top 57 that failed to meet the 0.500 minimum winning record requirement.  Altogether since 2007, this has amounted to 447 teams. 
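
The seed candidate pool arithmetic above can be reproduced with a few lines of Python.  This sketch takes the rank cutoffs from the bullets and assumes four teams per seed line per year, which is what the 52, 104, and 156 figures imply over 13 years.  The names are mine.

```python
YEARS = 13               # 2007 through 2019
TEAMS_PER_SEED_LINE = 4  # implied by 52 #1 seeds over 13 years


def candidate_pool(seed_line, rank_cutoff):
    """Teams inside the RPI Rank cutoff, less teams given better seeds."""
    already_seeded = YEARS * TEAMS_PER_SEED_LINE * (seed_line - 1)
    return YEARS * rank_cutoff - already_seeded


# Seed line -> poorest RPI Rank in its candidate range.
cutoffs = {1: 7, 2: 14, 3: 23, 4: 26}
pools = {seed: candidate_pool(seed, cutoff) for seed, cutoff in cutoffs.items()}
print(pools)  # {1: 91, 2: 130, 3: 195, 4: 182}
```

The At Large pool of 447 is not reproduced here because it depends on the Automatic Qualifier and winning record exclusions, which need the underlying team data.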

Round Eliminated:  This column shows the NCAA Tournament round this year in which the team was eliminated.  It lets you look at the position the NCAA assigned the team and see how the team did in relation to that assigned position.  For example, Virginia’s #1 seed means that according to the Committee it should have made it at least to the semifinals, but instead it made it only to the 3rd round.  This lets you evaluate how the Committee decisions worked out.  (For first round matchups between unseeded teams, I treat the home team as the stronger team according to the Committee.)

With the above explanation, I leave it to you to go up to the #1 Seeds table and see how the Committee decisions match up with its historic pattern.  My comment is that the Committee deviated from its historic pattern only where it had to, with Duke and Virginia, because each met both Yes and No standards, and that for those teams the deviation was small.

#2 Seeds

Here is the #2 seed table:

You can review the table and reach your own conclusions.  My comment is that in giving UCLA a #2 seed, the Committee deviated from its historic pattern.  The deviation was not large but also was not insignificant.  As an alternative, Tennessee would have been an easy #2 seed.

#3 Seeds

Here is the #3 seed table:


My comment is that there is nothing of major import in the Committee decisions.

#4 Seeds

Here is the #4 seed table:


There is only one significant Committee deviation from its historic patterns here, and it is the #4 seed given to BYU.  This was a pretty large deviation.  Ironically, BYU reached the championship game.

At Large

Here is the At Large table:


My comment is that the only deviation here from Committee historic patterns is in its giving St. John’s an at large position rather than West Virginia, Colorado, Oregon, or Houston.  St. John’s Teams Affected numbers show, however, that the deviation was extremely small.

Summary and Two Additional Pieces of Information

My evaluation of the Committee decisions in relation to historic Committee patterns suggests that:

1.  The Committee At Large selections were quite consistent with historic patterns and, where the Committee varied from them in selecting St. John’s, the variation was very small.

2.  The Committee seeds were largely consistent with historic patterns.  Where the Committee varied from historic patterns, most of the variations were small.  The greatest variation was BYU getting a #4 seed, which is ironic since BYU made it to the championship game.

In addition to the above, I have looked at two other aspects of the Committee decisions.

First, I looked at geographic regions based on the states where teams are located.  As it turned out this year, during the season teams from states in the West played roughly 90% of their games against other teams from the West and only 10% against teams from other regions.  This created a big problem for the RPI since 10% is not enough games for the RPI to properly rank teams from a region in relation to teams from other regions.  Because of this, I wanted to see if teams from the West performed differently in the Tournament than the Committee had evaluated them.  The following table addresses this question:


In this table, the Wins Difference column shows, for each region, the difference between (1) the number of games that the Committee’s seeds, and its bracket placements for unseeded teams, indicated teams should win and (2) the number of games teams actually won.  The numbers in this table represent how teams from a region did against teams from other regions, since all within-region games cancel each other out.  Although the numbers in the table are not large, they suggest that the Committee may have overvalued teams from the South and undervalued teams from the other regions, particularly the West.
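
For anyone who wants to repeat this kind of comparison, here is a minimal Python sketch of the Wins Difference tally.  It assumes the difference is taken as expected wins minus actual wins, so that a positive number points toward overvaluation.  The function name and the sample numbers are invented for illustration and are NOT this year’s results.

```python
from collections import defaultdict


def wins_difference_by_region(teams):
    """Sum (expected wins - actual wins) for the teams in each region.

    `teams` is a list of (region, expected_wins, actual_wins) tuples.
    """
    totals = defaultdict(int)
    for region, expected_wins, actual_wins in teams:
        totals[region] += expected_wins - actual_wins
    return dict(totals)


# Invented sample data, for illustration only.
sample_teams = [
    ("West", 1, 3),
    ("West", 0, 1),
    ("South", 4, 2),
    ("South", 2, 1),
]

print(wins_difference_by_region(sample_teams))
# {'West': -3, 'South': 3}
```

The same function works for the conference comparison if the first item in each tuple is a conference name instead of a region.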

Second, I took a similar look, but this time by conference.


This suggests that the Committee may have undervalued the West Coast Conference and, to a much lesser extent, the Big Ten, and overvalued the ACC and Pac-12.

Since both of these tables are based on only one year’s results, I do not take them too seriously.  They suggest, however, that it might be worthwhile to do a study that considers more years of NCAA Tournaments, to see if there are any Committee region- or conference-based overvaluation and undervaluation patterns.