TLDR: SpreadsheetAg's "Stat Sim" is right more often than not and provides great context for where and how the game will likely be won or lost.
A few people have asked for an After Action Review of SpreadsheetAg's much-appreciated work. After going through 127(!!!) pages of threads posted by SpreadsheetAg, I uncovered some of his storied history.
In 2006 SpreadsheetAg started posting statistical breakdown threads, innovative at the time, but without a game prediction. Starting in 2009 SpreadsheetAg gathered board predictions, calculated the mean/median/mode of those predictions, and identified the closest prediction after the game. I do not see SpreadsheetAg's own predictions within the Google documents. Between 2009 and 2012 it appears that SpreadsheetAg was making score predictions based on the comments on the threads. Unfortunately the image hosting sites no longer serve those images, so data cannot be gathered from these games unless SpreadsheetAg would like to send me his old work. In 2013 "Stat Simulation" appears in the thread titles for the first time and PDFs are available to pull data from! However, there are still many threads from 2015-2021 with images lost to time. Finally, posts from 2022 to present are all accounted for, although not all games had stat sims.
The results from today's dive into TexAgs history gave me 44 data points to work with, 39 of which are predictions with a spreadsheet - which I know is what everyone is curious about anyway.
Conclusions: I don't know - football is messy, and these stats seem to confirm my gut instinct that SpreadsheetAg's Stat Sims slightly favor the good guys (I wouldn't have it any other way) but are generally accurate. More important than the point and yard total estimates, I like to see the differences between national rankings of positional groups, offensive categories and defensive categories. That helps going into the game to know what to focus on, and that value cannot be quantified.
If there is any other calculation you would like me to run on this data set just post it and I'll do my best to keep up with it.
Finally - thanks and Gig 'Em to SpreadsheetAg for your hard work!
Here is the data:
- Out of 39 games the result (W or L) was correctly predicted at a rate of 71.79%
- Removing the 8 cupcakes (which had a 100% W rate), a correct prediction was made 64.52% of the time (20 of 31)
- Non-cupcake (NCC) home games (12): 75% correct
- NCC neutral games (6): 66.67% correct
- NCC away games (13): 53.85% correct
- NCC games with a predicted loss (6): 83.33% correct (If SpreadsheetAg says we are going to lose, buckle up)
- NCC games with a predicted win (25): 60% correct
- Mean spread (A&M points - Opp points) all games: Formula one 7.13, Formula two 6.16, Actual 2.45
- Median spread all games: Formula one 5, Formula two 4, Actual 3
- Mean spread predicted losses: Formula one -5, Formula two -3.66, Actual -4.16
- Median spread predicted losses: Formula one -6, Formula two -3.5, Actual -6.5
- Mean spread predicted wins: Formula one 10.04, Formula two 8.52, Actual 4.04
- Median spread predicted wins: Formula one 8, Formula two 7, Actual 4
- 6 of the 10 losses that were predicted as wins were one-score losses
- The actual mean spread for those 10 games is pulled down by last year's SC game...
- Mean spread predicted wins that are actual losses: Formula one 6.1, Formula two 5.3, Actual -9.5 (removing the SC game brings this down close to -7.5)
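For anyone who wants to check the arithmetic, here is a rough Python sketch of how the rates and mean spreads above fit together. The game counts and mean spreads are the figures quoted in the list; the "correct" counts (9, 4, 7, 5, 15) are my back-calculation from the quoted percentages, not numbers pulled from the spreadsheets, and the function names are made up for illustration. Weighting the predicted-win and predicted-loss means by their game counts (25 and 6) reproduces the "all games" means, which suggests those spread figures cover the 31 non-cupcake games.

```python
# Figures are the ones quoted in the post. The "correct" counts are
# back-calculated from the quoted percentages (an assumption, not source data).

def hit_rate(correct, total):
    """Percentage of games where the predicted W/L matched the result."""
    return round(100 * correct / total, 2)

def pooled_mean(groups):
    """Combine subgroup means into one mean, weighting each by its game count.

    groups is a list of (games, mean_spread) pairs.
    """
    games = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / games

# (games, mean spread) for predicted wins and predicted losses
formula_one = pooled_mean([(25, 10.04), (6, -5.0)])
formula_two = pooled_mean([(25, 8.52), (6, -3.66)])
actual = pooled_mean([(25, 4.04), (6, -4.16)])

print(f"{formula_one:.2f} {formula_two:.2f} {actual:.2f}")  # prints 7.13 6.16 2.45

# Home, neutral, away
print(hit_rate(9, 12), hit_rate(4, 6), hit_rate(7, 13))  # prints 75.0 66.67 53.85
# Predicted loss, predicted win
print(hit_rate(5, 6), hit_rate(15, 25))  # prints 83.33 60.0
```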
