SE_CT Bucknut

Member since 26 August 2014 | Blog

Recent Activity

Comment 17 Nov 2015

Easy Choice:  PSU.

PSU beats scum, then scUM has 2 losses.  Gives us an open road to the conference championship game if we beat Sparty, and indirectly keeps our playoff hopes alive in the event OSU were to somehow lose in the little house.

I have a feeling some crazy stuff is going to happen over the next couple of weeks and during the CCGs so we could see some large movement.  Some of the crazy stuff:  I could see Clemson losing to UNC, ND losing to Stanford, Ok St losing to Oklahoma (heck the entire B12 could have 2 losses), so OSU making the playoffs after dropping a game at Ann Arbor is not impossible (just unlikely), but they must win the CCG to even be in the conversation.  PSU beating scUM helps meet that end as an insurance policy.

Comment 17 Nov 2015

Std. Dev. can't be per rush with those means. Leonard Fournette's average is 6.24 and minimum is -5. The first standard deviation only encapsulates 65% of the data or 32.5 on either side of the mean. (Numbers may be slightly off as I haven't done too much statistical analysis in the past few years). Thus, the first standard deviation below the mean with a standard deviation of 11.15 would go all the way to -4.91. I doubt that 17.5% of Leonard Fournette's runs occur between -5 and -4.91 yards, and it would be statistically impossible for the second standard deviation to go all the way down to 10 below minimum with this sample size.

You have a concept error here... The distribution is not built around the existing data; the distribution predicts the data. The point that you make (regarding 17.5% between -4.91 and -5) underscores why this is the wrong model. The STDEV is per rush, but a Normal built from it is a poor model of the actual outcomes. The second STDEV would go all the way down to about -16 (6.24 minus 2 x 11.15), which would indicate a very low probability of runs below that (~2.3%). The fact that Fournette doesn't have a single carry outside of 2 STDEVs doesn't mean the model is flawed. The model is flawed because the actual data is likely much more clustered near the mean than this distribution suggests. I would guess that if you had to use a Normal to model this, it would be a very narrow Normal with most values within a couple yards of the mean. The problem is that the large outliers (think Zeke's 3 TDs vs. IU) are distorting the STDEV significantly and making it much too large. That is why I would remove the outliers and account for that data another way.
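To put numbers on that mismatch, here is a minimal Python sketch that treats per-rush yardage as Normal with the mean (6.24) and STDEV (11.15) quoted for Fournette. The variable names are mine, and the standard library's `statistics.NormalDist` is just one convenient way to evaluate the curve.

```python
from statistics import NormalDist

# Figures quoted in the discussion above (per-rush mean, STDEV, worst run).
mean_ypc = 6.24
sd_ypc = 11.15
min_run = -5

# Assumption being tested: per-rush yardage follows this Normal curve.
model = NormalDist(mu=mean_ypc, sigma=sd_ypc)

# Where the second standard deviation below the mean lands.
two_sd_below = mean_ypc - 2 * sd_ypc

# Probability the model assigns to runs worse than that point.
p_below_two_sd = model.cdf(two_sd_below)

# Probability the model assigns to runs worse than the actual worst run.
p_below_min = model.cdf(min_run)

print(f"2 STDEV below the mean: {two_sd_below:.2f} yards")
print(f"P(run below that):      {p_below_two_sd:.3f}")
print(f"P(run below {min_run}):        {p_below_min:.3f}")
```

The last number is the telling one: the curve puts somewhere around 15 to 16 percent of carries below his actual worst run, which is the sign that the STDEV is inflated by the long runs on the other side.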

Comment 17 Nov 2015

Variance is the average squared distance of the data points from the mean. You calculate it by taking the mean, then subtracting it from each of the individual values and squaring the result (the order of the subtraction doesn't matter once you square it). You add these up for all of the individual values, then divide the entire sum by the total number of data points.

In this case Zeke has a mean of 142.5 yards per game, so it would look like this (for the first few games): (122-142.5)^2 + (101-142.5)^2 + (108-142.5)^2 + ... You sum all of those together for the entire season and you get a really big number. Then you divide that number by the number of games (10) and you get a Variance of 2744.944 (easiest to do this in Excel). The standard deviation is the square root of the variance and is much more statistically useful. In this case the standard deviation is 52.4. So it is pretty clear that both of those stats (variance and STDEV) are per rush.
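If you would rather not do it in a spreadsheet, the same steps can be sketched in a few lines of Python. This only uses the numbers given above (the three game totals, the 142.5 mean, and the 2744.944 variance); `population_variance` is a helper name I made up for illustration, and the full season's game log is not reproduced here.

```python
from math import sqrt

def population_variance(values):
    """Average squared distance from the mean (divide by N, as described above)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Season mean and the first three game totals quoted above.
season_mean = 142.5
first_games = [122, 101, 108]

# The first three squared-deviation terms of the season-long sum.
squared_terms = [(g - season_mean) ** 2 for g in first_games]

# The variance quoted above, and the standard deviation that follows from it.
stated_variance = 2744.944
stdev = sqrt(stated_variance)

print(squared_terms)
print(round(stdev, 1))
```

With all ten game totals in a list, `population_variance(games)` would do the whole calculation in one call.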

This means that if we assume that Zeke's performances are "Normally distributed" (standard bell curve), we can expect Zeke's outcomes to be within one standard deviation of the mean (+ or - 1 STDEV) 68% of the time. So there is a 68% probability that Zeke will rush for between 90 and 195 yards in a game. There are serious flaws with that assumption, though, so leaning on this number would be dumb. One major flaw is the Normal distribution itself... there are different variables in play (like the defense of the team he is playing against), so the Normal is not necessarily appropriate. Additionally, 10 games is still not a statistically relevant sample size; I would like to see more like 30 data points.
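For anyone who wants to check the 68% window, here is a quick sketch using the per-game mean and STDEV from above. It takes the Normal assumption at face value, which, as noted, is shaky here; the variable names are mine.

```python
from statistics import NormalDist

# Per-game mean and STDEV quoted above; Normality is an assumption, not a finding.
game_model = NormalDist(mu=142.5, sigma=52.4)

# One standard deviation either side of the mean.
low = game_model.mean - game_model.stdev
high = game_model.mean + game_model.stdev

# Share of the bell curve that falls inside that window.
coverage = game_model.cdf(high) - game_model.cdf(low)

print(f"{low:.1f} to {high:.1f} yards covers {coverage:.1%} of the curve")
```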

The per rush data (STDEV and Var) are probably a little more relevant, as there certainly is enough data there to be statistically relevant, but again, the Normal assumption is not great. All we can get out of this data using the Normal distribution is that 68% of the time Zeke's runs will be between -4 and 18 yards... I would guess (not going to analyze the data) that more like 95% of his runs actually fall in that window. If I were analyzing this data officially, I would pull the outliers out of the data set (runs outside of one STDEV or some other parameter), since they are badly skewing the STDEV and Var. I would account for those using some new parameter called % explosive plays or something like that, as a statement of the relative "explosive ability" of the running backs.
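A rough sketch of that split in Python is below. The carries are made-up numbers purely for illustration (not anyone's real game log), the one-STDEV cutoff is just the first option mentioned above, and `split_outliers` is a hypothetical helper name.

```python
from statistics import mean, pstdev

def split_outliers(carries, n_stdev=1.0):
    """Separate runs within n_stdev of the mean from those outside it."""
    mu = mean(carries)
    sd = pstdev(carries)
    routine = [y for y in carries if abs(y - mu) <= n_stdev * sd]
    outliers = [y for y in carries if abs(y - mu) > n_stdev * sd]
    return routine, outliers

# HYPOTHETICAL carries, invented for this example only.
carries = [2, 3, 4, 5, 1, 6, 4, 3, -2, 4, 55, 65]

routine, explosive = split_outliers(carries)

# The "% explosive plays" idea: share of carries pulled out as outliers.
pct_explosive = len(explosive) / len(carries)

print(f"STDEV with outliers:    {pstdev(carries):.1f}")
print(f"STDEV without outliers: {pstdev(routine):.1f}")
print(f"Explosive plays:        {pct_explosive:.0%} of carries")
```

Reporting the trimmed STDEV and the explosive percentage side by side keeps the big runs in the picture without letting them blow up the spread for the ordinary carries.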

You could also attempt to use some more "exotic" heavy-tailed distributions like the Cauchy, whose fit is much less distorted by tail (outlier) events. Standing by to answer any other math questions.
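As a rough illustration of why something like the Cauchy shrugs off outliers: its location parameter is its median and its scale parameter is half its interquartile range, so a quick estimate of both never touches the mean or the STDEV. The sketch below uses made-up carries (not real data) and a simple quantile-based estimate, not a proper maximum-likelihood fit.

```python
from statistics import median, pstdev, quantiles

# HYPOTHETICAL carries, invented for this example only.
carries = [2, 3, 4, 5, 1, 6, 4, 3, -2, 4, 55, 65]

# Quartiles of the sample (quantiles with n=4 returns the three cut points).
q1, _, q3 = quantiles(carries, n=4)

# Quick Cauchy-style estimates: location = median, scale = half the IQR.
location = median(carries)
scale = (q3 - q1) / 2

print(f"Cauchy-style location: {location}")
print(f"Cauchy-style scale:    {scale}")
print(f"Plain STDEV:           {pstdev(carries):.1f}")
```

The two long runs barely move the location and scale, while the plain STDEV balloons because of them.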

Comment 31 Aug 2014

I think the author is crazy for not liking the DL play.  Washington and Bennett were awesome in that game and Bosa was sound too.  I think Darron Lee is going to be a star, and not just because he picked up a fumble.  He played pretty well and flashed all over the field.  He made some mistakes, but he made some great plays too.

Triple Option is a tough gig and Navy runs it beautifully. Every player on the field for Navy is unselfish and blocks in a way that I wish our guys did. When you play Navy you are really playing 11 guys on every play. I wouldn't be surprised to see Navy win 10-11 games this year. Navy is a good team, and they execute their offense as well as or better than anyone else.

Most Impressive:  Washington.  He looked like a top 5 NFL draft pick this week, beating doubles into the backfield and blowing plays up.  He has the size and frame that Mike Bennett is missing.

Most Disappointing: Elliott. Granted it was a small sample size, but I was not thrilled and I found myself bummed when he got the ball, as I was wishing those touches would go to Wilson and Samuel. I think (based on small sample size) that Samuel could end up as the featured back by the end of the year. He is small, but he runs hard for his size.

Comment 29 Aug 2014

I was in GTMO from 97-00, small world.  Worked at NSGA GTMO.