My favorite book that I've read in the last year is The Drunkard's Walk. It's essentially a history and explanation of how mathematicians have tried to observe and measure tendencies in our lives with statistics. The part I particularly enjoyed was the discussion of the Gambler's Fallacy because, as Kahneman and Tversky first discovered, humans aren't wired to intuit what the law of averages really gets at: that short-run streaks don't change the balance of long-run probabilities (I hope I did that justice).
So I was particularly excited when BtB suggested a way to put this concept to use in the early part of the season.
Sky provided a spreadsheet that lets you input your team's original projected win total from your projection source and then the actual record. It then spits out the newly expected win total for the season, taking into account the true talent level and the sample size so far.
Here's what happened when I used the Community Projection Project:
| Team | pW | pL | aW | aL | nW | nL | delta |
| Astros | 83 | 79 | 4 | 8 | 80.9 | 81.1 | -2.1 |
p = predicted, a = actual, n = new prediction, delta = nW - pW
The caveat I'd like to bring to the idea of trying to account for the Gambler's Fallacy is that the Astros offense, by its very construction, is itself streaky. So while there might not be predictive power in a short-run streak, I'm not sure the fallacy is entirely applicable in our case, but maybe someone with a better understanding of the concept can chime in.
0 recs | 7 comments
I don't think that is correct.
Technically, the odds of a baseball team winning a particular game are not random. Among other things, they depend on the opposing team, the opposing pitcher, and whether the game is home or away. Also, the Gambler’s Fallacy isn’t totally applicable, because the probabilities at particular points in the season are not independent. For example, each game played against the Cubs reduces the proportion of future games which will be played against the Cubs, which changes the probabilities for wins/losses in future games. (This is like removing cards from a deck after the cards are selected.) The same can be said about home/road games. Modeling these effects more completely would seem to be more complex than what is done at BtB. And, even then, if I recall from our discussions of BP’s playoff odds calculations, there is disagreement over how to handle strength of schedule.
I agree with your comment that the Astros’ team construction results in a streaky team. I don’t have an answer on how to address that. Since we have no idea about the distribution of streaks across the season, it strikes me that a bigger sample size of games played would give us more comfort... but really I’m not sure.
clack - April 20, 2009
Come on.
I think that today’s blog was extremely well written, but there is something very cold about trying to apply mathematics to everyday life, especially baseball. The more we quantify life, the less special and important it is. Give me chance, mystery, and hope. There is no reason the Astros couldn’t pull off another great season. At the end of the day, you can’t outthink life, and especially baseball, and that is what makes it so much fun to watch.
Let’s go ’Stros!
teddyspaghetti - April 20, 2009
That's very true
But I think it’s at least interesting to consider what cold mathematics says about our fortunes. I think it’s especially relevant to consider the immediate future through some kind of objective lens when the front office is hell-bent on making every possible allowance for the present at the expense of the future.
The Astros could pull it off, and I hope they do; as clack pointed out, this idea doesn’t have the best applicability to baseball. Otherwise, this will just be a long and frustrating season.
Stephen Higdon - April 20, 2009
my comment really didn't address what the Astros can pull off in the future...
but, like you, I hope they can improve on their current record and be competitive. But my point is that we don’t know for sure how the current record affects the community projection. That projection, like most such projections, is not at a detailed level which can be applied to individual games. The projection is based on the season-long “population” of games, and therefore can encompass all the streaks averaging out. However, it would be stretching the projection methodology to assume that it predicts a constant W/L percentage throughout the year.
clack - April 20, 2009
I didn't mean for it to read that you were implying anything
I just meant to tie you into the discussion of how the Gambler’s Fallacy isn’t perfectly applicable to baseball.
Stephen Higdon - April 20, 2009
How does this approach say that the Astros couldn't have another great season?
It’s talking about the mean expectation. Outcomes on either side of the mean are entirely possible, even if we knew 100% that the Astros are a .500 team, a .450 team, a .600 team, or whatever.
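To put a rough number on that spread, here is a minimal sketch. It assumes every game is an independent trial with the same win probability, which is a simplification (clack's comment above explains why real schedules don't behave that way). The function name `win_total_spread` is made up for illustration.

```python
import math

def win_total_spread(true_pct, n_games=162):
    """Mean and standard deviation of season wins under a simple binomial model.

    Assumption: each game is independent with the same win probability,
    which ignores opponent, pitcher, and home/away effects.
    """
    mean = n_games * true_pct
    sd = math.sqrt(n_games * true_pct * (1 - true_pct))
    return mean, sd

for pct in (0.450, 0.500, 0.600):
    mean, sd = win_total_spread(pct)
    print(f"{pct:.3f} team: {mean:.1f} wins, give or take about {sd:.1f}")
```

Under that assumption, even a team known to be exactly .500 has a standard deviation of a little over six wins, so seasons several games above or below 81 would not be surprising.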
Sky Kalkman - April 20, 2009
The Gambler's Fallacy applies only to
events governed by a known probability distribution, such as a fairly balanced coin. When trying to project the outcomes of things like team performances, we do not know the actual probability distribution of their wins – it is that very thing we are trying to guess. Therefore, since we start with a guess, we should modify our guess as actual data comes in.
There is a useful mathematical technique for doing this, and it is both simple and effective: exponential smoothing. Using that technique, the initial projection is modified after every game. For an experiment with 162 trials (games), the normal modification is 2 × 1/number of trials (games), or 2/162 = 1/81, or about .0123 per game.
As I write this, the Astros are .357 (5-9). Had you thought the Astros were going to be a .450 team at the beginning of the year, the current projection would be .435. Had your original guess been .500, the current guess would be .477, and had your original projection been .550, the current guess would be .519. Interestingly enough, after about 130 games, it would not matter what your original projection was as the current projection would be largely unaffected by it (which makes logical sense) as the evidence from the 130 actual trials overwhelms the original guess.
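For anyone who wants to try this at home, here is a minimal sketch of the update. It assumes the standard exponential smoothing form (new guess = old guess + factor × (result − old guess)) with a factor of 1/81 per game. The function name and the order of wins and losses are made up for illustration, since the game-by-game results aren't listed here.

```python
def smooth_projection(prior_pct, results, n_games=162):
    """Update a preseason winning-percentage guess one game at a time.

    Assumption: standard exponential smoothing with factor 2 / n_games
    (1/81, about .0123, for a 162-game season).
    """
    alpha = 2 / n_games
    estimate = prior_pct
    for won in results:
        outcome = 1.0 if won else 0.0
        # Move the estimate a small step toward the latest result.
        estimate += alpha * (outcome - estimate)
    return estimate

# A 5-9 start; this particular ordering is hypothetical.
start = [False, False, True, False, False, True, False,
         True, False, False, True, False, False, True]

for prior in (0.450, 0.500, 0.550):
    print(f"started at {prior:.3f}, now {smooth_projection(prior, start):.3f}")
```

Because recent games count slightly more than older ones, the exact result shifts by a few thousandths depending on the order of the wins and losses, but any ordering of a 5-9 start lands close to the figures above.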
All of this is exclusive, as clack pointed out, of other variables such as strength of schedule, home-away bias, ballpark factors, weather, rotation, etc. Fun, eh?
bwhite2323 - April 22, 2009