Baseball By The Dumbers: Part I: What happens when people with too much time try to invent statistics.
When people try to present baseball statistics to me, I reflexively launch into a tedious argument, the crux of which is that the nuances of baseball do not fit tidily into a numerical formula. This involuntary lurching comes in part from my own personal susceptibility to being fooled by math. Plus, math geeks used to beat me up and steal my baseball cards.
The hypocrisy in this lies in the fact that I love to play with statistics. I am very careful, though, that when presenting them to any audience I include the following caveats: 1) I am not that good at math or reasoning; 2) good math does not equal good reasoning; and 3) I like Derek Jeter. This last part is necessary, because I also reflexively do not like statistics that prove conclusively that Derek Jeter is overrated. And stats I "invent" have a tendency to favor the old boy. This is NOT my intent -- I create statistics to draw conclusions, not to prove them. It just often turns out in my favor. I credit this to his brilliance more than mine.
With all of this in mind, I would like to present for discussion some statistics I think I may have invented. I say, "I think," because it's quite likely, in the vast world of Sabermetrics, that somebody has already come up with these ideas, but honestly I have never seen this done before. (It's quite possible the statistics are so meaningless that nobody but me would even bother to come up with them -- I can't argue with that, but I think they're worth consideration.)
The genesis of this statistic was a conversation I had with my dad. He and I are both Yankee fans, and we were discussing the "clutchness" of certain players -- he was trying to convince me that John Flaherty was more clutch than Jorge Posada. As we discussed how we might even begin to define "clutch", it occurred to me that the term was best described somehow by runs batted in. (This, of course, is almost certainly a false premise, but if you're going to use a single, easily attainable stat, that's as good as any, I think.) I know it has been argued on this site, and my dad and I touched on this as well, that a clutch RBI doesn't necessarily occur toward the end of the game -- if the final score is 1-0, and the one run was scored in the third inning, that's still pretty clutch. With that in mind, I did some thinkin' and some legwork. Below is the text of an e-mail I sent my dad:
I did a search to see what Yankees had "significant" RBIs that made the difference in the final score of individual games (e.g. if a Yankee had 2 RBIs in a game won by 2 or fewer runs, I counted their RBIs as "significant"). Obviously, this study applies generally to close ballgames (the only blowout that counted was the Yanks' 12-3 victory in which A-Rod had 10 RBIs - without his RBIs, the Yanks would have lost). Bear in mind that if four Yankees each drove in one run in a 4-3 ballgame, then each of them would get credit for a significant RBI for that game.
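For the programmers in the audience, here is a minimal sketch of how that counting rule might be written down. It assumes a "significant" game is a win in which the player's RBIs are at least as large as the margin of victory, which is my reading of the examples in the e-mail. The `GameLine` structure, the function names, and the sample lines are hypothetical and made up for illustration; nobody actually ran this in 2005.

```python
from dataclasses import dataclass


@dataclass
class GameLine:
    """One player's line in one game (hypothetical structure, for illustration)."""
    player: str
    rbis: int
    team_runs: int
    opp_runs: int


def is_game_maker(line: GameLine) -> bool:
    """True if the team won and the player's RBIs cover the margin of victory.

    Assumption: RBIs equal to the margin count, matching the e-mail's example
    of 2 RBIs in a game won by 2 runs.
    """
    margin = line.team_runs - line.opp_runs
    return margin > 0 and line.rbis >= margin


def game_maker_totals(lines):
    """Count, per player, the number of games with significant RBIs."""
    totals = {}
    for line in lines:
        if is_game_maker(line):
            totals[line.player] = totals.get(line.player, 0) + 1
    return totals


# Made-up sample lines that mirror the cases described in the e-mail.
sample = [
    GameLine("A-Rod", 10, 12, 3),   # blowout, but 10 RBIs cover a 9-run margin
    GameLine("Jeter", 1, 4, 3),     # one RBI in a one-run win
    GameLine("Matsui", 1, 4, 3),    # same game: each contributor gets credit
    GameLine("Jeter", 2, 5, 2),     # 2 RBIs do not cover a 3-run margin
    GameLine("Matsui", 3, 3, 4),    # a loss never counts
]

print(game_maker_totals(sample))
```

Note that the tally is per game, so a player with several RBIs in one qualifying game still gets a single credit for it.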
This statistic, between my dad, my brother, and me, became known as "The Game Maker." I wrote this e-mail on June 10 of last year, and at that point, here is a sampling of Game Maker totals for a variety of players (note: the Game Maker, as it turned out, is a count of games, not RBIs -- it is the total number of games in which a player had "significant" RBIs, not a count of the RBIs themselves):
Nick Johnson: 8 games
Paul Konerko: 8 games
Miguel Tejada: 7 games
Garret Anderson: 7 games
Carlos Lee: 6 games
Derrek Lee: 6 games
Pat Burrell: 5 games
Cliff Floyd: 5 games
Hideki Matsui: 5 games
Phil Nevin: 4 games
David Ortiz: 4 games
Derek Jeter: 3 games
A-Rod: 3 games
Gary Sheffield: 3 games
Bernie Williams: 3 games
David Wright: 2 games
Tino Martinez: 2 games
Tony Womack: 2 games
Jorge Posada: 2 games
Ruben Sierra: 2 games
Mark Teixeira: 1 game
Robinson Cano: 1 game
Jason Giambi: 0 games
I have no idea how significant these numbers may be when measuring as early as June 10, but I thought it was interesting. At the end of the season, I did another count, but only for Yankees (sorry, but that was what served my purposes at the time). Here are those totals:
A-Rod: 16 Game Makers
Matsui: 15
Sheffield: 14
Jeter: 13
Giambi: 13
Posada: 11
Bernie: 11
Cano: 8
Tino: 5
Sierra: 4
Womack: 4
Flaherty: 2
Now, it occurred to me that, simply by virtue of opportunity, middle-of-the-order guys would have an advantage (as they do with RBIs generally). So, I attempted to calculate some sort of “clutchness” indicator that evened the field. Hence…
I am sure the more mathematically inclined SpoFi readers will have opinions on how best to level the Game Maker field. I figured I needed some denominator. I contemplated total games played and total number of games in which the player drove in runs, but that required too much extra, like, counting and stuff. I already had total RBIs in front of me, so I went with that. Good statistic? You tell me. Here are the final 2005 season ratios for the Yankees:
Womack: .267
Jeter: .186
Flaherty: .182
Bernie: .172
Posada: .155
Giambi: .149
Sierra: .138
Matsui: .129
Cano: .129
A-Rod: .123
Sheffield: .114
Tino: .102
As Babe Ruth is my witness, I did not pick RBIs to enhance Jeter’s numbers. I swear. It just worked out this way – by my Game Maker Ratio, Jeter was the most clutch regular player on the Yankees in 2005. But not as good as Womack. And Flaherty was better than Posada, so my dad won that argument.
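For anyone who wants to check my arithmetic, the ratio is just Game Makers divided by total RBIs. Here is a rough sketch for a handful of the players. The Game Maker counts come from the season totals above; the RBI totals are not listed in this post, so the ones in the code are assumed 2005 season figures supplied for illustration. The function name and the decision to return zero for a player with no RBIs are also my own assumptions.

```python
def game_maker_ratio(game_makers: int, total_rbis: int) -> float:
    """Game Makers per RBI; returns 0.0 when the player has no RBIs (assumed convention)."""
    if total_rbis == 0:
        return 0.0
    return game_makers / total_rbis


# (Game Makers, total RBIs). The RBI totals are ASSUMED 2005 figures, not from the post.
season = {
    "Womack": (4, 15),
    "Jeter": (13, 70),
    "Flaherty": (2, 11),
    "Posada": (11, 71),
    "A-Rod": (16, 130),
}

# Rank players from highest ratio to lowest.
ranked = sorted(
    season.items(),
    key=lambda item: game_maker_ratio(*item[1]),
    reverse=True,
)

for name, (gm, rbis) in ranked:
    print(f"{name}: {game_maker_ratio(gm, rbis):.3f}")
```

Dividing by RBIs rewards guys who drive in few runs but make them count, which is a big part of why a part-timer like Womack floats to the top.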
To the hardcore numbers guys, this may seem a pretty elementary exercise (if it’s worth any consideration at all), but I found it, well, amusing. With the large core of baseball fans on SportsFilter, I figured it was at least worth brief discussion. And, if there is anything to it, perhaps it is worth massaging and keeping an eye on as this season progresses and we continue the discussion on the meaning of “clutch.”
posted by BullpenPro to commentary at 04:23 PM