March 18, 2014

SportsFilter: The Tuesday Huddle:

A place to discuss the sports stories that aren't making news, share links that aren't quite front-page material, and diagram plays on your hand. Remember to count to five Mississippi before commenting in anger.

posted by huddle to general at 06:00 AM - 14 comments

Super-computer purchase by mystery MLB team to handle in-game analysis.

My favourite comment from Baseball Think Factory about this article:

"Open the bullpen door, HAL."
"I'm sorry, Ron [Washington]. I'm afraid I can't do that."
"What's the problem?"
"I think you know what the problem is just as well as I do."
"What are you talking about, HAL?"
"This mission is too important for me to allow you to jeopardize it."

posted by grum@work at 08:44 AM on March 18, 2014

Sooooo...it's against the rules to draw a dick on the field of play?

posted by NoMich at 11:20 AM on March 18, 2014

Spirit of the game!

posted by Etrigan at 12:44 PM on March 18, 2014

So is this another manifestation of the expression "putting your opponent's dick in the dirt"?

posted by Howard_T at 01:47 PM on March 18, 2014

I'm kind of surprised the supercomputer-for-real-time-analysis thing hasn't occurred earlier; I've been pretty vocal around here (and elsewhere) in the past, wondering when a tech-savvy team would basically start doing analysis in-game, even for things like detecting patterns in a pitcher/catcher battery that they could signal to their hitters.

An aside: I'm not super familiar with all the conceits of ML, but in general it is the case that we behave much more predictably than we imagine. If I recall, the rochambeau simulator that was the MF post I linked in the last paragraph was frighteningly good: it's hard not to imagine that if fed a huge historical set of data- even the last few years- of pitchFX pitches and outcomes, it could find trends that could be hugely impactful. Especially as things go south; the pitcher would get frazzled and shake off the catcher, the catcher and pitcher would try to fool you- but in the process be even more predictable (like in RPS when you keep throwing out one option, because you'd never do paper 4 times in a row!).
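To make the "we're more predictable than we imagine" idea concrete, here is a minimal sketch of the kind of frequency counting an RPS bot does, applied to pitches. It is only an illustration: the pitch sequence is made up, the function names `train` and `predict` are hypothetical, and a real system fed pitch-tracking data would need far more context (count, batter, game state) than the last two pitches.

```python
from collections import Counter, defaultdict

def train(history, order=2):
    """Count which symbol followed each run of `order` previous symbols."""
    table = defaultdict(Counter)
    for i in range(order, len(history)):
        context = tuple(history[i - order:i])
        table[context][history[i]] += 1
    return table

def predict(table, recent, order=2):
    """Return (most common next symbol, its observed share) for the last
    `order` symbols, or None if that context never appeared in training."""
    counts = table.get(tuple(recent[-order:]))
    if not counts:
        return None
    symbol, n = counts.most_common(1)[0]
    return symbol, n / sum(counts.values())

# Made-up sequence for illustration: F = fastball, C = curve, S = slider
pitches = list("FFCFFSFFCFFCFFSFFC")
model = train(pitches)
print(predict(model, ["F", "F"]))
```

Even this crude table picks up the habit baked into the toy data (two fastballs are usually followed by a curve), which is the same trick that makes the RPS bots hard to beat.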

You glance over, the manager gives you a set of signals that basically say "It's a 2-1 count, and there's an 82% chance the next pitch is a fastball lower in; sit dead red on that, and try to inside-out it the other way". Maybe it's a curve or slider this time, but over the course of a game/season each player could see meaningful- say even 3-5%- benefit in their batting average/OBP. Which... if a team BA goes up 30 points, they'd basically worst-to-first their entire offense.
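A quick back-of-the-envelope check on those numbers, assuming a .250 team average as a made-up baseline: how far "3-5%" goes depends on whether it is read as a relative bump or as percentage points. The helper name `points_gained` is just for illustration.

```python
def points_gained(baseline_avg, relative_gain):
    """Batting-average points (thousandths) added by a relative improvement."""
    return baseline_avg * relative_gain * 1000

baseline = 0.250  # assumed team batting average, for illustration only

# Reading "3-5%" as a relative improvement on the baseline
for gain in (0.03, 0.05):
    print(f"{gain:.0%} better -> +{points_gained(baseline, gain):.1f} points")

# Relative gain needed for a 30-point jump from the same baseline
needed = 0.030 / baseline
print(f"30 points needs a {needed:.0%} relative gain")
```

Read as percentage points instead (3 points of average per 100 at-bats), a 3% gain is exactly the 30-point jump; read as a relative gain, it is closer to 8-13 points on this assumed baseline.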

posted by hincandenza at 02:33 PM on March 18, 2014

I've got two words regarding Phil Jackson's move to New York: Isiah Thomas.

posted by phaedon at 03:02 PM on March 18, 2014

I'm kind of surprised the supercomputer-for-real-time-analysis thing hasn't occurred earlier

Perhaps it has. My initial reaction was shock that Cray still sold boxes. I'd think it would be both more cost effective and less of a tip off to other teams if you did this with a cluster of off-the-shelf hardware. Plus you wouldn't be beholden to Cray when the thing has problems.

posted by yerfatma at 03:28 PM on March 18, 2014

Sooooo...it's against the rules to draw a dick on the field of play?

Can they take away the penalty and leave the swelling?

posted by beaverboard at 03:50 PM on March 18, 2014

You glance over, the manager gives you a set of signals that basically say "It's a 2-1 count, and there's an 82% chance the next pitch is a fastball lower in; sit dead red on that, and try to inside-out it the other way". Maybe it's a curve or slider this time, but over the course of a game/season each player could see meaningful- say even 3-5%- benefit in their batting average/OBP. Which... if a team BA goes up 30 points, they'd basically worst-to-first their entire offense.

It doesn't take a super-computer to do that. You could put together an off-the-rack machine for 1% of the cost that could do that.

I suspect that someone on the MLB team is doing a favour for his buddy in the Cray corporation. $500,000 isn't that big of a deal to an MLB team (it's about the cost of a replaceable player for a season), so the money won't really bother them (unless it's Miami). But selling one of these would definitely be a feather-in-the-cap for the Cray guy.

posted by grum@work at 09:21 PM on March 18, 2014

Right, although for all we know part of that $500,000 is the software/contract for expertise in running the system. A typical MLB team is going to have an IT department... but probably not one that has any real expertise in creating, configuring, and running distributed clusters on commodity hardware, or running ML algorithms.

I mean, I figure I could- after quite a few months- build just such a system, having at least some familiarity with Hadoop et al and some open source ML packages. But there'd be a ton of work to customize it to baseball and baseball stats, then test it with the data inputs (even if just cleaning and inputting several years of pitchFX data). I'd go slower working alone, and if I brought in a couple of dev friends to help out, that $500K goes quickly in terms of salary + hardware.

For a deep-pocketed team, getting something that's a turnkey solution for barely more than a single league-minimum salary is still a good deal, and in this case, as with business computing purchases, the CYA and outsourced expertise is worth the extra money spent, compared to hoping you can find someone to bring in-house to do the work for you.

Maybe they're paying for something turnkey enough that their existing IT staff can then just dump in new pitchFX data regularly, as well as learn how to construct certain types of queries that they can then extend to common analysis operations. On the hopefully rare occasions that things go completely cockeyed beyond the existing maintenance contract, you pay Cray a relative pittance for support.

All that aside, I think it's fantastic to see a team (I wouldn't be surprised if it's the Red Sox, as they've been in the statistical and technological forefront for a decade) go to this next level. Which means we'll see other teams get there soon enough, although MLB may still have some restrictions on technology allowed in the dugout. If this provides a huge advantage, we might instead see a rule passed by the owners disallowing real-time analysis and feedback directly to the players (via signals).

posted by hincandenza at 09:57 PM on March 18, 2014

You glance over, the manager gives you a set of signals that basically say "It's a 2-1 count, and there's an 82% chance the next pitch is a fastball lower in; sit dead red on that, and try to inside-out it the other way".

It doesn't take a super-computer to do that.

It could be done with some quick laptops using an OS other than Windows. The thing that might make it less than possible is the requirement for the manager to be something of a contortionist with great hand-eye coordination in order to get that many signals to a batter in the limited time available.

I can see it now: Batter steps out, 3rd base coach goes through multiple motions of head scratching, shirt pulling, belt buckle touching, arm touching, hand clapping, and hopping up and down. Batter indicates a lack of understanding. Coach starts sequence again. Umpire orders batter back into the box. Batter takes his time. Umpire says to tell that bozo up the line to send you an e-mail if he has that much to say. If you don't stand in, I'll call a strike on you, and he can try some more contortions.

To be serious, a good fiber-optic network setup with 5 or 6 work stations, each dedicated to looking at one or two sets of data can very easily return pretty good answers in short order. One of the stations would be located in the dugout or in close proximity thereto, with the rest being located wherever you pleased. When we were doing system test a few years ago we had anywhere from 10 to 12 laptops running the Solaris OS all networked with a tower computer, also running Solaris, serving as the controller. Most of the laptops were remote and unmanned - about 5 miles from our control station, and their data recording was started and stopped from the controller. The rest of them were collecting data for storage and later analysis. The setup worked really well, and once we had gotten a good handle on what we could do, it reduced our testing and data reduction time by a factor of 2.

posted by Howard_T at 10:18 PM on March 18, 2014

I'd actually bet this is something stranger than we're imagining - as has been pointed out you don't need this sort of hardware for the obvious analysis.

The new Trackman-based playtracking system from MLBAM seems to be radar based; maybe someone is trying to process streaming data from multiple radar sources to determine how effective a pitcher is in real time - spin and plane of their curve, actual path of their fastball. Know if your starter has his stuff before you find out the hard way (or the opposite, maybe the other guy can't get his breaking ball working and it's time to sit on the fastball).

posted by deflated at 11:04 PM on March 18, 2014

Ah, that's a really good point, deflated. I've long thought that analysis of pitch motion etc- even during warmups- would tell you when a pitcher is more likely to be lit up, and an adventurous team would a) do last minute rotation adjustments when their intel says a pitcher will suck, and b) start looking for patterns to correlate success and failure. Temperature? Diet? Humidity? Sleep cycles? Stress? Something leads a pitcher to be more or less effective night over night. It was more obvious for knucklers, when temperature differences or indoor/outdoor stadiums could reliably predict success, but it still applies for all other pitchers, who are not consciously aware why today their splitter is not dropping as much.

And as the article on the new system hints, we might be entering a golden age of defensive metrics, picking up things like average distance to a ball in play (the positioning instincts) along with speed, efficiency of motion, etc, all to lead to better defensive judgment than the "pasta diving Jeter" gut reactions of the past.

posted by hincandenza at 01:54 AM on March 19, 2014

The new Trackman-based playtracking system from MLBAM seems to be radar based

It would take some sophisticated radar to look at a baseball in flight and determine its track, velocity, and rotational characteristics. True enough, there are radar systems that can track bullets and recognize them as a threat within microseconds, but these are costly and most are available for military use. The problem is one of target resolution. The higher the pulse repetition frequency (PRF) of a system, the more resolution you gain. In order to put enough energy on the target to get a useful return, you need a greater pulse width (PW). Thus, PRF is limited by the PW required. There are also limits caused by keeping the leading edge of the pulse as sharp as possible, and this becomes more difficult as PRF increases. This is mitigated somewhat by using higher frequency radars, and higher frequencies also lend themselves to smaller equipment. One could use a continuous wave (CW) transmitter (on all the time), and apply the pulse train to the receiver. Switching becomes a bit easier. There's a problem with CW radars in the defense industry in that they are rather susceptible to homing missiles, but I don't think too many MLB teams will resort to that.
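For anyone who wants to play with the trade-offs, here is a rough sketch of the standard textbook relations for a simple unmodulated pulsed radar: range resolution follows from pulse width, while PRF sets the unambiguous range and the unambiguous Doppler span, and the two together set the duty cycle. The numbers at the bottom are made up for illustration and are not the parameters of any real tracking system.

```python
C = 299_792_458.0  # speed of light in m/s

def range_resolution(pulse_width_s):
    """Smallest range separation a plain pulse can resolve: c * tau / 2."""
    return C * pulse_width_s / 2

def unambiguous_range(prf_hz):
    """Farthest range whose echo returns before the next pulse: c / (2 * PRF)."""
    return C / (2 * prf_hz)

def duty_cycle(pulse_width_s, prf_hz):
    """Fraction of time the transmitter is on; it cannot exceed 1."""
    return pulse_width_s * prf_hz

def max_unambiguous_speed(prf_hz, carrier_hz):
    """Largest radial speed measurable without Doppler aliasing: lambda * PRF / 4."""
    wavelength = C / carrier_hz
    return wavelength * prf_hz / 4

# Illustrative values only (assumptions, not a real system's specs)
pw = 10e-9         # 10 ns pulse width
prf = 100e3        # 100 kHz pulse repetition frequency
carrier = 10.5e9   # 10.5 GHz carrier

print(f"range resolution:   {range_resolution(pw):.2f} m")
print(f"unambiguous range:  {unambiguous_range(prf):.0f} m")
print(f"duty cycle:         {duty_cycle(pw, prf):.4f}")
print(f"max radial speed:   {max_unambiguous_speed(prf, carrier):.0f} m/s")
```

The duty-cycle line is where pulse width caps PRF: a longer pulse puts more energy on the ball but leaves less room to raise the repetition rate.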

Laser tracking systems will work a lot better than radar, but they need to be pointed in order to identify the target to be tracked. The laser beam is very narrow, so if the object it is looking for is not within the beam to start with, chances are that no track will be established. A combination system could be used, whereby the radar finds the object first and slews the laser onto it, but I think you realize the complexity here.

It all comes down to a matter of cost vs results. I have seen the displays in various places that show the pitch velocity, amount of horizontal movement and amount of vertical movement. I seem to remember somewhere seeing a supposed rotational speed as well, but I'm not too sure of this, nor am I sure of the technique that was used to establish the rotational speed measurement. The equipment is certainly available to produce the desired measurements, but is the cost prohibitive enough to keep it away from general use? Weird RF engineer that I am, or at least was, I would love to see this stuff happen just so I could look at the equipment and drool a lot.

posted by Howard_T at 03:25 PM on March 19, 2014
