
What Statcast’s New Bat Tracking Data Does and Doesn’t Tell Us



The bat tracking era is here, and nothing will ever be the same again. Wait, no, that’s not right. Baseball is going to continue pretty much exactly as it was. Pitchers will throw the ball, hitters will swing at it, and then people will run around the field either trying to catch it or touch bases. But baseball analysis is going to start looking different, because we analysts have new shiny toys and a plethora of new ideas to test out. That’s very exciting, and also possibly a little overwhelming. So today, I thought I’d take you on a tour of what the high-level summary numbers do and don’t say about hitting, as well as stump for more granular analysis. I’m sure I’m not alone on either of those points, but still, it’s good to say it out loud. So let’s talk about average swing speed, average swing length, squared-up rate and blast rate, shall we?

Swing harder, do better, right? Well, maybe. That makes sense broadly, and it particularly makes sense when you look at some of the names dotting the top of the swing speed leaderboard. Juan Soto, Aaron Judge, Yordan Alvarez, William Contreras, Mike Trout, Shohei Ohtani, Gunnar Henderson — there are plenty of hitters at the top of the swing speed leaderboard who are unquestionably excellent.

Pop down to the bottom, though, and you could make a pretty great offensive team out of the soft swingers too: Luis Arraez, Steven Kwan, Justin Turner, Marcus Semien, Isaac Paredes, Will Smith, Jose Altuve. In aggregate, there simply isn’t much correlation between average swing speed and offensive production, as measured by wRC+. More specifically, there’s a 0.11 correlation coefficient between swing speed and wRC+. That means, broadly speaking, that variation in swing speed explains only about 1% of the variation in wRC+ (0.012 r-squared). A quick note: I used an 80 PA cutoff for this and all subsequent calculations in the article, just so we’re comparing apples to apples.
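For readers who want to check the arithmetic, here is a minimal sketch of how a correlation coefficient turns into the "share of variation explained" figure quoted above. It assumes a simple one-variable linear relationship, where r-squared is just the square of the correlation coefficient; the function name is made up for illustration.

```python
def r_squared(r: float) -> float:
    """Share of variance explained in a one-variable linear fit with correlation r."""
    return r * r


# Correlation between average swing speed and wRC+, as quoted in the article.
swing_speed_vs_wrc_plus = 0.11

explained = r_squared(swing_speed_vs_wrc_plus)
print(f"r = {swing_speed_vs_wrc_plus:.2f}, r-squared = {explained:.3f}")
```

Squaring shrinks small coefficients quickly, which is why a correlation that sounds modest explains almost none of the variation.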

That’s obvious if you stop and think about it. Baseball isn’t a fast swing competition. Swinging fast helps, obviously. But Giancarlo Stanton isn’t the best hitter in baseball history despite almost certainly being the hardest swinger, and Luis Arraez isn’t the worst hitter in baseball, or even close to it. Swing speed is merely one data point that describes a little bit of what a hitter does at the plate.

Swing length is another interesting new data point. Just like swing speed, you can draw some obvious conclusions without even looking through the data. A longer swing means more strikeouts, right? Well…

Swing length and strikeout rate have a 0.277 correlation coefficient and a 0.077 r-squared. That’s not bad. You can do better by looking at swing length against whiffs per swing, which strips out strike zone judgment (not what we’re looking for here). That gets us an r-squared of 0.152, which is better on a relative basis but still not huge; it’s about the same as the year-over-year correlation of BABIP, which we know is quite noisy. Both average swing speed and squared-up rate (in the opposite direction) have stronger correlations to whiffs per swing, in fact. But it’s generally true that slower, shorter swings that prioritize getting the head of the bat on the ball result in fewer strikeouts.

Swing length is also heavily correlated to pull rate. If you’re trying to hit the ball in front of the plate to pull the ball, your bat will naturally travel farther, even with the exact same swing mechanics, than if you meet the ball earlier in your swing. I don’t have a great way of controlling for this yet, but it’s plausible that by controlling for batted ball tendency, you could find even stronger relationships between swing length and contact rate. That’s not to say that a long swing is bad, just that it comes with tradeoffs.

Let’s get back to bat speed. It’s clear from an initial inspection that you can’t say a ton about a hitter’s overall performance just by looking at how fast they swing. There are still some things to learn here, though. For example: The harder your average swing, the more damage you do on contact in general. I’ve listed enough correlation coefficients to put even the most stat-obsessed readers to sleep, so at this point I’m just going to show a grid of all of them and be done with it:

A Big Pile of Bat Tracking Correlation Coefficients

Statistic               wRC+     K%       Whiff/Swing   BABIP    wOBACON   xwOBACON
Average Swing Speed     0.110    0.406    0.459        -0.007    0.351     0.510
Hard Swing Rate         0.164    0.301    0.376         0.016    0.357     0.511
Swing Length           -0.013    0.277    0.390        -0.087    0.148     0.186
Squared-Up%             0.142   -0.667   -0.664        -0.019   -0.184    -0.167
Blast%                  0.361    0.007    0.043         0.111    0.381     0.573
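One way to read a grid like this is to ask, for each outcome, which bat tracking statistic has the largest correlation in absolute value. The sketch below does that with the coefficients copied from the grid; the variable and function names are made up for illustration, and picking the largest absolute correlation is only a rough reading aid, not a claim about cause.

```python
# Correlation coefficients copied from the grid above, one list per statistic,
# ordered to match OUTCOMES.
OUTCOMES = ["wRC+", "K%", "Whiff/Swing", "BABIP", "wOBACON", "xwOBACON"]
GRID = {
    "Average Swing Speed": [0.110, 0.406, 0.459, -0.007, 0.351, 0.510],
    "Hard Swing Rate": [0.164, 0.301, 0.376, 0.016, 0.357, 0.511],
    "Swing Length": [-0.013, 0.277, 0.390, -0.087, 0.148, 0.186],
    "Squared-Up%": [0.142, -0.667, -0.664, -0.019, -0.184, -0.167],
    "Blast%": [0.361, 0.007, 0.043, 0.111, 0.381, 0.573],
}


def strongest_metric(outcome: str) -> tuple[str, float]:
    """Return the statistic with the largest |r| for the given outcome, and its r."""
    col = OUTCOMES.index(outcome)
    name = max(GRID, key=lambda metric: abs(GRID[metric][col]))
    return name, GRID[name][col]


for outcome in OUTCOMES:
    metric, r = strongest_metric(outcome)
    print(f"{outcome:<12} {metric:<20} r = {r:+.3f}  r-squared = {r * r:.3f}")
```

Read this way, squared-up rate is the strongest (negative) correlate of the two swing-and-miss columns, while blast rate leads every other column, including BABIP, where nothing is correlated much at all.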

Why is swing speed more closely correlated to xwOBACON than wOBACON? Two reasons. First, we’re dealing with small samples across the board and production on contact is noisy, which means that a hitter with a ton of seeing-eye singles on mishits can mess up the data. Second, wOBA cares a lot about the horizontal angle of your hits, but neither xwOBA nor swing speed does. Isaac Paredes doesn’t swing hard, and doesn’t hit the ball particularly hard as a result. His xwOBACON and swing speed agree. But he’s dumping those batted balls over the left field fence for home runs, and wOBA knows that.

I suspect that raw bat speed data is going to end up a lot like other raw pitch and exit velocity data: interesting but incomplete. I didn’t need a leaderboard to tell me that Giancarlo Stanton swings harder than any other player in baseball, because I have seen Giancarlo Stanton swing before. It’s really cool that we’re now measuring this, and I’m sure you’ll hear it on broadcasts all the time going forward, but the link to production is tenuous enough that I don’t think it’s a great statistic all by itself.

Now onto the slightly more complicated statistics: squared-up rate and blast rate, which measure hitters’ ability to hit the ball right on the nose, and do so with high swing speed in the case of blasts. Squared-up rate seems like an obviously great metric right off the bat. When you hit the ball right on the sweet spot, you’d expect a lot more line drives. After all, Luis Arraez is the king of soft line drives and also the king of squared-up rate. Just one problem: there’s no correlation between squared-up rate and line drive rate, at least in 2024 data.

Now, maybe that’s just a sample size issue. Line drive rate is noisy even at a seasonal level, never mind after a month and a half of play. If you zoom in closer, the effect seems real. Batted balls that Statcast categorizes as squared up carry a 28% line drive rate so far this year; balls that aren’t squared up have a 19.1% mark. The issue here is that for a hitter who increases their squared-up rate by five percentage points, we’re talking about an increase of 0.4 percentage points of line drive rate. Half the players in baseball have a squared-up rate between 22% and 29.4%. The differences here are small. Beyond that, squared-up rate is nearly uncorrelated to BABIP, wOBACON, and xwOBACON. Should we just give up on squared-up rate?
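The back-of-the-envelope math in that paragraph can be sketched as a weighted average. This is an illustration under two simplifying assumptions that are mine, not the article's: squared-up rate is treated as a share of batted balls, and a batted ball's line drive chance depends only on whether it was squared up. The function names are hypothetical.

```python
# Line drive rates quoted in the article for 2024 batted balls.
LD_RATE_SQUARED_UP = 0.28
LD_RATE_NOT_SQUARED_UP = 0.191


def expected_ld_rate(squared_up_share: float) -> float:
    """Blend the two line drive rates by the share of squared-up batted balls.

    ASSUMPTION: squared_up_share is a fraction of batted balls, and nothing
    else about the hitter affects line drive rate.
    """
    return (squared_up_share * LD_RATE_SQUARED_UP
            + (1.0 - squared_up_share) * LD_RATE_NOT_SQUARED_UP)


def ld_rate_gain(before: float, after: float) -> float:
    """Change in expected line drive rate when squared-up share moves."""
    return expected_ld_rate(after) - expected_ld_rate(before)


# A five-percentage-point jump in squared-up rate.
print(f"{ld_rate_gain(0.22, 0.27) * 100:.2f} percentage points")
# The spread covering the middle half of hitters (22% to 29.4%).
print(f"{ld_rate_gain(0.22, 0.294) * 100:.2f} percentage points")
```

Under those assumptions the gain is the change in squared-up share times the 8.9-point gap between the two line drive rates, which is where the roughly 0.4 percentage point figure comes from.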

I don’t think so, despite those uninspiring numbers. Squared-up rate and average swing speed are quite negatively correlated with each other, and in the logical way: the harder you swing, generally speaking, the less frequently you hit the ball square. That’s why Juan Soto’s combination of fearsome swings and great contact is so impressive. If you run both swing speed and squared-up rate through a multivariate linear regression against measures of production on contact, they’re both significant. In other words, with the other held constant, swinging harder and squaring the ball up more frequently both increase production.

Without digging too deep into the statistical minutiae, these two statistics are so correlated with each other that I don’t have a lot of confidence in that regression. Even after you correct for that multicollinearity, though, there’s a clear relationship: swing the bat harder or make optimal contact more frequently, and you’ll tend to do better on contact. Those two things still don’t explain all or even most of a hitter’s production on contact. There’s plenty more to it than just swinging hard and catching the ball on the barrel of the bat. That’s a milquetoast conclusion, sure, but it’s still a useful one to me; it’s good to make sure that two plus two is four before you start on differential equations.
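To see how two correlated predictors interact in a regression like this, here is a rough sketch using the textbook formula for standardized coefficients in a two-predictor linear model. It is not the author's actual regression. The correlations with xwOBACON come from the grid earlier in the article, but the article never states the correlation between swing speed and squared-up rate, so the `r_12` values below are HYPOTHETICAL placeholders chosen only to show the pattern.

```python
def standardized_betas(r_y1: float, r_y2: float, r_12: float) -> tuple[float, float]:
    """Standardized coefficients for y ~ x1 + x2 from the three pairwise correlations."""
    denom = 1.0 - r_12 ** 2
    if denom <= 0:
        raise ValueError("predictors are perfectly collinear")
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


def variance_inflation(r_12: float) -> float:
    """Variance inflation factor for either predictor in a two-predictor model."""
    return 1.0 / (1.0 - r_12 ** 2)


# From the article's grid: correlations with xwOBACON.
R_XWOBACON_SPEED = 0.510
R_XWOBACON_SQUARED_UP = -0.167

# HYPOTHETICAL correlations between swing speed and squared-up rate.
for r_12 in (0.0, -0.3, -0.5, -0.7):
    b_speed, b_squared = standardized_betas(R_XWOBACON_SPEED, R_XWOBACON_SQUARED_UP, r_12)
    print(f"r_12 = {r_12:+.1f}  beta_speed = {b_speed:+.3f}  "
          f"beta_squared_up = {b_squared:+.3f}  VIF = {variance_inflation(r_12):.2f}")
```

With the grid's values, the squared-up coefficient flips from negative to positive once the two predictors' correlation is more negative than about -0.33, which is one way a statistic that looks useless on its own can become helpful alongside swing speed. The variance inflation factor grows as the predictors become more correlated, which is the multicollinearity concern in a nutshell.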

The same is generally true of blast rate, the rate at which a hitter squares the ball up while swinging hard. It does a bit better because it’s capturing what I was talking about up above; harder swings square the ball up less frequently in general, so you want both when you’re looking for production. But again, plenty of other variables go into this as well. As you might expect, swinging your bat hard and squaring the ball up are at their best when it comes to producing solid contact at positive launch angles.

That sounds a lot like launch angle and exit velocity, and we know that those inputs do a good job, but not a perfect job, of explaining production. As best as I can tell, that’s just a fundamental limitation of statistics like this that isolate a small portion of what’s involved in hitting a baseball. There are plenty of other things you can do to generate value, and some of them might even decrease your swing speed or squared-up rate, which will forever frustrate analysis.

Okay, so we know that neither raw swing speed nor squared-up rate does a great job of predicting overall production. What can we learn from this new bat tracking data, then? First of all, it’s just incredibly cool that it exists. Hitters can and should behave differently based on their swing speed, and now we can quantify that more than ever before.

More importantly, the neat part of this data is largely in granular interpretation. I find it fascinating that in-zone swings at secondary pitches are meaningfully faster, on average, than in-zone swings at fastballs. Hitters swing faster at in-zone fastballs when they’re ahead in the count than when they’re behind in the count; that makes intuitive sense, but we have the actual evidence now. Before, you could say something like, “Hitters can sit on a fastball thrown to a particular area when they’re ahead in the count and unload if they get it, but they have to react when they’re defending the zone,” but now you can prove it. They do much better on those early-count swings, which we already knew; now we just know why with more certainty.

Here’s a fun one: Early-count secondary pitches get squared up less frequently than two-strike ones, but at higher bat speeds. There’s a strong intuitive pattern here. Hitters slow their swing and prioritize contact when they get behind in the count. But even though they’re squaring the ball up slightly more frequently, that squared-up contact is less productive – they’re swinging more slowly, after all.

When we get more months and years of bat tracking data, the applications will only increase. Is a hitter cold because that’s how hitters get sometimes, or is he physically compromised? Is that new shorter swing making up for its lower bat speed with better contact numbers? Has that aging veteran remade his swing to prioritize contact now that his bat speed is flagging? Mike Petriello is already talking about new applications, too: attack angle and miss distance will put swing speed data in much better context, and I’m excited to get them in the fold.

There are surely some cool applications on the pitching side, too; obviously pitchers don’t do a ton to affect bat speed, but “avoiding the fat part of the bat” has long been the holy grail of contact managers. Only 11.1% of swings at Hunter Harvey’s fastball square the ball up, while 34.4% of swings at Adrian Houser’s fastball do. That sounds like an amazing discovery, but we’ll need to see how stable these statistics are to really know for sure.

When we find ways to measure something previously unmeasurable, it’s tempting to ascribe great untapped analytical power to those things. And to be clear, I think that there are going to be some cool advances in public-side analysis that wouldn’t have been possible without this data. One thing that almost certainly won’t advance public-side analysis, though, is asking, “Oh hey, who swings the hardest?” and leaving it at that.

So go out and have fun looking at this new information, and reading the interpretations and musings of people like me who are trying to find some new stories to tell with it. But be aware of the inherent limitations. Bat tracking isn’t enough to tell you who’s good and who isn’t, and it doesn’t have to be. We already have a bunch of ways of measuring that. Now, we’re just expanding our horizons a bit more.

Source

https://blogs.fangraphs.com/what-statcasts-new-bat-tracking-data-does-and-doesnt-tell-us/