LD % and BABIP
- skmsw
- Perennial All-Star
- Posts: 6341
- Joined: April 18 06, 7:12 pm
- Location: The Hub
BABIP = batting average on balls in play
A "ball in play" is a fair ball that when first hit has the opportunity to become either a hit or an out -- that is, no strikeouts, no walks, no home runs, no HBPs. Across the major leagues, the "average" BABIP is close to .300, but interpreting variances from this "average" must be done very cautiously -- different hitters have very different BABIPs that are normal for them.
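For anyone who wants to compute it, here is a minimal sketch of the usual BABIP formula. One assumption to flag: the common version also counts sacrifice flies as balls in play, which the definition above doesn't mention. The function name and the stat line are made up for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play.

    Walks and HBPs are not at-bats, so they are already excluded
    by starting from at_bats. Sacrifice flies are added back
    (an assumption; the post above does not mention them).
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    # Home runs are hits, but not balls in play, so remove them from both sides
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, not a real player
print(round(babip(hits=180, home_runs=30, at_bats=600, strikeouts=100, sac_flies=5), 3))
# prints 0.316
```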
LD = percentage of balls put in play that are hit for line drives
There is an obvious subjective element to this, and a slight inaccuracy. "Line drives" are determined by spotters who watch games and enter play-by-play data for the different agencies that track them. So what one spotter considers a line drive, another might consider a fly out (subjectivity). And a softly hit liner to the shortstop counts as a line drive but has the properties of a pop-out (inaccuracy). Including vector data (speed, trajectory, distance, direction) decreases the subjectivity quite a bit and allows for more penetrating analysis, but it is not widely available. In the major leagues, 18-21% of a good hitter's balls in play are typically line drives.
In general -- very general --
10% of fly balls in play are hits
33% of ground balls are hits
60-70% of line drives are hits
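To see what those rough rates imply, here is a small sketch that turns a batted-ball mix into an expected BABIP. It uses only the rules of thumb above, with .65 as the midpoint of the 60-70% line drive range (my assumption). The function name and the two sample mixes are hypothetical.

```python
# Rough hit rates by batted-ball type, from the rules of thumb above.
# The line drive figure is the midpoint of the 60-70% range (an assumption).
HIT_RATE = {"fly": 0.10, "ground": 0.33, "line": 0.65}


def expected_babip(fly_share, ground_share, line_share):
    """Weighted average hit rate for a mix of balls in play (shares sum to 1)."""
    total = fly_share + ground_share + line_share
    if abs(total - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return (fly_share * HIT_RATE["fly"]
            + ground_share * HIT_RATE["ground"]
            + line_share * HIT_RATE["line"])


# Two hypothetical hitters who differ only by two points of LD%
print(round(expected_babip(0.38, 0.44, 0.18), 4))  # prints 0.3002
print(round(expected_babip(0.36, 0.44, 0.20), 4))  # prints 0.3112
```

With these rough rates, trading two percentage points of fly balls for line drives is worth about 11 points of expected BABIP.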
- greenback44
- Hall Of Famer
- Posts: 11664
- Joined: June 26 06, 8:54 pm
- Location: In a Small Town with Jack and Diane
Kyle wrote: Isn't a "line drive" arbitrary? How do you determine what is a line drive and what isn't? Like how is it technically defined when finding the LD percentage?
STATS and BIS (and others) define line drives on a case-by-case basis. The arbitrariness is enough of a problem that one of em (I forget which) introduced a "fliner" category.
- jim
- Red Lobster for the seafood lover in you
- Posts: 50393
- Joined: May 1 06, 2:41 pm
Kyle wrote: Isn't a "line drive" arbitrary? How do you determine what is a line drive and what isn't? Like how is it technically defined when finding the LD percentage?
greenback44 wrote: STATS and BIS (and others) define line drives on a case-by-case basis. The arbitrariness is enough of a problem that one of em (I forget which) introduced a "fliner" category.
"Fliner" is a John Dewan invention, so that would be STATS.
- Hungary Jack
- Mother Earth
- Posts: 19537
- Joined: July 24 06, 6:03 am
- Location: In Cognito
skmsw wrote: LD = percentage of balls put in play that are hit for line drives
There is an obvious subjective element to this, and a slight inaccuracy. "Line drives" are determined by spotters who watch games and enter play-by-play data for the different agencies that track them. So what one spotter considers a line drive, another might consider a fly out (subjectivity). And a softly hit liner to the shortstop counts as a line drive but has the properties of a pop-out (inaccuracy). Including vector data (speed, trajectory, distance, direction) decreases the subjectivity quite a bit and allows for more penetrating analysis, but it is not widely available. In the major leagues, 18-21% of a good hitter's balls in play are typically line drives.
This is the key. I would imagine that there is a set of correlations between distance traveled and duration of flight that distinguishes line drives from fly outs, popups, fliners, etc. quite definitively.
- Asmodai
- Veteran Player
- Posts: 1121
- Joined: February 9 07, 7:37 pm
Hungary Jack wrote: It would be cool if some of our stat gurus could plot BABIP vs. LD% and determine correlation and R-squared.
Will do. This first graph has hitters' LD% on the x-axis and hitters' BABIP on the y-axis. Each data point represents one player-season from 2004, 2005, or 2006 by a hitter who qualified for the batting title. This gave us 443 data points.

There's a general correlation. It's likely that speed is another factor, as well as park and GB rate. This next graph uses the same sample, limited to players who met the cutoff in consecutive seasons. Year 1 LD% is on the x-axis and year 2 LD% is on the y-axis. Obviously there are only three years, so 2005 showed up a lot. There were 178 such pairs of seasons.
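In case anyone wants to rebuild the pairs, here is a rough sketch of how consecutive seasons could be matched up. The data layout (a dict keyed by player and year), the function name, and the sample values are all hypothetical and are not the actual data behind the graph.

```python
def consecutive_pairs(seasons):
    """Pair each qualifying season with the same player's next season.

    seasons: dict mapping (player, year) -> LD% for qualifying seasons only.
    Returns a list of (player, year1, ld_year1, ld_year2) tuples.
    """
    pairs = []
    for (player, year), ld in sorted(seasons.items()):
        next_ld = seasons.get((player, year + 1))
        if next_ld is not None:
            pairs.append((player, year, ld, next_ld))
    return pairs


# Made-up numbers for two made-up players
seasons = {
    ("A", 2004): 0.19, ("A", 2005): 0.21, ("A", 2006): 0.18,
    ("B", 2004): 0.20, ("B", 2006): 0.22,  # B missed the cutoff in 2005
}
for row in consecutive_pairs(seasons):
    print(row)
```

Player A contributes two pairs (2004-05 and 2005-06), so the 2005 season is used twice, which is why 2005 shows up so often with only three years of data. Player B contributes none.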

I cannot stress enough how important this graph is. It's saying that there's no year-to-year correlation in a player's ability to hit line drives. It's similar to saying that if Albert Pujols hits .330 one year and Yadier Molina hits .230, then the next season Molina is just as likely as Pujols to win the batting title. While a high LD% usually leads to more hits, a high LD% doesn't appear to be much of a consistent skill for a hitter.
If you do pitchers you're going to get similar results. You'll have a little bit of correlation between BABIP and LD% in any given season, but essentially no correlation in BABIP or LD% from year to year, which is vital from a projection standpoint. That's why HR power, groundball rates, strikeout rates, walk rates, and even speed (transient as it is for a hitter) are so vital for projections for hitters and pitchers. They're less likely to change from season to season, although you generally lose foot speed as you age, which lowers your expected BABIP as you beat out fewer ground balls.
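For anyone who wants to check the correlation and R-squared on their own data, here is a minimal sketch in plain Python. It is not the code used for the graphs above; the function name is mine and the inputs are toy numbers. For a straight-line fit with one predictor, R-squared is just the square of the Pearson correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy inputs only: xs would be LD% (or year 1 LD%), ys BABIP (or year 2 LD%)
r = pearson_r([1, 2, 3, 4], [1, -1, -1, 1])
print(r, r ** 2)  # r is 0 here, so R-squared is 0 as well
```

Feeding it LD% and BABIP for the same season gives the first graph's correlation; feeding it year 1 and year 2 LD% from the consecutive-season pairs gives the second.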
- greenback44
- Hall Of Famer
- Posts: 11664
- Joined: June 26 06, 8:54 pm
- Location: In a Small Town with Jack and Diane
I've been wondering about the value of LD%. Oh, well.
FoxSports (one word!) bought out Dewan at STATS. He's at BIS now.
- Hungary Jack
- Mother Earth
- Posts: 19537
- Joined: July 24 06, 6:03 am
- Location: In Cognito
Hungary Jack wrote: It would be cool if some of our stat gurus could plot BABIP vs. LD% and determine correlation and R-squared.
Mephistopheles wrote: Will do. This first graph has hitters' LD% on the x-axis and hitters' BABIP on the y-axis. Each data point represents one player-season from 2004, 2005, or 2006 by a hitter who qualified for the batting title. This gave us 443 data points.
There's a general correlation. It's likely that speed is another factor, as well as park and GB rate. This next graph uses the same sample, limited to players who met the cutoff in consecutive seasons. Year 1 LD% is on the x-axis and year 2 LD% is on the y-axis. Obviously there are only three years, so 2005 showed up a lot. There were 178 such pairs of seasons.
I cannot stress enough how important this graph is. It's saying that there's no year-to-year correlation in a player's ability to hit line drives. It's similar to saying that if Albert Pujols hits .330 one year and Yadier Molina hits .230, then the next season Molina is just as likely as Pujols to win the batting title. While a high LD% usually leads to more hits, a high LD% doesn't appear to be much of a consistent skill for a hitter.
If you do pitchers you're going to get similar results. You'll have a little bit of correlation between BABIP and LD% in any given season, but essentially no correlation in BABIP or LD% from year to year, which is vital from a projection standpoint. That's why HR power, groundball rates, strikeout rates, walk rates, and even speed (transient as it is for a hitter) are so vital for projections for hitters and pitchers. They're less likely to change from season to season, although you generally lose foot speed as you age, which lowers your expected BABIP as you beat out fewer ground balls.
Great stuff. Thank you, thank you, thank you.
I was suspecting/hoping that the R-squared value would be higher, but it makes sense given that graph 2 essentially establishes that there is no such thing as a "line drive hitter", and that BABIP can vary significantly from year to year.
- Phyrkrakr
- All-Star
- Posts: 1515
- Joined: January 15 07, 2:51 pm
- Location: St. Louis
- Asmodai
- Veteran Player
- Posts: 1121
- Joined: February 9 07, 7:37 pm
Phyrkrakr wrote: So, are the graphs above saying that there aren't any real line drive hitters in the league, or that line drive percentage doesn't really correspond to BABIP?
The first one suggests that line drive percentage corresponds slightly to BABIP for a hitter. The second one suggests there aren't any consistent line drive hitters in the league.
Ironically enough, I'd expect LD% and BABIP to correlate more strongly for pitchers, since that factors out speed to a certain extent.




