What books should I read if I want to get into sports analytics? Where is a good place to start?
There are certainly sport-specific books that I could list that help you understand the terminology and thought patterns in any given sport, but I’ll save that for another time. More important, in my mind, are books that can teach you how to think through a statistical lens.
Sometimes sports analytics gets reduced to whether you know how to code, or run a regression, or use machine learning methods to make predictions. And while those are skills that are helpful, they are worthless (or even harmful) if you don’t first understand the concepts behind statistical thinking. These concepts can tell you why these skills are helpful in the first place, where they came from, when to use them, and how to avoid their pitfalls. And even if you never learn the technical skills, these concepts can teach you ways to think about the world that will help you in any line of work—or even just living life.
To that end, here are some books I’ve read and would highly recommend that will help you along that path. Think of it as a mini-course in sports analytics:
Moneyball by Michael Lewis — the book that really started it all. An engaging story that will also help you understand some of the origins of why and how advanced statistics came to be so pivotal in sports.
The Undoing Project by Michael Lewis — Lewis writes in the introduction that he thinks of this as his follow-up to Moneyball. It examines the work of Daniel Kahneman and Amos Tversky, showing the underlying psychological reasons why statistical analysis is helpful in decision making. (And if you’re interested in more on this topic, you can read Kahneman himself in Thinking, Fast and Slow.)
The Drunkard’s Walk by Leonard Mlodinow — an entertaining look at randomness, probability, and how to think statistically, with the history of many of these concepts woven in.
How Not To Be Wrong by Jordan Ellenberg — a well-written guide to using mathematical concepts to think about the world.
Superforecasting by Phil Tetlock and Dan Gardner — Tetlock’s research helps us understand how to make the best predictions we can, giving us a blueprint for integrating the numerical and the psychological.
The Signal and the Noise by Nate Silver — Silver writes about how predictions can work, and when they go wrong. His writing provides a very important conceptual underpinning to any predictive model building you’d want to do.
The Only Rule Is It Has to Work by Ben Lindbergh and Sam Miller — bringing us full circle: what happens when two writers, steeped in the knowledge of baseball analytics, take over an independent-league team and actually get to apply what they know? A very fun and enlightening look at the actual application of a lot of the theory you’ve just read.
Bonus
Thinking in Bets by Annie Duke — I have not read this book yet, but I have heard it highly recommended by a few people I trust. You can also check out some of her interviews, for example this podcast episode.
More advanced
Incerto by Nassim Nicholas Taleb — Incerto is Taleb’s series of books, which includes Fooled by Randomness, The Black Swan, The Bed of Procrustes, Antifragile, and Skin in the Game. Think of them as the graduate-level course after the prior reading. They are more in-depth, technically detailed, and challenging, but they opened a window for me into a very different view of the world, one that I have found quite valuable.
Then what?
I’m a huge believer in the importance of domain knowledge to properly perform any statistical analysis (something reinforced by many of these books!). In this case, that means whatever sport you want to analyze, you need to understand it as well as possible.
Beyond that, as I mentioned, it is also important to learn what analytics work has already been done in the sport you’re focusing on.
How to approach learning both of those is beyond the scope of this email. (Perhaps it’s a topic for a future one!) But I wanted to provide this list, because I think it’s a very valuable (and often overlooked) first step.