How the Astros Used Big Data to Win the World Series: “Astroball” (Ben Reiter)

I wrote this article in Japanese and translated it into English using ChatGPT. I also used ChatGPT to create the English article title. I did my best to correct any translation mistakes, but please let me know if you find any errors. By the way, I did not use ChatGPT when writing the Japanese article. The entire article was written from scratch by me, Saikawa Goto.




Three takeaways from this article

  1. The Astros were so weak that even the locals booed the prediction that they would win the championship.
  2. A data-centric approach like “Moneyball” cannot win on its own.
  3. How can the system incorporate “inaccurate” data such as “intuition” and “luck”?

I have no interest in baseball at all, but this book was a fascinating read even for someone like me.

Self-introduction article

Please refer to the self-introduction article above to learn about the person writing this article. Be sure to check out the Kindle book linked below as well.

Published Kindle books (Free on Kindle Unlimited)

“The genius Einstein: An easy-to-understand book about interesting science advances that is not too simple based on his life and discoveries: Theory of Relativity, Cosmology and Quantum Theory”

“Why is “lack of imagination” called “communication skills”?: Japanese-specific”negative” communication”

The quotes in this article were translated with ChatGPT from the Japanese editions of the books. They are not direct quotes from the original foreign-language editions, where such editions exist.

The “Worst Baseball Team in the Last Half Century” Won the Championship With “Big Data” and “Intuition”

The Author Who Predicted “Astros Win”

Let’s start with this story. The author predicted that the Astros would win the World Series in 2017.

America has a very famous sports magazine called “Sports Illustrated”, with a readership of 3 million people, including regular subscribers. The author of this book put George Springer, an Astros outfielder, on the cover of its June 30, 2014 issue under the title “World Series Champion 2017”.

However, this issue caused a huge uproar, including in the Astros’ home city. That is no surprise, because in 2014 the Astros were in a terrible state.

They were the worst baseball team in the last half century.

Despite this, the author thought, “I don’t understand why they have been losing,” and persuaded the editor-in-chief to make this unbelievable decision.

And, just as the author predicted, the Astros became the World Series champions in 2017.

Pitcher Keuchel, Who Won the Cy Young Award

Let me also mention another impressive episode.

It is the story of the pitcher Keuchel. He was already a member of the Astros before the data analysts got involved in rebuilding the team. His pitch speed was in the 140 km/h range, and he didn’t have any particularly impressive pitches, so he was not a memorable player. Keuchel was almost traded, but he remained with the Astros simply because “nobody wanted him.”

However, in 2015, Keuchel won the Cy Young Award, which is given to the best pitcher of the year. How did such a dramatic change happen?

The data analysts who got involved with the Astros were developing a system to “scout good talent”. By inputting various data, they could identify and acquire the talent the team needed. The same system could, of course, also be used for players who were already on the team. Since data is easier to obtain from the team’s own players, more accurate analysis is possible.

As a result of analyzing the team’s player data, the data analysts proposed the “defensive shift”. I’m not knowledgeable about baseball, but this “defensive shift” is now a technique adopted by all baseball teams. Surprisingly, the Astros’ data analysts were the first to propose it.

This is a method in which the fielders’ positions are changed substantially for each batter, based on data such as the direction in which that batter tends to hit the ball. At first, Keuchel opposed this “defensive shift,” but in fact, he was saved by it. He was a pitcher with no fastball but an abnormally high level of control, so thanks to the combination of “control” and the “defensive shift,” Keuchel was able to win the Cy Young Award.

This book tells the story of the struggles of “laypeople” who achieved great feats in Major League Baseball in this way.

This is Not Just About Baseball

First of all, let me make this clear: this book is not just about baseball. Personally, I have no interest in baseball and only know the rules to some extent.

This work tells the story of “big data”, which will greatly change the world in the future. And from this book, we can gain an important understanding that “big data alone is not enough.”

If the Astros had run their team operations using only a computer, they wouldn’t have drafted the high school shortstop from Puerto Rico, signed the 168-centimeter second baseman, or acquired the forty-something free agent player or the mid-thirties pitcher who required a $2 million annual salary via trade.

One very fascinating point of this work is the observation that “big data needs to incorporate human intuition” – analyzing data alone does not reveal the essence. You may feel that this seems obvious, but I feel that there are still many people who think that analyzing big data alone can reveal everything.

Some people may understand that big data alone is not enough, but they may not know what to do about it.

This is a point where this book differs from “Moneyball.”

The movie starring Brad Pitt may be better known, but “Moneyball” is based on the true story of the Athletics, a small major league team that also made a leap forward through data analysis. However, in “Astroball,” it is written as follows:

“Moneyball” mainly portrayed scouts as a resistance force and treated them as foolish and outdated people who stood in the way of progress.

In other words, “Moneyball” takes the stance that “data analysis is amazing, but scouts don’t understand anything.”

“Moneyball” is a story from a time when terms like “big data” were not yet widely known. In such a time, it may have been necessary to set up a clear-cut “scouts are the enemy” structure in order to fight. Even so, I think it is clear that they were biased towards “data supremacy”.

How to Build Intuition Into Their Data

The data analysts who appear in this book stand out not only for their ability to analyze data, but also for their emphasis on “not relying too much on data”. This book explores how to incorporate intuition into data analysis.

The two of them didn’t think they knew better than anyone else how to run a baseball team, and they didn’t want to believe they were smarter than others.

The feeling that we are smart is also our enemy. We are trying to avoid it by all means.

“When someone tells you they know what the future holds, don’t trust them,” says Sig. “The future is far more incomprehensible than we imagine. It is even more incomprehensible than we can imagine. Even if you think you have unraveled the future, wait a moment. You’re probably wrong.”

This book has many sentences like this.

Data is a very objective thing and does not lie. That’s why there is much to learn from its analysis. However, the data analysts in this book understood that they were baseball laypeople. They love baseball but do not have much experience. They knew that just being able to analyze data wouldn’t be enough to succeed.

That’s why they tried to incorporate things like “intuition” and “luck” into their data analysis.

Among the variables, there are some based on scouts’ intuition. Although intuition lacks consistency compared to things like radar gun numbers or slugging percentage, it may be just as valuable or even more so. This is because other teams also know just how to read radar gun numbers at least.

This expression “other teams also know just how to read radar gun numbers at least” is very important. This is because the success of the Astros has accelerated the movement to operate baseball teams through data analysis.

With more machines and equipment, any team can obtain as much “accurate” data as it wants, so accurate data offers no way to stand out. What matters instead is the rule of thumb for how much weight to give to data that can only be obtained “inaccurately,” and how to incorporate that data into the system. That is why the Astros fed the system not only “pitching form” and “swing trajectory,” but also variables such as “player’s personality” and “family history.”

The way this book describes the handling of such “inaccurate” data is really interesting. Although it doesn’t go into detail, reading it brings to mind a scene quite different from the dryness usually associated with “analyzing big data.”
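To make the idea of “weighting” concrete, here is a minimal sketch in Python. It is not the Astros’ actual system, which the book does not describe in detail. The variable names, the 0-to-1 scale, and the weights are all hypothetical, chosen only to show how a measured number and a scout’s gut feeling could sit side by side in one score.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str
    value: float   # hypothetical scale: 0.0 (worst) to 1.0 (best)
    weight: float  # hypothetical rule-of-thumb weight, not from the book


def player_score(variables):
    """Weighted average of all variables, measured and judged alike."""
    total_weight = sum(v.weight for v in variables)
    if total_weight == 0:
        raise ValueError("at least one variable needs a non-zero weight")
    return sum(v.value * v.weight for v in variables) / total_weight


# Two "accurate" measurements and one "inaccurate" judgment (all made up).
prospect = [
    Variable("radar_gun_velocity", 0.5, 2.0),
    Variable("slugging_percentage", 0.5, 2.0),
    Variable("scout_intuition", 1.0, 1.0),
]

score = player_score(prospect)
print(score)
```

In this toy version, the hard part the book points to is hidden in the `weight` column: anyone can read the radar gun, but deciding how much a scout’s impression should count is a judgment call.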

Turning “Team Harmony” Into Data?

Furthermore, they tried to capture not only the individual ability of each player, but also intangible factors such as “team harmony” that are difficult to quantify with data.

Regarding this, the episode about a player named Carlos Beltrán was particularly impressive.

The team had no regrets about paying Beltrán’s $16 million salary to help boost the team in ways that were invisible and often unclassifiable, but his results on the field were not worth the numbers.

In this series that was entangled up to the seventh game, Beltrán had 12 at-bats and only got one hit. However, if he wasn’t there, Astros might not have won the ALCS.

As these quotes indicate, he did not produce results on the field as a baseball player. Nonetheless, the team thought it was worth paying him $16 million a year for his contribution to “team harmony.” They had a sense that “having him around should improve the team’s cohesion,” and they prioritized that “gut feeling.”

However, the book also states the following.

There is currently no way to show that the Astros could not have reached the top without Beltrán, who did not get a single hit in the World Series.

The data analysts were ultimately unable to demonstrate that “team harmony” was in a good state because of Beltrán’s presence. Perhaps the Astros would have reached the top even without him. In most situations, it is impossible to choose two or more options at the same time and compare the outcomes. In the end, someone has to make the decision.

There are surely many challenges around the world in how to make use of “big data.” As one perspective, the idea presented in this book of “incorporating intuition into big data” is very intriguing.

The Reform that was Possible Thanks to Being the Worst Team

Now, the data analysts featured in this book are described as “laypeople” in baseball. So you might wonder why the team accepted the opinions of such laypeople.

One factor was that the Astros were simply too terrible. As mentioned earlier, they were the “worst baseball team in the last half century.” That is exactly why, I would say, they were able to take on such a desperate challenge, as if to say, “If we don’t rely even on laypeople’s ideas, we can’t go on.”

In other words, a baseball team too hopeless to fix itself and data analysts who liked baseball but had no experience in it teamed up to create a miraculous environment. Because significant results came in a relatively short period of time, they succeeded in changing the mindset of the entire team, as follows.

Perhaps the most important thing is that the team has a mindset of actively accepting new ways to predict and change the future.

Because the data analysts were free to work to the full extent of their abilities, they were able to implement measures that would not normally be realized, such as “incorporating intuition into big data”.

The positive results of the Astros show that success is not a matter of “human vs machine,” but rather a matter of “human + machine.”

This can be seen as a big social experiment for us who have no choice but to face “big data.”

On the other hand, there is some unfortunate news regarding the Astros. They were accused of using electronic devices to steal signs from the opposing team’s catcher and relay them to their batters in 2017, the year they won the World Series. The translator of this book wrote in the afterword as follows:

If you read this book after knowing about the cheating, it’s hard to deny that there will be doubts about the credibility of some of the contents.

It certainly feels that way, but the people depicted as the main characters in this book do not seem to have been involved. It also does not seem fair to write off the Astros’ great achievements entirely because of this incident.


When we hear the term “big data,” it may seem dry and uninteresting, and “data analysis” can give us a cold impression. However, this book asks “how to incorporate humans into data,” a big question we need to consider for the future, well beyond the world of baseball. I feel that the outcome of their challenge could greatly change not only baseball but also other areas, and that it could make society more interesting.

