Data-Driven Descriptions: Using Machine Learning to Profile 2020 NBA Draft Prospects

Research and Post By Sameer Sapre

If you have ever read, listened to, or watched analysis of an NBA draft, you have probably heard strange-sounding phrases like “3-and-D wing”, “rim-protector”, “pure scorer”, “raw athlete”, and “playmaker” used to describe a player. What do they mean? These labels are all meant to do one thing: “profile” a player. A profile is a quick composite description of who a player is, expressing their playing style, strengths, weaknesses, role on a team, and so on. From a fan’s perspective, these descriptions give us an idea of a player’s projected role on an NBA team without having to go back and watch hours of the player’s games. They can also help us identify players that fit a need on our team. For example, if you’re a Cavs or Blazers fan, a solid perimeter defender may be what you’re looking for. If you’re a Sixers or Thunder fan, you might be interested in knowing the sharpshooters that could help your team.

As a fan of both college basketball and the NBA Draft, I find these profiles intriguing, and I’d love to take a crack at developing my own. The only problem is that I haven’t watched nearly enough college hoops, nor have I been paying enough attention to this year’s prospects. Maybe I should leave the analysis to the actual analysts on TV, but part of me still thinks I can come up with useful player profiles using data. In this post, I’ll attempt to profile NBA draft prospects (specifically, guards) using an unsupervised machine learning technique called hierarchical clustering and the season statistics of NCAA prospects dating back to 2011.

 

What is Clustering?

I am using hierarchical clustering because the cluster a player is assigned to can reveal that player’s defining characteristics. Without getting too technical, hierarchical clustering is an unsupervised machine learning technique that divides the observations in a dataset into clusters, or groups, based on statistical similarity. By grouping similar players together and evaluating the groups, we can generalize the qualities of the players in each group. For example, one group may consist of players with a high 3-point percentage and low assist percentage, revealing that they were generally effective as off-ball shooters for their college team. Of course, not every player has a skill set that can be identified with a given set of statistics, or with any available statistics for that matter; that is one of the challenges of generalizing player profiles. In this analysis, I do my best to mitigate these issues, but there are still a few players whose resulting profiles don’t make a ton of sense.
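To make the idea concrete, here is a minimal Python sketch of agglomerative (bottom-up) hierarchical clustering with average linkage. It is hand-rolled for illustration only: the post does not say which linkage rule or library the actual analysis used, and the player names and standardized stat values below are hypothetical.

```python
from itertools import combinations
from math import dist

# Hypothetical standardized (AST%, 3PAr) values, for illustration only.
players = {
    "Shooter A": (-1.0, 1.2),
    "Shooter B": (-0.8, 1.0),
    "Creator A": (1.1, -0.9),
    "Creator B": (0.9, -1.1),
    "Combo A": (0.1, 0.0),
}

def average_linkage(c1, c2):
    """Mean pairwise Euclidean distance between two clusters of player names."""
    total = sum(dist(players[a], players[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def agglomerate(names, n_clusters):
    """Start with one cluster per player; merge the closest pair until n_clusters remain."""
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

groups = agglomerate(list(players), n_clusters=3)
for group in groups:
    print(sorted(group))
```

With three clusters requested, the two off-ball shooters and the two creators pair up and the in-between player is left alone, which is the kind of grouping the profiles in this post rely on.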

Unlike with supervised models, judging whether the results of a cluster analysis are “good” doesn’t come down to prediction accuracy or error, but rather to how similar observations are to others in their cluster and how different they are from those outside it. There are ways to validate the resulting clusters, such as silhouette scores or the Dunn index, that judge whether the groups actually contain mathematically similar observations. If this is getting too technical, don’t worry: the bottom line is that players who are grouped together should generally share more in common than players who are not.
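For the curious, a silhouette score can be sketched in a few lines of Python. For each observation it compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), giving (b - a) / max(a, b), and then averages over all observations. This is a generic illustration with made-up points, not the validation actually run for this post, and it assumes at least two clusters.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette coefficient. Requires at least two distinct labels."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Distances to the other members of this point's own cluster.
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:  # singleton cluster: score is 0 by convention
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        # Mean distance to the nearest other cluster.
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical 2-D observations: two tight, well-separated pairs.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
print(silhouette(points, [0, 0, 1, 1]))  # sensible grouping
print(silhouette(points, [0, 1, 0, 1]))  # scrambled grouping
```

Scores near 1 mean observations sit much closer to their own cluster than to any other; scores near or below 0 suggest the grouping is doing little.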

However, for this post, mathematically “good” results will not be prioritized over interpretability. In fact, my criteria for success are not technical at all. For this analysis to be a success, the resulting clusters must be interpretable and must not be indicative of NBA success. That means that each group of players should have defining characteristics that can be described using basketball terminology, and that each group should include players with varying levels of NBA success. Again, the goal of this analysis is to build profiles that describe a player, not to predict his chances of success. By the way, if you haven’t already noticed, I use “cluster” and “group” interchangeably; sorry for any confusion, but they mean the same thing.

Approach and Data

Rather than using only the college statistics of the 2020 class, I am including the college stats of current NBA players to make the resulting groups more interpretable. Of course, I will be using the average statistics of each group to get a better understanding of the group’s defining traits. However, by including current NBA players in the analysis, each group’s characteristics become more recognizable. For example, any group headlined by Buddy Hield or Joe Harris could probably be identified as a group consisting of primarily sharpshooters while a group including Matisse Thybulle and Marcus Smart could be thought of as a group of good on-ball defenders. In addition, the model does not consider the year that each player was drafted meaning that Anthony Edwards, Tre Jones, and Cole Anthony will be grouped together with similar players from previous draft classes. As a result, we’ll also get an idea of each 2020 draft class member’s NBA comparisons. Of course, there are always going to be players whose college profile is different from their NBA profile, but that doesn’t seem to affect the results too much.

If you want to check out the technical details and data selection, they will be available on GitHub. Long story short, I used the statistics from the final college year of almost every guard that has played in the NBA since 2011, as well as the guards included in NBADraftNet.com’s 2020 rankings. Unfortunately, this analysis does not include players that did not play in the NCAA. That means no LaMelo Ball, Killian Hayes, Théo Maledon, or RJ Hampton, nor does the analysis include any current players that played overseas instead of in college. That means players like Bogdan Bogdanovic, Dennis Schroder, and Emmanuel Mudiay are left out as well.

Next, as uninspiring as it sounds, I decided to hand-pick the statistics that I felt were most important for telling apart the various roles and playing styles of guards. The final set of statistics I used was assist percentage (AST%), usage percentage (USG%), three-point attempt rate (3PAr), effective field goal percentage (EFG%), and defensive box plus-minus (DBPM).
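For readers less familiar with the shooting stats in that list, here is a short Python sketch of two of them using their standard box-score definitions: EFG% credits a made three as 1.5 field goals, and 3PAr is the share of field goal attempts taken from three. The season totals are made up for illustration and do not belong to any real player.

```python
def effective_fg_pct(fg_made, threes_made, fg_attempted):
    """EFG% = (FG + 0.5 * 3P) / FGA."""
    return (fg_made + 0.5 * threes_made) / fg_attempted

def three_point_attempt_rate(threes_attempted, fg_attempted):
    """3PAr = 3PA / FGA."""
    return threes_attempted / fg_attempted

# Hypothetical season totals
season = {"fg": 200, "fg3": 80, "fga": 450, "fg3a": 210}
efg = effective_fg_pct(season["fg"], season["fg3"], season["fga"])
par = three_point_attempt_rate(season["fg3a"], season["fga"])
print(f"EFG%: {efg:.1%}, 3PAr: {par:.1%}")
```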

 

Results

The final model produced 6 clusters of players that generally make sense and can serve as statistical profiles. It was by no means a complete success as there were some players in questionable/interesting groups, but, as a whole, it wasn’t too difficult to come up with descriptions for each cluster using group averages and the players within them. Here is an overview of each group’s statistical profile.
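Describing a cluster by its group averages amounts to taking the per-cluster mean of each statistic. A minimal Python sketch of that step follows; the cluster labels and stat lines are hypothetical placeholders, not values from the actual dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (cluster label, AST%, USG%, 3PAr, EFG%, DBPM)
rows = [
    (1, 12.0, 22.0, 55.0, 50.0, 2.5),
    (1, 14.0, 24.0, 60.0, 49.0, 3.0),
    (2, 30.0, 20.0, 30.0, 56.0, 1.0),
    (2, 34.0, 19.0, 28.0, 57.0, 1.4),
]
stat_names = ["AST%", "USG%", "3PAr", "EFG%", "DBPM"]

# Collect each player's stat line under their cluster label.
by_cluster = defaultdict(list)
for label, *stats in rows:
    by_cluster[label].append(stats)

# Average each stat column within each cluster.
profiles = {
    label: dict(zip(stat_names, (mean(col) for col in zip(*members))))
    for label, members in by_cluster.items()
}

for label, profile in sorted(profiles.items()):
    print(label, profile)
```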

 

Group 1 – Low-Efficiency 3-and-D

In our first group, players tended to be solid defenders, but weren’t very efficient scorers, nor were they the primary creators for their team. The group includes notable NBA starters like Donovan Mitchell, Gary Harris, and Kentavious Caldwell-Pope, but also contained players that didn’t make much of an impact at the next level like Rawle Alkins, Aaron Harrison, and Malachi Richardson. The best-case scenarios for these guys don’t look too bad. Harris and KCP were both starters for the two Western Conference Finals teams with KCP going on to win the finals with LA as a key contributor on both ends. He was also part of one of the best defensive units in the league. It’s also surprising that Donovan Mitchell was included in this group. He has now become the offensive focal point of the Utah Jazz and an All-Star in the process of shedding this label. It’s also worth noting that all of the top 3 players are not known to be particularly lethal shooters, but tend to be streaky shooters capable of going on hot/cold stretches at a moment’s notice. Streaky shooters are notorious for their willingness to shoot despite their recent struggles. Therefore, the clustering did a good job of grouping players that will continue to exhibit a high three-point attempt rate regardless of the percentage they are shooting.

The 2020 prospect that fell into this group was Isaiah Joe, a 6’5” guard from Arkansas whose efficiency (49.7 EFG%) isn’t great, but who did shoot a lot of threes (76.4% 3PAr).

 

Statistical profile of players in Cluster/Group 1

Group 2 – Old-School Floor Generals

For group 2, we find offensively efficient primary ballhandlers/creators, given the group’s relatively high effective field goal percentage, low usage percentage, and high assist percentage. Players in this group include Denzel Valentine, Derrick White, and Reggie Jackson as well as Scott Machado and Ray McCallum. These players really made sense when looking at the group averages. They seemed to make good decisions with the basketball, assisting on a large share of teammate field goals while using up a relatively small share of possessions (turnovers are also included in usage percentage). In addition, despite their lack of three-point shooting, they still shot the ball very efficiently, implying that they took smart shots and often found higher percentage looks. While no one assigned to this group is a star, there are still solid role players and starters in the NBA that carried this label in college.

The only 2020 prospect assigned to this group was Oregon’s Payton Pritchard, who shot threes at a decently high rate (45.9% as a senior) along with solid efficiency numbers.

Statistical profile of players in Cluster/Group 2

 

 

Group 3 – High Volume Scorers

In group 3, we primarily found what some might call volume scorers. These players had a high usage rate and a low assist percentage, suggesting that they used a large portion of their team’s possessions to shoot or turn it over. They also carried okay scoring efficiency and subpar defense. Notable NBA players include Buddy Hield, Damian Lillard, and Jamal Murray, while fringe players include Xavier Munford, Rashad Vaughn, and Gian Clavell. It’s important to note that while the statistical profiles of players in this group don’t seem great, some of them have still gone on to become solid NBA contributors. Damian Lillard has become a superstar and can get quality shots from almost anywhere on the court. Hield, despite his high volume at Oklahoma, was still a very efficient shooter (62.3 EFG%) and was a key contributor for Sacramento before issues with the coaching staff. Finally, Jamal Murray exploded onto the scene in this year’s playoffs, helping Denver to the Western Conference Finals with a ridiculous 62.6 True Shooting percentage and a stretch of 3 games in which he scored a total of 142 points.

The 2020 prospects assigned to this group were Markus Howard (Marquette) who has a high usage rate (39.3), low DBPM (0.6) and decent efficiency (53 EFG%) and Anthony Edwards (Georgia: we’ll get to him later).

Statistical profile of players in Cluster/Group 3

 

 

Group 4 – Focal Points (… No pun intended)

Group 4 players clearly looked like high offensive load bearers, as they had high usage and assist percentages. That combination signifies that much of the offense ran through them as they worked as the primary facilitators and shot at high volumes. These guards also didn’t take many threes, weren’t super efficient, and didn’t have great defensive numbers. Notable NBA players include Trae Young, D’Angelo Russell, Ja Morant, Klay Thompson, and Dejounte Murray, and fringe players include Walt Lemon, Milton Doyle, and Mike James. Now you may be questioning Trae Young’s and Klay’s inclusion in this group, but both carried high offensive loads and weren’t that efficient; the only difference is that their three-point attempt rates were very high.

Nevertheless, what you can take away from this group is that its best players have no issue handling the scoring and creation responsibilities at the next level. Trae Young and D’Angelo Russell are already All-Stars with high usage and assist rates, while Ja Morant seems to be following a similar trajectory in Memphis. Maybe if a player like this is given the reins on a team in need of someone to shoulder that load, they can thrive.

2020 prospects include: Cassius Winston (Michigan State), Saben Lee (Vanderbilt), Jamil Wilson (Marquette), Cole Anthony (UNC), and Grant Riller (Charleston).

Statistical profile of players in Cluster/Group 4

Group 5 – Lockdown Combo Guards

Group 5 consists primarily of defensive specialists who were also the primary offensive facilitators on their team. Players in this group generally have high DBPM and high AST% while not being the most efficient shooters. Notable NBA players include Marcus Smart, De’Aaron Fox, Shai Gilgeous-Alexander, Delon Wright, Matisse Thybulle, and Malcolm Brogdon, while fringe players include Tyrone Wallace, Trevon Duval, and Troy Caupain. There seem to be players like this available all over the draft, from the top of the lottery (Fox at No. 5, Smart at No. 6) to pick No. 36 (Brogdon). I am a big fan of this group because it contains many solid, underrated players. Shai Gilge….. is a fun player to watch and might become the cornerstone of the Thunder franchise. Marcus Smart is such a good defender that there was an argument that he should’ve been the Defensive Player of the Year. Finally, 2020 prospects to watch are Devon Dotson and Malachi Flynn, who seemed to fit this statistical description pretty well.

There were quite a few 2020 prospects assigned to this group including Ashton Hagans (Kentucky), Devon Dotson (Kansas: 4.8 DBPM), Malachi Flynn (4.1 DBPM, San Diego State), Josh Green (Arizona), Tre Jones (Duke), and Tyrese Maxey (Kentucky: 1.45 AST:TO Ratio).

Statistical profile of players in Cluster/Group 5

Group 6 – Efficient 3-and-D Wings

Finally, players of Group 6 look like they can be true 3-and-D wings. These players had very high EFG% to go with a high 3PAr. They also carried a decent DBPM along with low usage and assist rates. Notable NBA players include Bradley Beal, Devin Booker, Joe Harris, Tyler Herro, and Terrence Ross, as well as Lonzo Ball and Victor Oladipo (64.8% EFG, 6.2 DBPM). Ball’s inclusion may seem strange, but he was very efficient as a shooter (66.8% EFG), shot lots of 3s (56.6% 3PAr), and was a great defender (3.9 DBPM); he just happened to also have a very high assist percentage (31.4%). Overall, these also seem to be the most “NBA ready” players in the draft. Most of the top players of this group were starters right away. Most recently, and perhaps most notably, rookie Tyler Herro played a major role for the Eastern Conference champion Miami Heat. This early success might be due to the shift of the game as a whole. As teams have started embracing the three-point shot as a necessity rather than an option, players who are good 3-point shooters have naturally become more valuable in today’s game.

2020 prospects include Tyrese Haliburton (Iowa State), Tyrell Terry (Stanford: 45.6% 3PAr, 20 AST%, 53.5% EFG), Immanuel Quickley (Kentucky), Desmond Bane (TCU), and Cassius Stanley (Duke).

Statistical profile of players in Cluster/Group 6

Conclusion

There are a few points to mention with these results. First, it looks like there are plenty of solid defenders to be found in this upcoming draft (both primary ballhandlers and shooters), judging by the large representation of 2020 prospects in clusters 5 and 6. Second, some notable players I want to analyze further include Anthony Edwards and Tyrese Haliburton.

Edwards was placed in a group occupied primarily by volume scorers. He will be a top-3 pick, but will teams consider his lack of defensive impact (0.7 DBPM) and low efficiency (47.3% EFG)? Of course, players can improve, and maybe the top of the draft is a perfect spot for high-usage, low-efficiency “projects”. Teams at the top of the draft may be more willing to give prospects a longer leash, and being on a bad team might give Edwards opportunities, in terms of volume, that could help him develop. Just look at the fellow top-10 picks in his group: Damian Lillard, Buddy Hield, Jamal Murray, Brandon Knight, and Austin Rivers. While Edwards’ future team hopes he doesn’t turn out like the latter two (by the way, Knight had a promising start to his career before injuries got in the way), the first three may be indicators of how his team should handle his development. For this reason, perhaps Minnesota, a team hoping to make a push for the playoffs and maximize the opportunity they have with KAT and D’Angelo Russell, should opt for someone who will not command a high volume. Honestly, the same can be said for Golden State: Edwards would likely have to cede volume to Klay, Steph, Draymond, and Andrew Wiggins and would also be expected to contribute immediately to a deep playoff push next year. I could see Charlotte, despite solid guard play from Devonte Graham and Terry Rozier this season, being a good fit for Edwards. They don’t seem close to competing for anything just yet and could be the perfect landing spot for Edwards to get the opportunities he needs to develop.

In addition, the inclusion of Tyrese Haliburton as a 3-and-D wing is interesting. He has a high assist percentage (35%) and effective field goal percentage (61.1%), carried a relatively low usage rate for a point guard (20.1%), and took about half of his shots (50.8%) from downtown. This might mean that he has a versatile skill set and could serve a team in multiple ways in the NBA. For teams that already have young point guards, like Chicago or New York, Haliburton might still be a good fit operating in some sort of hybrid role. Even teams with a perceived need for a point guard, like Detroit or Phoenix, could use him as their primary ballhandler.

In conclusion, the groups produced by the hierarchical clustering model met the goals originally defined for them: they were interpretable, and each cluster contained players with different levels of NBA success. Of course, the results weren’t perfect as a few players seemed to be placed in groups unintuitively, but not all basketball players can be easily profiled with a small set of statistics. Nevertheless, I hope that this post provided a new perspective on player profiling using a more statistical approach.

All data used for this project was obtained from Basketball-Reference.com.

Perfection Ranked: Greatest Perfect Games #1-5

by Mallet James and Kyle Kroboth

In the final installment of our 4-part series, where we attempt to rank all 21 perfect games in the modern era using a numbers-only approach, we take a look at the most unlikely of all perfect games. In this section, we explore Don Larsen and the 1956 World Series, recent names like Dallas Braden and Mark Buehrle, and reveal the perfect game that should have never happened. We introduced our ranking method using Bradley-Terry in our introductory post.

 

5. Dallas Braden vs. Tampa Bay Rays B-T Probability: 3.2 in 100,000

Dallas Braden wasn’t a household name when he threw his perfect game for the Oakland Athletics on Mother’s Day 2010 versus the Tampa Bay Rays, but unlike fellow perfect game thrower Philip Humber, he was a solid middle-of-the-rotation pitcher. He wasn’t overpowering; he ended the 2010 season with a 5.3 K/9, which he outperformed in his perfect game with six strikeouts, still tied for third lowest of any perfect game. But he compensated for his inability to miss bats with pinpoint control and an ability to induce weak contact: he finished 6th in MLB in BB/9 and 16th in MLB in WHIP. Braden wasn’t elite, but he wasn’t quite the stiff that he is often portrayed as.

What made Braden’s perfecto especially unlikely was the composition of the opposing Rays lineup that day. Headlining the Rays’ starting nine were two 2010 All-Stars: prime Evan Longoria and Carl Crawford. People forget just how good those guys were in the late 2000s; Longoria finished 6th in the MVP race and was 11th in the American League in on-base percentage. Crawford landed one spot behind Longoria in the MVP race and even won an outfield Silver Slugger award in his final season in Tampa Bay before the Boston Red Sox signed him to a disastrous 142-million dollar free agent deal in the offseason.

The Rays had depth beyond their headliners as well. Guys like Ben Zobrist, Carlos Pena, and Jason Bartlett each added 2+ wins and got on base at an above average clip. The Rays ended the season with 96 wins and in first place in the AL East, their only division title since their 2008 World Series run. The A’s ended the season at .500.

4. David Cone vs. Montreal Expos B-T Probability: 2.2 in 100,000

David Cone threw the Yankees’ third perfect game on July 18th, 1999 at Yankee Stadium in front of almost 42,000 fans. In 1999 Cone was nearing the end of a 17-year MLB career in which he built a consistently strong resume that featured 5 World Series wins, 4 of which came on the back end of his career with the Yankees. He is a Cy Young Award winner and 5-time All-Star, and although it was late in his career, he was pitching some of his best baseball at the time, so the unlikelihood of his perfect game might seem to be mostly due to the lineup he faced. Cone, though, was not all that proficient at keeping runners off base in 1999. His POBP of .320 ranks third highest of any pitcher to have thrown a perfect game, and the Expos lineup really wasn’t as bad as you might think.

The Expos team that Cone faced was 33-55 at the time and ended the ’99 season with a 68-94 record, finishing 4th in the NL East. It was a poor record, but mostly due to an inexperienced pitching staff that featured what would end up being a lot of no-name guys with an average age of 24. The batting lineup was actually quite strong top to bottom at the time and was headlined by a young Vladimir Guerrero.

Vlad was just embarking on a Hall of Fame career in a ’99 season where he posted a .978 OPS, won his first Silver Slugger award, and made his first All-Star Game appearance. Apart from Vlad, the Expos’ names don’t jump off the page, but there were some serious hitters in the lineup, including Rondell White, who put together three straight years with an offensive WAR above 2.5 from 1997-1999. Jose Vidro was in the lineup as well, a solid late-’90s, early-2000s infielder who was an All-Star 3 out of 4 years from 2000-2003; in 1999 he posted an OBP of .346.

The Expos’ average OBP against Cone was .332, the 4th highest average lineup OBP of all lineups that have been defeated by a perfect game. A .332 OBP would be above average by today’s standards; it was slightly below the .345 MLB average in 1999, but the Expos lineup, sneaky as it might have been, was a formidable opponent for Cone. He navigated his way through the test that they posed and earned his spot quite high up on our list of most unlikely perfect games.

3. Don Larsen vs. Brooklyn Dodgers B-T Probability: 1.8 in 100,000

Larsen threw the most famous perfect game of all time against the Brooklyn Dodgers on October 8th, 1956 in Game 5 of the 1956 World Series.

I know what you’re thinking: how is a perfect game thrown in the World Series not the most unlikely perfect game of all time? Larsen’s perfect game is almost unanimously ranked as the greatest perfect game of all time (I think it is too), but as you know by now, we are taking a different approach. Certainly we could add some weight to the calculation to account for postseason games, World Series games even, and maybe we should have done just that, but when accounting for batter and pitcher performance there are two perfect games that rank slightly ahead in unlikelihood. This takes absolutely nothing away from the feat. It may be one of the few untouchable games of all time, but hopefully our take will offer some perspective and make you think a little differently about the best perfect games thrown in the modern era.

World Series or not, Larsen faced a stellar Dodgers lineup. It was headlined by Duke Snider, Jackie Robinson, Gil Hodges, and Roy Campanella, to name a few. The Dodgers were 93-61 on the year and outscored their opponents by 119 runs. When you just consider the names involved, it may be the best core of a lineup ever to lose by way of a perfect game. Larsen needed just 97 pitches to get through the Brooklyn lineup three times. He struck out 7 batters, no Dodger struck out more than once, and Jackie Robinson was one of the few Brooklyn bats not to strike out.

Larsen was a career 81-91 pitcher who pitched for 8 different teams over his 14-year MLB career. He was not the ace that you might think of when you consider a World Series perfect game pitcher, splitting his time between the back end of starting rotations and the bullpen. 1956 ended up being the best year of his career as a starting pitcher: he went 11-5 that year after posting a 3-21 record with the Baltimore Orioles just two years prior. Many of the usual indicators of success are pretty average marks for Larsen: his career ERA+ was 99, right around average, his FIP was 3.94, and he had a K/9 of 4.9. The most notable statistical find when it comes to Larsen’s career is how good a hitter he was. He had a .242 average with 14 home runs and was used quite often as a pinch hitter.

For what was a fairly pedestrian career as a whole, Larsen’s sparkling perfect game will have his name enshrined in postseason baseball history forever.

2. Mark Buehrle vs. Tampa Bay Rays B-T Probability: 1.7 in 100,000

Mark Buehrle threw his perfect game versus the Tampa Bay Rays on July 23rd, 2009, less than one year before Dallas Braden threw his versus seven of the same starting nine. How this Rays team is the only one to be perfect-gamed twice in less than a year is a mystery: the 2009 Rays had an even higher team OBP than the division-winning 2010 squad. Only two teams on the losing end of a perfect game had a higher team OBP than the Rays: the 1922 Tigers versus Charlie Robertson and the 2004 Braves versus Randy Johnson.

Evan Longoria and Carl Crawford still had nice seasons in 2009, both finishing in the top 30 of AL OBP. But surprisingly, they were not the guys that got on base most. That award goes to Ben Zobrist, who finished fourth in the American League with an elite OBP of .405. Jason Bartlett was not far behind at .389, good for 12th in the AL. A team full of on-base machines shouldn’t have perfect games thrown against it, making Buehrle’s accomplishment one of the most unlikely in MLB history.

You might expect the number two pitcher on this list to be ineffective or inexperienced, but that was not the case for Buehrle. In fact, Buehrle was in the middle of his tenth out of sixteen seasons and only nine days removed from pitching the third inning of the MLB All-Star Game. He finished the season with an ERA of 3.84: not an elite number, but good enough to be a solid number two or three in a rotation. What really drives Buehrle down the list is his .311 POBP. That’s not a bad number by any stretch, but it ranks 16th out of the 21 pitchers to throw a perfect game, a testament to how good most perfect game pitchers are.

1. Charlie Robertson vs. Detroit Tigers B-T Probability: 0.9 in 100,000

In the most unlikely perfect game of all time, Charlie Robertson of the Chicago White Sox shut out the Detroit Tigers on April 30th, 1922 at Navin Field in Detroit.

We’ve talked about POBP and OBP quite a lot throughout the perfect games ranked series and how important it is to consider both sides of the story when deciding what truly is an unlikely perfect game. There have been games ranked relatively high on this list due to poor pitching resumes, and there have been games ranked just as high due to the strong lineup that a pitcher had to navigate to get all 27 outs. Charlie Robertson’s perfect game is the perfect storm of sorts when it comes to these two elements: a brutally tough lineup matched up against a very weak opposing pitcher. Robertson’s POBP was .332, the 2nd worst among all perfect game pitchers behind only Philip Humber. He faced a lineup with an OBP of .358, by far the best of any lineup to see a perfect game. That lineup beat out the NL champion Brooklyn Dodgers that Don Larsen faced in the World Series by quite a large margin. It is a surprise that Charlie Robertson got one of his 49 MLB career wins against this Tigers team, let alone found a way to throw a perfect game against them. I imagine if bets were being made on this game in 1922, there would have been quite a few folks interested in the -1.5 line in favor of the Tigers. There is almost no way this could have happened.
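The ranking method itself is laid out in the series’ introductory post rather than here, so the following Python sketch is only one plausible reading of it, not a confirmed reproduction. It assumes a Bradley-Terry matchup in which the pitcher’s strength is 1 - POBP and the lineup’s strength is its average OBP, and it treats all 27 plate appearances as independent with the same out probability. Under those assumptions, the inputs quoted in this post for Robertson (.332 POBP, .358 OBP) and Cone (.320 POBP, .332 OBP) land close to the listed 0.9 and 2.2 in 100,000. The function names are hypothetical.

```python
def out_probability(pobp, lineup_obp):
    """Bradley-Terry win probability for the pitcher in one plate appearance.

    Assumed strengths (not confirmed by the post): pitcher = 1 - POBP,
    lineup = average OBP.
    """
    pitcher_strength = 1.0 - pobp
    return pitcher_strength / (pitcher_strength + lineup_obp)

def perfect_game_probability(pobp, lineup_obp, outs=27):
    """Probability of `outs` consecutive outs, assuming independent plate appearances."""
    return out_probability(pobp, lineup_obp) ** outs

# (pitcher POBP, lineup average OBP) as quoted in the post
matchups = {
    "Robertson vs. Tigers": (0.332, 0.358),
    "Cone vs. Expos": (0.320, 0.332),
}
for name, (pobp, obp) in matchups.items():
    p = perfect_game_probability(pobp, obp)
    print(f"{name}: {p * 100_000:.1f} in 100,000")
```

The exponent is what makes these numbers so small: even a pitcher who retires roughly two of every three hitters has to do it 27 times in a row.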

Robertson was an 8-year major leaguer who started upwards of 20 games in only 5 of those seasons. 1922 was his first full year in the big leagues, and his perfect game came in just his 5th ever game started. The game and his pitching performance truly came out of nowhere. He shut down a Tigers lineup that featured some of the game’s all-time greats, including Triple Crown winner and Hall of Famer Ty Cobb. Ty Cobb might be the guy you would choose out of all major league hitters, all time, to pinch hit for your team with two outs left in the 9th to break up a perfect game. Somehow Charlie Robertson found a way to keep him off base 3 times that day. The other Hall of Famer in the Tigers’ lineup was Harry Heilmann, who slashed a stunning .356/.432/.598 in 1922.

Ty Cobb is famously known to have complained to the umpires about Robertson possibly doctoring the ball but they investigated his uniform, checked multiple balls, and never ended up finding anything substantial.

Beyond the perfect game, Charlie Robertson achieved basically nothing of note for the remainder of his career. His perfect game is certainly the most mind-boggling of all time. He holds the title of most unlikely perfect game, a perfecto that was almost twice as unlikely as Mark Buehrle’s runner-up on this list. Robertson’s perfect game speaks to how random these events are and how anything can happen in major league baseball on any given day, which is what makes the game so great.


References

Baseball Reference. Retrieved from https://www.baseball-reference.com/

Baseball Almanac. Retrieved from https://www.baseball-almanac.com/pitching/piperf.shtml

Fangraphs. Retrieved from https://www.fangraphs.com/

Image Citation

Dallas Braden. Retrieved from https://www.sbnation.com/2010/5/9/1464953/dallas-braden-no-hitter-perfect-game-athletics-vs-rays-2010-score-recap

David Cone. Retrieved from https://www.nydailynews.com/sports/baseball/yankees/yankees-david-cone-throws-perfect-game-yogi-day-1999-article-1.2294698

Don Larsen. Retrieved from https://www.sandiegouniontribune.com/sports/mlb/story/2020-01-01/don-larsen-perfect-game-yankees-1956-world-series-point-loma-high-dies-at-90

Mark Buehrle. Retrieved from https://www.daily-chronicle.com/2019/07/22/same-old-buehrle-former-white-sox-lefty-downplays-perfect-game-ahead-of-anniversary/axjbhi1/

Charlie Robertson. Retrieved from https://theledgesports.com/2015/06/10/ranking-10-best-no-hitters-all-time/mlb-perfect-games-charlie-robertson/

Perfection Ranked: Greatest Perfect Games #6-10

by Mallet James and Kyle Kroboth

In the penultimate installment of our 4-part series, where we attempt to rank all 21 perfect games in the modern era using a numbers-only approach, we take a closer look at some more of the middle-tier likelihood perfect games. In this section, we explore both ends of the perfect spectrum, from an ace like Randy Johnson to relative unknown Philip Humber. We introduced our ranking method using Bradley-Terry in our introductory post.

 

10. Catfish Hunter vs. Minnesota Twins B-T Probability: 6.3 in 100,000

Jim “Catfish” Hunter threw his perfect game for the Oakland Athletics on May 8th, 1968 versus a league average Minnesota Twins team that fielded a formidable 2-3-4 all-star punch of Rod Carew, Harmon Killebrew, and Tony Oliva. Hunter had no problem with Killebrew, who struck out in all three of his at bats amidst an embarrassing season that saw him hit .210 and miss six weeks after tearing his hamstring in the All-Star Game. Carew, the reigning Rookie of the Year, made the All-Star Game as well but fell into a bit of a sophomore slump; his .273/.312/.347 slash line was at or near his career low in each category. Hunter’s perfect game would have been far less likely the following season, in which Carew hit .332 and won the batting title. In 1968, Oliva was putting together another elite year in his underrated career, hitting .289/.357/.477.

Hunter’s perfect game took place in the middle of the “Year of the Pitcher,” the 1968 season that saw pitchers dominate hitters in most statistical categories. Carl Yastrzemski’s .301 batting average remains the lowest ever for a batting champion, and Oliva’s .289 mark finished third in the league behind him. Bob Gibson had what is regarded as one of the most dominant seasons of all time, with a 1.12 ERA and 268 strikeouts. The season led to multiple rule changes going into the 1969 season, including shrinking the strike zone back to 1950s levels and lowering the pitcher’s mound from 15 inches to 10. It is worth noting that Hunter’s perfect game was thrown under the rules and specifications most historically favorable to pitchers.

There were several notable details and firsts in Hunter’s perfect game. One of the coolest bits wasn’t on the mound at all, but rather at the plate: Hunter went 3 for 4 with a double and 3 RBIs, making him the pitcher with the best batting line during a perfect game. The game was played in Oakland during the Athletics’ first season there, but the 13-12 A’s weren’t very popular yet: there were only around 6,300 fans in attendance, the smallest crowd at any perfect game ever. Hunter, at age 22, is also the youngest to ever pitch a perfect game.

9. Kenny Rogers vs. California Angels B-T Probability: 6.2 in 100,000

Perfect Game Rogers

On Thursday, July 28th, 1994, Kenny Rogers threw his perfect game at The Ballpark in Arlington, returning the favor to the Angels, who had perfected the Rangers 10 years earlier. To date, it is the only instance of two major league teams throwing perfect games against each other.

Kenny Rogers had a career that was unique in a lot of ways. Instead of peaking just once, like many pitchers on this list and most professional athletes, he found great success twice: early in his career and again in his 40s. Rogers was a steady major league pitcher throughout his 20 years in the bigs, but his career was bimodal in the sense that he was at his best during the normal peak years of 28-30, when he threw the perfect game, and then found perhaps his greatest form, and the one he may be best known for, in his early 40s with the Detroit Tigers. Earlier in his career his “stuff” might have been a little better, as he posted higher strikeout rates, but throughout his career he never relied too heavily on the strikeout and got the job done with weak contact rather than by overpowering batters.

During the perfect game Rogers struck out 8 batters. The lineup he faced included some notable hitters such as Bo Jackson, Jim Edmonds, and Chili Davis, all of whom hit the ball well that year and got on base at a high clip, especially Davis, who owned an OBP of .410 that year, the second-best mark of his career.

Chili Davis only saw 5 pitches from Rogers, and his failure to work the count may have really aided Rogers in the feat, as Davis was a very dangerous hitter at the time. Whether by luck or excellent planning, Rogers’ ability to get through Davis quickly was certainly a key to his success on the day. As Tom Tango recently noted by way of Bill James, only 9 players have logged 1000 or more games as a DH in MLB history. Chili Davis is 7th on that list, and 5 of the 9 are hall of famers or, in the case of David Ortiz, will be. Chili Davis is not a hall of famer, but if you’re going to be penciled into the starting lineup 1000 or more times as a designated hitter you have to be able to swing the bat, and in 1994 that was certainly the case.

A 39th round draft pick, Kenny Rogers is a great example of a late round success story. His longevity and skill led to playoff wins and many achievements, including one of 21 perfect games.

8. Randy Johnson vs. Atlanta Braves B-T Probability: 5.6 in 100,000

Perfect Game Johnson

As we inch closer to the top of our most unlikely perfect game countdown, you might expect to see pitchers with less experience and maybe even some names known for the perfect game they threw and nothing else. This one does not fit that picture; in fact, it doesn’t come close.

In May of 2004 Randy Johnson threw a perfect game against a Braves team that would go on to win the NL East with a 96-66 record. It was an all time great hall of famer against a lineup with some big time names, including a hall of famer in his own right. The lineup included Chipper Jones, Andruw Jones, and a very strong J.D. Drew. Andruw Jones makes an interesting hall of fame case, Chipper Jones is a hall of famer, and J.D. Drew had, if not the best, one of the two best years of his career in 2004, finding his way on base a ton.

J.D. Drew never really hit lefties all that well during his career, but in 2004 he had a 142 wRC+, a .929 OPS, and an 18.9 K% against lefties, making him a tough out for anyone, even The Big Unit.

Randy Johnson struck out 13 Braves in his perfect game, striking out every batter at least once except for the aforementioned Andruw Jones and utility-man Mark DeRosa. Johnson had a K/BB rate of 6.59 in 2004, far and away the best of his career, which really helps when it comes to finding a groove and retiring 27 of 27 batters in a major league outing. His WHIP of 0.90 in 2004 was also a career best. Command was not a problem for Johnson, which was a huge key to success for the flame-throwing lefty.

Johnson was an ace in every meaning of the word; he led the major leagues in strikeouts 8 times between 1993 and 2004. To go along with that, he has a career HR/9 of 0.9, meaning he gave up fewer than 1 home run per nine innings. Those kinds of numbers put him in a limited group of all time greats. He won 303 games and led his league in ERA+ (ERA adjusted to league ERA and player’s ballpark) 6 times. No matter what numbers you look at, it is clear that Randy Johnson was a dominant major league talent. The perfect game is simply a feather in his cap, even though it is in the upper tier of unlikelihood due to the high quality of opposition that he faced.

7. Dennis Martinez vs. Los Angeles Dodgers B-T Probability: 5.0 in 100,000

Perfect Game Martinez

Dennis Martinez led his Montreal Expos to victory on July 28th, 1991, throwing a perfect game against the LA Dodgers at Dodger Stadium. Martinez was one of the top pitchers in the NL in 1991 and, as we are starting to see now that we have reached the top 10 of this countdown, had to work his way through a strong lineup: the Dodgers finished 93-69 on the year under skipper Tommy Lasorda.

In some cases these perfect games are very random events: pitchers get hot one day, come up against a poor lineup, and make it through all 9 innings unscathed. In other cases, even though these events are extremely rare, you aren’t entirely surprised by who got the job done. Pitchers of a more recent era that we’ve seen already, like Roy Halladay and Felix Hernandez, were no strangers to working long into games and leading the league in categories such as complete games and shutouts. That’s not to say you are bound to throw a perfect game if you get through 9 innings once a month, but if you have the ability to put yourself in those sorts of positions it might make it a little easier to run into a chance at a perfect game.

Dennis Martinez fits in the same category as Halladay and Hernandez in this respect. Leading up to 1991, which was arguably the best year of his career, he threw 21 complete games between 1988 and 1990, then threw 9 more in 1991, including his perfect game. He was no stranger to working long into games, having led the league in complete games in 1979 when he was an Oriole. He completed 18 games that year and 15 the year before in 1978. It wasn’t something he did regularly throughout his 23 year MLB career, but in his best spurts from 1978-1982 and 1988-1992 he had no problem working deep into games. If there is an underlying predictive statistic for a perfect game, being able to go the distance often seems like about the best one.

The Dodgers lineup that day was both peculiar and extremely solid at the same time. The top batters in the Dodger lineup were Juan Samuel, Eddie Murray, and Darryl Strawberry. It’s an interesting group because when you think of each of those players, at least for me, the Dodgers certainly aren’t the first team that comes to mind. Fittingly enough, 1991 also ended up being the final year that each of them contributed in a star everyday role for a big league club. Eddie Murray went to the Mets and had some success, but for the most part this ended up being the final year of the production that we remember them for. Mike Scioscia was also the Dodgers catcher that day.

Interesting lineup names aside, Dennis Martinez made quick work of the Dodgers only striking out 4 but retiring all 27 batters using just 96 pitches. Martinez led the league in ERA in 1991 and only gave up 9 home runs all year in 31 games started. His perfect game is a stellar achievement in what was a career year for the Nicaraguan-born Expo.

6. Philip Humber vs. Seattle Mariners B-T Probability: 4.6 in 100,000

Perfect Game Humber

Below average pitchers throw gems from time to time, but rarely do they do what the White Sox’s Philip Humber did versus the Seattle Mariners in 2012: throw a perfect game. Going by name and reputation alone, Humber might be the least likely of the players on this list to throw a perfect game. He was a journeyman who played on five different teams in eight seasons, and made nearly half of his career appearances in long and middle relief. Humber ended with a 6.44 ERA in 2012, and that’s counting his perfect game. Save for a solid 2011 campaign in which he posted a 3.75 ERA that was actually worse than his 3.58 FIP, Humber was strikingly below average.

It is actually surprising how far down Humber is on the list, given his complete lack of name recognition and poor counting stats. Even his POBP numbers are awful; his .348 POBP in 2012 ranks as the highest of any pitcher to ever throw a perfect game. To put in perspective how unlikely Humber’s performance was: Humber averaged over one batter reaching base per inning on the season, but was able to limit the Mariners to no base runners through nine.
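As a rough sanity check on that intuition, one can ask how likely 27 straight outs would be if every plate appearance were an independent event governed only by Humber’s season-long .348 POBP. This is a naive back-of-the-envelope sketch, not the Bradley-Terry model used for the rankings: it ignores the opposing lineup entirely, the independence assumption is ours, and the function name is hypothetical.

```python
def naive_perfect_game_probability(pobp: float, outs: int = 27) -> float:
    """Chance of `outs` consecutive outs, ASSUMING each batter independently
    reaches base with probability `pobp` (a simplification, not the B-T model)."""
    if not 0.0 <= pobp <= 1.0:
        raise ValueError("pobp must be between 0 and 1")
    return (1.0 - pobp) ** outs


# Humber's 2012 POBP as quoted in the text above.
humber = naive_perfect_game_probability(0.348)
print(f"{humber * 100_000:.2f} in 100,000")
```

Because this estimate leaves out the Mariners’ hitters, it should not be expected to match the 4.6 in 100,000 figure from the Bradley-Terry ranking.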

The only reason Humber doesn’t rank closer to the top of the list is the Mariners’ lackluster lineup. If you check the box score, you’ll see a cast of random role players from the late-2000s and early-2010s: Chone Figgins, Miguel Olivo, Munenori Kawasaki, Dustin Ackley, Jesus Montero, Michael Saunders. Future all-stars Kyle Seager and Justin Smoak were still early in their careers and experiencing growing pains, and Mariners legend Ichiro Suzuki was in the midst of his worst season in Seattle before being shipped off to the Yankees in the middle of the year. The absence of any truly good players, combined with the absence of any truly bad players, resulted in a middling lineup that was susceptible, but not too susceptible, to a perfect game.


References

Baseball Reference. Retrieved from https://www.baseball-reference.com/

Baseball Almanac. Retrieved from https://www.baseball-almanac.com/pitching/piperf.shtml

Fangraphs. Retrieved from https://www.fangraphs.com/

Image Citation

Catfish Hunter. Retrieved from http://www.thepostgame.com/blog/throwback/201505/remember-catfish-hunter-throwing-ninth-pefect-game-mlb-history

Kenny Rogers. Retrieved from http://shsports.blogspot.com/2014/07/texas-rangers-kenny-rogers-pitches.html

Randy Johnson. Retrieved from https://www.thescore.com/mlb/news/1972181

Dennis Martinez. Retrieved from https://montrealgazette.com/sports/remembering-martinezs-perfect-game-with-expos

Philip Humber. Retrieved from https://sports.yahoo.com/news/philip-humber-most-obscure-perfect-182236269.html

Perfection Ranked: Greatest Perfect Games #11-15

Perfect Game Collage 2

by Mallet James and Kyle Kroboth

In the second installment of our 4-part series, in which we attempt to rank all 21 perfect games of the modern era using a numbers-only approach, we take a closer look at some of the perfect games in the middle tier of likelihood. Some terrific names and stories are featured in this section, including the legendary Roy Halladay, David Wells, and Jim Bunning. We introduced our ranking method using Bradley-Terry in our introductory post.

 

15. Mike Witt vs. Texas Rangers B-T Probability: 9.7 in 100,000

Perfect Game Witt

On the final day of the 1984 regular season Mike Witt threw a perfect game in Arlington against the Texas Rangers. It was basically a moot game going in: both the Rangers and California Angels were out of the AL West playoff race and just looking to finish the season on a high note. Mike Witt did just that, putting on a performance that was certainly an octave or two higher than anyone might have expected to close out the season.

Witt was just 24 when he threw his perfect game, rounding off a season that was by far the best of his first four years in the bigs and sparked some pretty solid years to follow in the mid-80s. By most statistical measures Witt had an above average year in 1984: he posted a 116 ERA+ and a 7.2 SO9 that was significantly above average at the time and would go on to be the best of his career.

Witt won his perfect game with a final score of 1-0, not a huge surprise because 1-0 is the most common final score of all perfect games. 6 of the 21 perfect games in the modern era ended 1-0.

Witt faced a pretty subpar Rangers lineup from a team that was 69-91 coming into the final game of the season. The core of the Rangers lineup was Gary Ward, Larry Parrish, and Pete O’Brien, each of whom posted an OBP above .330, about 10 points above average in the 1984 season. No one else who stepped in the box for the Rangers posted an above-average on-base percentage. It was not the toughest lineup Witt faced on the year, and his .306 POBP didn’t hold him back as he K’d 10 batters while taking part in only the 3rd sub-two hour perfect game that would’ve been shown on TV.

Only 8,375 fans took in the game live in Arlington making Witt’s game the worst attended perfect game of the last 35+ years.

14. Jim Bunning vs. New York Mets B-T Probability: 9.2 in 100,000

Perfect Game Bunning

The Phillies’ Jim Bunning threw his perfect game against the New York Mets on June 21, 1964. The Mets were in their third year of existence, and finished with a 53-109 record in the midst of a seven year span of futility. One would expect Bunning’s perfecto to be much further down the list; after all, we’re talking about a Hall of Famer going against the worst team in the league. In fact, the Mets’ lineup doesn’t include a single star and doesn’t even have many “oh, I remember that guy” players. But there were actually a few surprises in the Mets’ lineup that bump up Bunning’s ranking.

Leading the charge for the Mets was second baseman Ron Hunt, who would go on to start the All-Star Game later that season at Shea Stadium. Hunt, despite a complete lack of power, hit .303 and got on base at an impressive .357 clip, good enough for 32nd in the MLB. At this point, Hunt had not yet adopted his famous strategy of crowding the plate and getting hit by as many pitches as possible, which might have boosted his OBP even further (or sent him to the Disabled List!). In 1971, Hunt was hit by a pitch an astounding 50 times, a record that no other player has come within 15 of before or since.

The Mets also got an impressive season out of outfielder Joe Christopher, who slashed .300/.360/.466, an even better clip than Hunt. Christopher was a surprise for the Mets that season after being largely underwhelming in the first two years after he was selected in the expansion draft. 1964 was his only year with over 10 home runs and an average above .300. Before the Mets, he was part of other (near) perfect game history, making his debut behind Harvey Haddix in Haddix’s 12-inning near perfect game in 1959. He also came off the bench in three games in the 1960 World Series against the Yankees, securing himself a World Series ring.

Bunning’s perfect game came after a long hiatus (first in the regular season since 1922) and was surprisingly the first National League perfect game thrown in the modern era. There is not much else especially unique or distinct about Bunning’s achievement; he was a great pitcher throwing to a not so great team. But the odds of such an accomplishment are still astronomically low and Bunning’s work that day needs to be appreciated.

13. Roy Halladay vs. Florida Marlins B-T Probability: 8.1 in 100,000

Perfect Game Halladay

The 2nd perfect game in Phillies’ history was thrown by Phils legend and hall of famer Roy Halladay. Doc threw the perfect game in Florida at Sun Life Stadium, making it the 6th ever perfect game thrown on the road; only 7 of the 21 perfect games to date have been thrown away from the pitcher’s home park.

The Phillies-Marlins matchup in early 2010 featured two aces of the NL East at the time: Roy and everyone’s favorite forgotten major leaguer, Josh Johnson, who was coming off of a 2009 all star campaign in which he posted a 15-5 record. Fans in attendance may have been hoping for a pitcher’s duel and they got just that.

Wilson Valdez scored an unearned run for the Phils in the top half of the 3rd on a line drive hit by Chase Utley that was misplayed in center field by Cameron Maybin. It ended up being the only run scored in the game. The Phillies did have 7 hits in the game, 4 of which came from the 7 and 8 spots in the order (Juan Castro and Carlos Ruiz); they just couldn’t string them together to score many runs. Luckily the run they did score was more than enough for Doc to work with that day.

Now 9 spots into our perfect game rankings, we are starting to find lineups that should have been tougher for pitchers to work their way through. The Marlins went 80-82 in 2010, but batters 2-6 in the Marlins order were all quite formidable, headlined by Hanley Ramirez, a very tough out at the time. 2010 was basically the end of Hanley’s prime as one of the best shortstops in the league; his OBP was .378 in 2010 and he had a wRC+ of 127, which he has only bested two other times in his career since. The likes of Jorge Cantu, Dan Uggla, and Cody Ross were also in the lineup, and Ross was the only batter in the top 6 whom Roy did not strike out during his perfecto. It was quite a decent lineup with few no-names; the pitching performances from here on out will have required quite a bit more skill.

With a POBP of .269 at the time, Halladay ranks 6th best among pitchers who have thrown a perfect game. Even though the lineup was tougher than those we have seen so far, Halladay’s ability to keep runners off base at the time was some of the best of anyone to throw a perfect game. Halladay also may have been one of the best in recent MLB history at staying sharp through the entirety of 9 innings. Roy led all of baseball in complete games for 5 years straight from 2007-2011. His endurance and relentless nature kept him at the top of the game for many years; he had an extra gear that, when kicked in, not many others could match. He is one of only 6 players with both a perfect game and a no-hitter to their credit, and the only pitcher with both a perfect game and a postseason no-hitter.

12. Matt Cain vs. Houston Astros B-T Probability: 7.5 in 100,000

Perfect Game Cain

The San Francisco Giants’ Matt Cain twirled the second of three perfect games in the 2012 MLB season in a weekday contest versus the Houston Astros on June 13, 2012. Cain’s perfect game is regarded as one of the most dominant in history, and is tied with Sandy Koufax’s 1965 perfecto for most strikeouts in a perfect game, with 14. The Astros were in the midst of a horrible season, with a 26-36 record in mid-June. They would eventually finish 55-107, the second worst record of any team to fall victim to a perfect game. But if the lineup was so bad, why isn’t this game further down the list?

The Astros, though they were a bad team, had some big names still at the beginning of their careers: guys like Jose Altuve, J.D. Martinez, and Jed Lowrie, who have been big time contributors at various points in their careers. Other quality contributors, like first baseman Carlos Lee (who did not play in the perfect game), third baseman Chris Johnson, and pitchers J.A. Happ and Wandy Rodriguez, were dumped for prospects at the July trading deadline. Put simply, this Astros team was not quite as bad as the record indicates.

The Astros’ major issues were in the outfield and behind the plate. As great as J.D. Martinez is today, he was nothing special in 2012, when his OPS was below .700 (the past few years, it has hovered around 1.000). The rest of the outfield was made up of Jordan Schafer and Brian Bogusevic, two light hitting role players who combined for a career bWAR below zero. Catcher Chris Snyder was equally bad, and pitcher J.A. Happ contributed a season OBP of .171. Happ was pulled for a pinch hitter after only one at bat; if he had stayed in the game long enough for another at bat, Cain’s accomplishment would have been slightly more likely.

Cain himself was in the midst of an elite season, going 16-5 and twirling a 2.79 ERA on the way to his third and final All-Star appearance and first World Series ring. The Bradley-Terry model weighs his low (.271) POBP alongside the Astros’ relatively low team OBP, shifting the likelihood upward.

11. David Wells vs. Minnesota Twins B-T Probability: 7.3 in 100,000

Perfect Game Wells

Sunday, May 17th, 1998 held the 2nd perfect game in Yankee history. In a year in which David Wells led the majors in win percentage, posting an 18-4 record, he was no stranger to 9 innings of work. He threw 5 shutouts in ‘98, which was the most of any pitcher in the major leagues that year.

Wells was pitching very well coming into the game against the Twins and came up against a relatively average lineup that featured a 41-year-old Paul Molitor batting 3rd. Only one player in the Twins lineup that day would go on to post an OPS+ of over 100 that year, outfielder Matt Lawton. Lawton may have been seeing the baseball best that day against Wells at Yankee Stadium, as he stung a pair of fly balls and was one of only two Twins not to post a strikeout.

There was no dangerous on-base threat in the Minnesota 9 that day other than Lawton, but it was an American League lineup that featured more quality than we’ve seen in the back end of most of the lineups so far.

According to the Bradley-Terry model, Wells ran through a lineup that we would expect to see perfection against only 7 times in 100,000 tries, which is no small feat. Wells struck out 11 batters, leaving 10 flyouts and six groundouts to complete the 27 outs while facing the minimum.
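To put the per-100,000 figures used throughout this series in more familiar terms, the short sketch below converts them into “1 in N” odds. The helper name is hypothetical, and the inputs are simply the B-T probabilities quoted in the headings above.

```python
def one_in_n(per_100k: float) -> int:
    """Convert an 'x in 100,000' rate into the N of '1 in N', rounded."""
    if per_100k <= 0:
        raise ValueError("rate must be positive")
    return round(100_000 / per_100k)


# B-T probabilities (per 100,000) taken from the section headings.
for name, rate in [("Wells", 7.3), ("Cain", 7.5), ("Witt", 9.7)]:
    print(f"{name}: about 1 in {one_in_n(rate):,}")
```

Read this way, each of these matchups would be expected to produce a perfect game roughly once in every ten to fourteen thousand attempts.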

Maybe most impressively, Wells has been famously noted to have thrown his perfect game while hungover after a long night out in the Big Apple until somewhere around 5:30am. He rallied and toed the rubber at 1:30pm that afternoon; the rest is truly baseball history.



References

Baseball Reference. Retrieved from https://www.baseball-reference.com/

Baseball Almanac. Retrieved from https://www.baseball-almanac.com/pitching/piperf.shtml

Baseball Savant. Retrieved from https://baseballsavant.mlb.com/

Fangraphs. Retrieved from https://www.fangraphs.com/

Image Citation

Mike Witt. Retrieved from https://twitter.com/angels/status/517018568396144640

Jim Bunning. Retrieved from https://www.courier-journal.com/story/sports/mlb/2017/05/27/look-jim-bunnings-perfect-game-1964/351754001/

Roy Halladay. Retrieved from https://www.thegoodphight.com/2017/5/29/15706956/roy-halladays-perfect-game-was-seven-years-ago-today

Matt Cain. Retrieved from https://www.nbcsports.com/bayarea/giants/matt-cain-takes-playful-shot-giants-fans-over-perfect-game-attendance

David Wells. Retrieved from https://sports.yahoo.com/20-years-later-david-wells-perfect-game-seems-even-impossible-032903686.html

Perfection Ranked: Greatest Perfect Games #16-21

Perfect Game Collage

by Mallet James and Kyle Kroboth

In the first installment of a 4-part series, in which we attempt to rank all 21 perfect games of the modern era using a numbers-only approach, we take a closer look at the most likely of the perfect games thrown in Major League Baseball since 1900. Greats like Cy Young and Addie Joss fall in this section along with lesser known names like Len Barker. We introduced our ranking method using Bradley-Terry in our introductory post.

 

21. Cy Young vs. Philadelphia Athletics B-T Probability: 160 in 100,000

Perfect Game Young

Cy Young leads off our perfect game analysis fittingly enough, having thrown the first perfect game of the modern era on May 5th, 1904. If there was a single pitcher that you would bet on to have thrown one of these just based on his resume and number of appearances alone he would have to be in your top 3, if not number 1 all time. One of the greats of the game, he still holds the record for many MLB pitching counting stats that may not ever be broken. His 511 wins, 749 complete games (no one’s touching that one), and 29,565 batters faced, among many others are records that put him in a league of his own when it comes to pitching excellence and longevity.

Out of any pitcher on our list this one probably makes the most sense. He was an outstanding pitcher and did it for a long time, so you would think he would have to run into one of these sooner or later. Then again, you might say the same thing about Nolan Ryan, who threw seven no-hitters but does not find his name among the greats who have thrown a perfect game. There is no doubt a great deal of luck involved in the feat of a perfect game; Cy Young is one of the few who was so skilled that the luck did not need to work quite as hard in his case.

Young’s POBP of .251 is one of the best out of pitchers on this list and he faced a lineup that featured 3 batters with deplorable sub-.200 OBPs. There were a few formidable opponents in the lineup that day for the Philadelphia Athletics team he faced, but it didn’t prove to be enough to fend off perfection. 37 years old at the time, Cy Young is still the oldest pitcher in MLB history to have thrown a perfect game. Unlike some others in the perfect game club, he will not be even remotely remembered for just having thrown a perfect game due to the many other accolades earned in his career. He holds the title of most likely perfect game thrown on our list, certainly not a bad place to be.

20. Addie Joss vs. Chicago White Sox B-T Probability: 155 in 100,000

Perfect Game Joss

The second perfect game in the modern era was thrown by Addie Joss on October 2nd, 1908. Ranking Joss and Cy Young’s perfect games is about as straightforward as it gets; their perfect games sit on a tier separate from all other perfect games not just because of the time they were thrown (first two of the modern era), but also because they feature a strong jump in likelihood well above any other perfect game. Both pitchers are all time great Hall of Famers who found perfection when throwing to weak lineups at a time when they were dominating even some of the best lineups that Major League Baseball could put up against them.

Joss was a force on the mound, posting one of the great major league pitching careers of all time despite injuries and illnesses limiting him to only eight years of MLB ball. He even found his way into the Hall of Fame, inducted under a special exemption to the usual ten year service requirement. He still holds the record for best career WHIP at an incredible .968. Joss in his prime could be compared to present-day Max Scherzer, who has led the league in WHIP four times. In 8 years and change Joss won 160 games, while Max Scherzer has pitched around 10 full seasons in the major leagues and has a current win total of 170. When accounting for the differences in pitcher usage between eras, those numbers even out. One thing that Joss could hold over Scherzer’s head is a perfect game. Scherzer came close against the Pirates in 2015, but lost perfection with two outs in the ninth when outfielder Jose Tabata leaned into an inside breaking ball to draw an HBP.

Joss blanked a Chicago White Sox team that was fighting for a World Series berth at the end of the regular season. Looking back at their lineup that day, it doesn’t seem like they were much of a World Series team, given that their lead offensive threat, Patsy Dougherty, was batting .285 on the year with an OBP of .369 and there were many guys in the lineup with sub-.300 OBPs even this late into the season in October. Joss only needed 74 pitches to get 27 outs that day, a record low number of pitches among perfect games thrown. He only needed 3 strikeouts through 9 innings, so it is safe to say he had his defense working behind him that day. Joss is the leader in POBP out of all pitchers who have thrown a perfect game at a mark of .218, a staggering number that, when coupled with facing a weak lineup, makes Joss’s perfect game quite likely relative to many of the perfect games thrown all time.

19. Len Barker vs. Toronto Blue Jays B-T Probability: 132 in 100,000

Perfect Game Barker

Len Barker’s perfect game on May 15, 1981 against the Blue Jays seems to be a story of a hot pitcher having his way against an inexperienced lineup.

You’ll see that many other lists that rank perfect games place Barker’s at or near dead last in significance and relative quality. Len Barker is not all that memorable a name in major league pitching, and he pitched to a very young Blue Jays lineup with no names of real historical note. The eleven guys that saw a pitch from Barker that day had an average age of 25. Four of them were younger than 23, including a then 22 year old Danny Ainge, much better known for his success on and around an NBA basketball court than a baseball field. Barker ran through the lineup with extreme ease, never pitching a ball three while striking out 11.

By most accounts Barker put together a very average 11 year MLB career as both a starter and a reliever, posting a 74-76 career record along with a pitcher’s WAR of 0.3. After transitioning from the bullpen, Barker had a very nice 1980 season where he put together a 19-12 record and led the American League in strikeouts. His 3.29 FIP was 0.88 runs lower than his 4.17 ERA that year, a common theme throughout his career, where his FIP proved much better than his ERA, probably because he struck out a lot of batters when he had his best stuff.

Barker started 1981 with a 3-1 record, continuing his success from the prior year. He ran through a Blue Jays lineup that a hot pitcher should have had little to no problem shutting down. The Jays lineup that he faced had an average OBP of .277 that year, 2nd lowest of any lineup to see a perfect game in the modern era. Barker’s POBP was .299 in 1981, above the .284 average POBP of pitchers with perfect games on their resume but certainly nothing to laugh at. Barker may not be a popular name on this list, but given the timing of his performance, when he was pitching at his best, it makes sense that his likelihood of finding perfection is near the highest on our list.

The perfect game was without a doubt the high point of Barker’s MLB career. He was an All-Star in 1981, put up a solid year in 1982, and then was traded to the Braves mid-way through 1983 and never really had the same success again. He signed a very large contract with the Braves after the 1983 season, but the big man never found a way to dominate major league hitters with his power fastball game again. The peak of his career featured some strong performances; unfortunately, it did not last very long.

18. Sandy Koufax vs. Chicago Cubs B-T Probability: 120 in 100,000

Perfect Game Koufax

In September 1965, Dodgers ace Sandy Koufax threw a perfect game against the Chicago Cubs. The game is notable for its general lack of offense; Koufax’s opposing pitcher, Bob Hendley, held onto a no-hitter until the seventh inning and the only run that scored was the result of a walk, sacrifice bunt, and an overthrow on a stolen base attempt. Overall, only two runners reached base during the entire game, a major league record for a full nine inning game.

The lineup that Koufax faced was very top-heavy, with three Hall of Famers in Billy Williams, Ron Santo, and Ernie Banks. All three made the 1965 All-Star Game, with Williams and Santo each putting up elite 7.7 bWAR seasons, while an aging Banks still mustered 1.9 bWAR. Despite this murderer’s row in the middle of the order, Koufax’s performance was still relatively likely compared to other perfect games (albeit still extremely unlikely as a whole) because the rest of the Cubs’ lineup was, to put it lightly, not good. In fact, each one of the remaining six hitters in the Cubs lineup was below replacement level in 1965.

Two Cubs starters, leadoff center fielder Don Young and left fielder Byron Browne, had been called up in the days prior and made their major league debuts that night. Neither would ever make much of an impact: after 1965, Young would not play another MLB game until 1969, and Browne somehow led the National League in strikeouts in 1966 despite playing only 120 games, then bounced between the majors and minors in the years that followed. Hendley, as good as he was on the mound, was equally terrible with the bat, never reaching base and striking out in 12 of his 17 at bats in 1965. He didn’t help his own cause against the Dodgers, as he struck out in both of his at bats before being lifted in the top of the ninth for pinch hitter Harvey Kuenn, who fittingly enough also struck out.

Hendley and Kuenn weren’t Koufax’s only strikeout victims that night. In fact, 10 out of the 11 batters that stepped into the batter’s box struck out at least once. Koufax’s 14 strikeouts are a record for a perfect game, and he is the only pitcher to strike out at least one batter in each inning of his perfect game.

17. Felix Hernandez vs. Tampa Bay Rays B-T Probability: 107 in 100,000

Perfect Game Hernandez

The most recent of the 21 perfect games came on August 15th, 2012, the conclusion of a three-part saga of perfection that year featuring Humber, Cain, and Hernandez. Felix Hernandez faced off against what was, at the time, a weak Tampa Bay Rays lineup, albeit one headlined by longtime Rays standout Evan Longoria.

Though the lineup he threw to was somewhat weak, the lineup that he relied on for run support was quite a sad story in itself. At the time, the man leading the lineup in OPS and batting fourth for the Mariners was John Jaso, certainly less of a liability at the plate than behind it but still not a guy to hang your hat on to lead a major league offense. Other notable names in the Seattle lineup that day included Michael Saunders, Eric Thames, Justin Smoak, and the only current Mariner who took the field that day, Kyle Seager. Every name mentioned has swung the bat well in some part of his major league career, but that day they found a way to keep fans engaged in the final outcome with perfection in the making. The Mariners lineup was no stranger to perfect games in 2012, as we will touch on later, having been on the wrong end of one themselves. Only one run was scored in the game, which the Mariners won 1-0.

Probably the most interesting note from this game is that it was the third time in a four-year span that the Rays were the victim of a perfect game, an unfortunate stretch that probably will not be matched by another major league team any time soon. The four batters who were in the lineup for all three perfect games were Longoria, Carlos Pena, Melvin Upton, and super-utility man Ben Zobrist. Zobrist really has no business being on the short list of batters who have seen the most perfect games firsthand. His 13.6 Whiff%, 91.5 Zone Contact%, and 74.2 Chase Contact% are all well better than the major league average since 2015 and speak to how hard it is to keep him off base. Hitting a major league pitcher is tough even for those who see the ball best, so over a small sample of 9 ABs I guess it’s hard to be too surprised by anything.

A little bit about Felix: he had a POBP of .295 in 2012 (he allowed a runner on base in about 29.5% of ABs), which is about average among pitchers who have thrown a perfect game but still a strong mark among all major league pitchers. He struck out 12 in the game, the first 11 all swinging, and the game ended with a strikeout looking from Sean Rodriguez. Hernandez faced a Rays lineup with an OBP of .320 that year, slightly above the average .312 OBP among lineups that have seen perfect games. All in all, Felix’s bid was fairly average in the scope of perfect games; any of the 21 is obviously an extremely rare feat, but there was nothing too out of the ordinary in this one. 27 up, 27 down, and King Felix picked up another spectacular honor in what has been a very nice career.

16. Tom Browning vs. Los Angeles Dodgers B-T Probability: 104 in 100,000

Perfect Game Browning

The Cincinnati Reds’ Tom Browning twirled a perfect game versus the Los Angeles Dodgers on September 16, 1988. Contrary to most other victims of perfect games, the Dodgers were actually a formidable opponent and even went on to win the World Series later the same season. They featured the eventual NL MVP Kirk Gibson along with contributors like Steve Sax, Mike Marshall, and John Shelby. However, the Dodgers won more so on the back of terrific pitching, with Cy Young winner Orel Hershiser and an elite bullpen contributing to the team’s combined 2.96 ERA.

While it’s challenging to make too much of the Dodgers’ offensive limitations given that they were later crowned the best team in baseball, they featured a few hitters that skewed Browning’s perfecto towards the bottom of the list. Notably, the Dodgers put shortstop Alfredo Griffin in the leadoff spot, despite his .255 OBP at the time (that’s the pre-Moneyball era for you!). Third baseman Jeff Hamilton further damaged the Dodgers’ fortunes with a pedestrian .236 batting average coupled with a complete inability to draw walks: only 10 in 327 plate appearances. Dodgers pitcher Tim Belcher threw a three-hit gem, but was not known for his abilities at the plate and was a non-factor on that front.

Browning’s perfect game was unique in several ways. Rain caused a two hour and twenty-seven minute delay to the beginning of the game, making it one of two perfect games to undergo a rain delay and also the perfect game with the latest first pitch, at 10:02 PM local time. It was the first and only perfect game thrown on artificial turf, as well. Browning, a quality but not elite pitcher, took an unusual number of no hitters deep into ballgames. Earlier in the 1988 season, he had a no hitter going into the ninth until Tony Gwynn broke it up with one out. The next season, he took another perfect game into the ninth inning until Phillies infielder Dickie Thon knocked a double into the right-center gap.


References

Baseball Reference. Retrieved from https://www.baseball-reference.com/

Baseball Savant. Retrieved from https://baseballsavant.mlb.com/

Fangraphs. Retrieved from https://www.fangraphs.com/

Image Citation

Cy Young. Retrieved from https://www.history101.com/may-5-1904-cy-young-throws-the-first-perfect-game-in-modern-mlb-history/

Addie Joss. Retrieved from https://baseballhall.org/hall-of-famers/joss-addie

Len Barker. Retrieved from https://www.neosportsinsiders.com/lenny-barker-reflects-back-perfect-game-35-years-later/

Felix Hernandez. Retrieved from https://grantland.com/the-triangle/27-perfect-things-about-felix-hernandezs-perfect-game/

Sandy Koufax. Retrieved from https://www.mlb.com/cut4/49-years-ago-sandy-koufax-threw-perfect-game-one-hour-43-minutes/c-93862524

Tom Browning. Retrieved from https://www.cincinnati.com/story/sports/mlb/reds/2019/07/17/cincinnati-reds-150th-anniversary-tom-browning-perfect-game-los-angeles-dodgers/1762745001/

Perfection Ranked: Greatest Perfect Games

Perfect Game Probabilities

by Mallet James and Kyle Kroboth

Since 1900 approximately 200,000 major league baseball games have been played including playoffs through the 2019 season. During that time, 21 perfect games have been thrown, an incredibly rare feat. Pitchers with a perfect game on their resume range from the unqualified legends – Sandy Koufax and Cy Young – to the otherwise anonymous – Len Barker, Philip Humber, and Dallas Braden.

Previous analyses have attempted to quantify the likelihood of each perfect game. These studies often only look at it from either the pitcher’s perspective or that of the batter. To only consider the ability of batters implies their success would be consistent whether facing an ace or a replacement level pitcher with a much more limited skill set. The reverse is true if only pitching ability is considered.

A perfect game is earned by a pitcher who allows no player from the opposing team to reach base by any means. Since a regulation game lasts nine innings with three outs per inning, the pitcher faces exactly 27 batters. Denote the number of times a batter, b, comes to the plate in a game as n, with each at-bat ending in either a “success” (reaches base) or a “failure” (does not reach base). Furthermore, we assume all of his at-bats are independent with a constant probability of success p on each at-bat. Under these conditions, the number of successes in n at-bats for any batter b during the course of a game follows a binomial distribution. The probability of x successes for any batter can be calculated using the binomial formula shown in Equation 1.

(1) $P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$

Since a perfect game occurs when no batters successfully reach base over the course of the game, P(X = x) in equation 1 is P(X = 0). As a result, Equation 1 collapses to the following for each batter:

(2) $P(X = 0) = (1 - p)^{n}$

One last assumption is that batters are independent of each other. With this defined, the probabilities from Equation 2 can be multiplied together across batters to calculate the probability that the pitcher throws a perfect game. From Equation 2, what remains to be determined is the probability of success, p.
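As a rough illustration of the two independence assumptions, the sketch below evaluates the binomial probability of zero successes for each batter and multiplies the results across a lineup. It is written in Python for convenience; the function names and the three-batter lineup are hypothetical and are not taken from the authors' own analysis.

```python
from math import comb, prod


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p): x successes in n independent trips."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_never_reaches(p, n):
    """P(X = 0): the batter fails to reach base in all n trips to the plate."""
    return (1 - p) ** n


def prob_perfect_game(lineup):
    """Multiply P(X = 0) over (p, n) pairs, assuming batters are independent."""
    return prod(prob_never_reaches(p, n) for p, n in lineup)


# Made-up lineup for illustration: (probability of reaching base, trips to the plate)
toy_lineup = [(0.300, 3), (0.350, 3), (0.250, 3)]
print(prob_perfect_game(toy_lineup))
```

Setting x = 0 in `binomial_pmf` gives the same value as `prob_never_reaches`, which is the collapse from the full binomial formula to the zero-success case.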

Calculating Probability a Batter Reaches Base

One choice for p is some form of “on base percentage” or OBP. From the hitter’s perspective, this metric is found by taking the total number of times he has reached base and dividing by his total number of plate appearances:

(3) $\mathrm{OBP} = \dfrac{\text{times reached base}}{\text{plate appearances}}$

Alternatively, a metric can be defined to quantify a pitcher’s likelihood of allowing baserunners. A similar percentage to OBP, called “pitcher on base percentage” or POBP, can be calculated by Equation 4:

(4) $\mathrm{POBP} = \dfrac{\text{baserunners allowed}}{\text{batters faced}}$

This statistic measures the probability the pitcher allows a batter to reach base.

As mentioned previously, a fault in some previous studies that set out to rank perfect games is that they did not look at the interaction between the lineup of batters and the pitcher. To account for that interaction, the probability of each perfect game was calculated using the Bradley-Terry probability model given in Equation 5:

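(5) $P(\text{batter } b \text{ does not reach base}) = \dfrac{1 - \mathrm{POBP}}{(1 - \mathrm{POBP}) + \mathrm{OBP}_b}$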

Here OBP is the OBP for each individual batter in the lineup. The batting lineup for each of the 21 perfect games was pulled from Baseball Almanac. The OBP and POBP for each hitter and pitcher involved in the perfect games were calculated using Baseball Reference data for the season in which the perfect game was thrown. For the rare occurrence where the OBP for a player was zero, the player’s career OBP was imputed. To illustrate the method, the probability of the first modern-era perfect game, thrown by Cy Young in May of 1904, will be shown as an example.

Table 1 shows the batting lineup, how many at-bats each player had during the game, his OBP that season, and the probability of not reaching base using the binomial formula in Equation 1. Notice that although many batters had the same number of at-bats, their OBPs differ, so the probability that a batter does not reach base varies: the higher the OBP, the smaller the probability. Assuming each batter is independent of the other batters, we calculated the probability of a perfect game by multiplying the probabilities shown in the last column. Based on the OBP for the lineup faced by Cy Young, the probability of a perfect game was 0.000093.

Table 1: Probability batter does not reach base using OBP

Lineup AB OBP P(X = 0)
Topsy Hartsel 1 0.347 0.6530
Danny Hoffman 2 0.329 0.4502
Ollie Pickering 3 0.299 0.3445
Harry Davis 3 0.350 0.2746
Lave Cross 3 0.310 0.3285
Socks Seybold 3 0.351 0.2734
Danny Murphy 3 0.320 0.3144
Monte Cross 3 0.266 0.3955
Ossee Schrecongost 3 0.199 0.5139
Rube Waddell 3 0.164 0.5843
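The lineup-only calculation in Table 1 can be reproduced with a few lines of code. This is a minimal sketch in Python (the authors report using R), with the at-bats and OBP values copied from the table; the variable and function names are hypothetical.

```python
from math import prod

# (batter, at-bats in the game, season OBP) as listed in Table 1
TABLE_1 = [
    ("Topsy Hartsel", 1, 0.347),
    ("Danny Hoffman", 2, 0.329),
    ("Ollie Pickering", 3, 0.299),
    ("Harry Davis", 3, 0.350),
    ("Lave Cross", 3, 0.310),
    ("Socks Seybold", 3, 0.351),
    ("Danny Murphy", 3, 0.320),
    ("Monte Cross", 3, 0.266),
    ("Ossee Schrecongost", 3, 0.199),
    ("Rube Waddell", 3, 0.164),
]


def lineup_only_probability(rows):
    """Product over batters of (1 - OBP)^AB, ignoring the pitcher's skill."""
    return prod((1 - obp) ** ab for _, ab, obp in rows)


# Per-batter P(X = 0), then the product across the lineup
for name, ab, obp in TABLE_1:
    print(f"{name:<20}{ab:>3}{obp:>7.3f}{(1 - obp) ** ab:>8.4f}")

table_1_probability = lineup_only_probability(TABLE_1)
print(f"Lineup-only perfect game probability: {table_1_probability:.6f}")
```

The at-bats sum to 27, and the product should land close to the 0.000093 quoted in the text.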

As we previously stated, this probability only refers to the chances that this particular lineup fails to reach base, without accounting for the pitcher’s skill. To use the Bradley-Terry model, Young’s POBP must also be calculated. In 1904, Young recorded a POBP of 0.251. The probability, therefore, that he retires 27 straight batters in any one game that year is (1-0.251)^27, or roughly 0.0004. However, this doesn’t take into account the lineup faced: the probability would be the same for every game Young pitched that year. The probability a pitcher does not allow a base runner (PWINS) was then calculated using the Bradley-Terry model (5):

(6) $\mathrm{PWINS}_b = \dfrac{1 - \mathrm{POBP}}{(1 - \mathrm{POBP}) + \mathrm{OBP}_b}$

Here PWINS represents the probability a batter, b, does not reach base for any at-bat. For example, using the OBP for batter Topsy Hartsel, in Table 1 with Cy Young’s POBP of 0.251, Hartsel’s PWINS would be found by the following equation:

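(7) $\mathrm{PWINS}_{\text{Hartsel}} = \dfrac{1 - 0.251}{(1 - 0.251) + 0.347} = \dfrac{0.749}{1.096} \approx 0.6834$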
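The Hartsel example can be checked numerically. The sketch below assumes the Bradley-Terry form in which the pitcher's strength is 1 - POBP and the batter's strength is his OBP; that form is an inference that reproduces the 0.6834 reported for Hartsel in Table 2, not a statement of the authors' exact code. The function name `pwins` is hypothetical.

```python
def pwins(obp, pobp):
    """Bradley-Terry probability that the pitcher retires the batter.

    Assumed strengths: pitcher = 1 - POBP, batter = OBP.
    """
    pitcher_strength = 1 - pobp
    return pitcher_strength / (pitcher_strength + obp)


YOUNG_POBP_1904 = 0.251   # Cy Young's POBP in 1904, from the text
HARTSEL_OBP_1904 = 0.347  # Topsy Hartsel's OBP, from Table 1

hartsel_pwins = pwins(HARTSEL_OBP_1904, YOUNG_POBP_1904)
print(f"Hartsel PWINS: {hartsel_pwins:.4f}")

# Pitcher-only benchmark from the text: 27 straight outs at Young's POBP,
# which ignores the lineup entirely
pitcher_only = (1 - YOUNG_POBP_1904) ** 27
print(f"Pitcher-only probability: {pitcher_only:.6f}")
```

Comparing the two outputs shows why the matchup matters: the pitcher-only figure is identical for every start Young made that year, while PWINS changes with each batter he faces.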

Applying PWINS to Cy Young’s game, the updated probabilities for each batter are listed in Table 2. Multiplying the probabilities in column 4, a new probability of 0.000160 was calculated for Cy Young throwing a perfect game against this lineup. Although the increase over 0.000093 looks small, accounting for Young’s skill has a large effect on the rankings, moving his game from what would be the middle of the pack to the most likely of the 21 perfect games pitched in the modern era.

Table 2: Probability batter does not reach base using PWINS

Lineup AB PWINS P(X = 0)
Topsy Hartsel 1 0.6834 0.6834
Danny Hoffman 2 0.6951 0.4832
Ollie Pickering 3 0.7149 0.3651
Harry Davis 3 0.6818 0.3166
Lave Cross 3 0.7073 0.3538
Socks Seybold 3 0.6811 0.3157
Danny Murphy 3 0.7007 0.3440
Monte Cross 3 0.7379 0.4018
Ossee Schrecongost 3 0.7901 0.4932
Rube Waddell 3 0.8204 0.5521
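Table 2 and the resulting game probability can be approximated end to end. This is a hedged sketch in Python (the authors used R) that assumes PWINS = (1 - POBP) / ((1 - POBP) + OBP) and uses the three-decimal OBP values from Table 1, so individual entries may differ from the table in the fourth decimal place. All names in the code are hypothetical.

```python
from math import prod

YOUNG_POBP_1904 = 0.251  # Cy Young's POBP in 1904, from the text

# (batter, at-bats in the game, season OBP) as listed in Table 1
LINEUP = [
    ("Topsy Hartsel", 1, 0.347),
    ("Danny Hoffman", 2, 0.329),
    ("Ollie Pickering", 3, 0.299),
    ("Harry Davis", 3, 0.350),
    ("Lave Cross", 3, 0.310),
    ("Socks Seybold", 3, 0.351),
    ("Danny Murphy", 3, 0.320),
    ("Monte Cross", 3, 0.266),
    ("Ossee Schrecongost", 3, 0.199),
    ("Rube Waddell", 3, 0.164),
]


def pwins(obp, pobp):
    """Assumed Bradley-Terry form: pitcher strength 1 - POBP vs. batter strength OBP."""
    return (1 - pobp) / ((1 - pobp) + obp)


def perfect_game_probability(lineup, pobp):
    """Product over batters of PWINS^AB, assuming independent at-bats and batters."""
    return prod(pwins(obp, pobp) ** ab for _, ab, obp in lineup)


# Per-batter PWINS and P(X = 0) = PWINS^AB, then the product across the lineup
for name, ab, obp in LINEUP:
    w = pwins(obp, YOUNG_POBP_1904)
    print(f"{name:<20}{ab:>3}{w:>8.4f}{w ** ab:>8.4f}")

table_2_probability = perfect_game_probability(LINEUP, YOUNG_POBP_1904)
print(f"Perfect game probability: {table_2_probability:.6f}")
```

The final product should come out near the 0.000160 reported in the text, well above the lineup-only figure of 0.000093.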

The PWINS-based probability for all 21 perfect games was calculated using R and is plotted in the graph above. Of these gems, the least likely came from Charlie Robertson in 1922 (0.000009). Robertson’s perfect game illustrates the need to account for both the lineup and the pitcher’s ability. During his perfect game, he faced a Detroit Tigers team with a mean OBP of 0.3585, with 10 of 11 batters having an OBP of 0.300 or better, including baseball’s all-time career batting average leader, Ty Cobb. Couple that with Robertson’s POBP, which ranked second worst among the 21 pitchers, and it is clear why his was the unlikeliest instance of an already unlikely event.

In the coming weeks we will look at all 21 games ranked using PWINS and provide a refreshing look at some of the key facts, figures, and stats from each game.

The reader might also be wondering whether this analysis will cover games in which a pitcher was perfect through nine innings but lost the perfect game in extra innings, which has happened twice in MLB history. Harvey Haddix (1959) and Pedro Martinez (1995) both were perfect through 9 innings but allowed a baserunner in extras and lost their bid at perfection. In perhaps the greatest game ever pitched, Haddix was perfect through not just 9, not just 10, not just 11, but through 12 innings! And he lost! Martinez gave up a double to the first batter he saw in the 10th but, unlike Haddix, left the ballpark with a win after the Expos gave him a 1-0 lead in the top half of the inning. Though arguably more deserving of recognition than some of the pitchers in the perfect game club, Haddix and Martinez will not be included in our initial article series.

There is one more name to put on the shelf as well: Armando Galarraga. Unfortunately, his perfect game is not recognized by MLB due to circumstances even more unfortunate than those of either Haddix or Martinez. As such, Galarraga probably deserves an article all to himself.

Our first article in the Perfection Ranked series will cover the most likely perfect games, ranked 16-21, leading off with Cy Young’s perfect game in 1904, which was both the first thrown and the likeliest of the group. We will continue to count down to the least likely of all perfect games, Charlie Robertson’s most unlikely bid at perfection in late April 1922.

Special thanks to Andy Wiesner for the guidance on statistics for the article.



References

Baseball Almanac. Retrieved from http://www.baseball-almanac.com/

Baseball Reference. Retrieved from http://www.baseball-reference.com/

Bradley, R. A. and Terry, M. E. (1952). Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons. Biometrika 39 324-345.