Pens’ Victory Not a Repudiation of Analytics
By Marc Naples (@SuperScrub47)
As is tradition, the NHL offseason is a time for taking a step back and evaluating all things about the NHL with a little perspective. One of those things inevitably debated is the usefulness of advanced stats, or analytics. This is the most interesting debate to me, as it’s nerds doing something new, versus the old guard “who have played the game.”
The old guard wasted no time in starting the debate this summer.
Now everyone loves Ray Ferraro. His work as color commentator is excellent. I’m not aware of his nuanced thoughts on the usefulness of analytics in hockey, but let’s use the sentiment in his tweet as a proxy for general old-school thinking.
As Ferraro points out, the Pens were not a strong team in Corsi this regular season. In light of this, he makes a very large logical leap and points back to intangible qualities as important factors in the Pens’ success.
To respond to this thinking, a couple of myths about analytics must be dispelled.
First, there is no one magic stat. Corsi is a useful measure. I think of it like team batting average in baseball, or yards gained in football. It is a good indicator of which team is driving play at the root of the game, but it certainly doesn’t guarantee victory in a single game, or even a series of games.
If we really want to evaluate and forecast team success, multiple measures must be considered and contextualized. For instance, for some teams we need to consider blocked shots and perhaps consider use of Fenwick instead of Corsi. We also need to adjust raw Corsi or Fenwick for score situation, as we know of “score effects” such that a team with a lead will sit on it and let the opposition attack.
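To make the Corsi versus Fenwick distinction concrete, here is a minimal Python sketch. Corsi counts all shot attempts (on goal, missed, and blocked), while Fenwick leaves out the blocked ones. The `ShotAttempts` class, the `share` helper, and every number below are made up for illustration; they are not real team data, and no score adjustment is applied.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """Shot attempts taken by one team (hypothetical container)."""
    on_goal: int
    missed: int
    blocked: int  # attempts that the opposition blocked

    def corsi(self) -> int:
        # Corsi: every shot attempt, including blocked ones
        return self.on_goal + self.missed + self.blocked

    def fenwick(self) -> int:
        # Fenwick: unblocked attempts only
        return self.on_goal + self.missed


def share(for_count: int, against_count: int) -> float:
    """Fraction of the total attempts that belong to 'our' team."""
    return for_count / (for_count + against_count)


# Invented numbers: our team has a lot of its attempts blocked.
ours = ShotAttempts(on_goal=30, missed=12, blocked=18)
theirs = ShotAttempts(on_goal=28, missed=10, blocked=6)

corsi_for_pct = share(ours.corsi(), theirs.corsi())
fenwick_for_pct = share(ours.fenwick(), theirs.fenwick())

print(f"Corsi For %:   {corsi_for_pct:.1%}")
print(f"Fenwick For %: {fenwick_for_pct:.1%}")
```

In this invented example the team looks noticeably stronger by Corsi than by Fenwick, because so many of its attempts never got past a shot blocker. That is the kind of context a single raw number hides.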
Furthermore, stat-heads are getting better at considering shot quality. A few sites now publish “expected goals” that factor in game situation, shot distance and angle, and other details about the shot to calculate the probability of scoring. This improves prediction of success in the short term. (In the longer term, this is not as important because distributions of aggregate shots tend to normalize. Once the attempts pile up into the hundreds or thousands, it is rare for a team to systematically get relatively more high-quality shot attempts while also getting relatively fewer bad ones.) Last year, in fact, the Stanley Cup finalists Pens and Sharks were numbers one and two, respectively, in the NHL in score-adjusted expected goals from corsica.hockey.
We also know that playoff success is often better predicted by stats with a recency adjustment. Stats like Corsi measured over the final six weeks of the season tend to be much better predictors of playoff success than full-season numbers.
In short, raw, full-season Corsi is just a starting point for forecasting playoff success. Analytics is not about looking up one stat and calling it a day; it requires an intelligent, multi-faceted analysis.
A second myth is the degree to which short run results disprove the value of stats. Some say when a team wins despite losing Corsi it shows Corsi is nonsense, while some stat defenders call it luck. Neither is exactly right.
The fundamental premise of all statistical study is that outcomes in the short run are highly variable and unpredictable, but in the long run patterns emerge. What is really happening in a single game, or a seven-game series or even a month of games, is that we are within the range of variability. No statistical study, or any subjective old-school analysis, can predict highly variable results.
The more low scoring the game, as in hockey and soccer, the more unpredictable variation you get in wins and losses. Particularly in the playoffs, goals are relatively rare events, and a matter of inches or split-seconds. Even if we can say with absolute certainty that a team has a 60% chance of winning, it’s like flipping a rigged coin that should go heads 60% of the time. Forty percent of the time it will still land on tails, and runs of tails WILL happen. Do 100 coin tosses, you will almost certainly have more heads than tails with this rigged coin. Do only five or seven coin tosses, you’ll have a significant number of sets where there are more tails. Thus, fewer goals in a game means more games won by “undeserving” performances simply because there are fewer goal events for teams to prove their superior likelihood of scoring.
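The rigged-coin claim can be checked with a few lines of Python. This is a minimal sketch that treats every toss as independent with a fixed 60% chance of heads, which is a simplification of real hockey games; the function name `prob_more_heads` is just an illustrative choice.

```python
from math import comb


def prob_more_heads(n: int, p: float) -> float:
    """Exact probability of strictly more heads than tails in n independent tosses."""
    # Sum the binomial probabilities for every head count above half of n.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


for n in (5, 7, 100):
    heads = prob_more_heads(n, 0.6)
    print(f"{n:>3} tosses: P(more heads than tails) = {heads:.3f}")
```

With five tosses, heads come out ahead about 68% of the time, and with seven tosses about 71% of the time. In other words, the “wrong” side wins roughly three sets in ten. At 100 tosses, heads lead about 97% of the time.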
Understood this way, analytics remains a much stronger tool than fundamentally subjective ideas like grit and experience. The problem is that “experience, grit, and smarts” are more like ex post facto rationalizations than measurable quantities.
When we see a “weaker” team win, there is an immediate rationalization that they possessed such qualities. Not to sound too postmodern, but if you can’t measure a quality before a result, but only assert it after the fact, you have to question if the quality ever really existed in the first place.
Surely the Chicago Blackhawks this year, the press’s favorite to win the West, would’ve been the top team in terms of intangibles before the playoffs were played. After they were swept in the first round, we have to ask: was everyone wrong about the Blackhawks having those qualities, or were those qualities just not very valuable? Either answer is pretty damning to the idea of making forecasts and strategies based on perceived grit and experience. Ultimately, it’s just not very meaningful to say a team has an intangible quality. Objective analysis is based on the tangible, and the intangible is all too often a “hindsight is 20/20” rationalization.
Furthermore, intangible qualities like these are similar to “clutch” performers. Analysis after analysis across sports points to the fact that there is no such thing as clutch performers, or teams that have a talent to pull out close games. Sure, a team can do that for a while, or perhaps even a whole season, but bring that identical team back next season and their magical ability to pull out close games mysteriously vanishes. Again we must ask ourselves if we are making an ex post facto rationalization of their success, assigning them an intangible skill that didn’t really exist in the first place.
Besides, even if we concede that experience, grit and smarts are a totally real thing, a team’s regular season statistics should reflect it. Its experience and smarts should be boosting its regular season Corsi already!
Also, an unspoken assumption in Ferraro’s tweet is the unique makeup of the Pens roster. Obviously the Pens are headlined by two of the best players in the league in Sidney Crosby and Evgeni Malkin. Like a football team that doesn’t get many yards but can snatch wins on miracles from players like Barry Sanders or Odell Beckham, or a baseball team that relies on some clutch hits from a few once-in-a-lifetime hitters, this is not a strategy that teams should try to (or are capable of) copying.
Did the Pens really have these powerful intangibles on their side this spring, or did they have a unique team structure and come out on the good side of random variability with:
- Timely disallowed goals
- Shooting percentages
- A miracle performance from their number two goalie, who’s had some terrible recent playoff performances
- Habit of going long stretches against the Predators and Caps without a single shot, then scoring on their first look
- Jake Guentzel having one of the best playoffs for a rookie ever
I’m gonna say these are unpredictable, one-off events, NOT repeatable strategies. We should be comfortable labeling Pittsburgh’s run as improbable rather than throwing out theory to accommodate an exceptional result or playoff season.
When you get down to it, the whole purpose of any type of analysis is to identify patterns and extract strategies to utilize in the future. It would be a foolish conclusion for NHL teams to say we need to copy the 2016-17 Pens’ formula of “experience, grit, and smarts.”
Ferraro’s tweet admits there is grey area between analytics and old-school grit. For me, that grey is simply a combination of random variability and imperfect theory. Indeed, even if we had perfect stats and theory, only God or the Flying Spaghetti Monster can see the future.
The stat nerds didn’t nail the playoff results this year, as they did so well last year. That’s not a reason to throw out analytics in favor of worshiping the old school hockey gods to curry their favor for mystical powers of “grit.” There are many factors that go into winning a Stanley Cup, many of which are inherently unpredictable. Working with what we have, however, a nuanced, critical analysis of analytics still looks like our most powerful tool for forecasting and hockey “strategy.”