On Monday night, the EvolvingWild twins, Josh and Luke Younggren, discovered that the shot location data the NHL provides to the public has been grossly inaccurate through the early part of this young season. This is a significant problem for fans and public analysts of the game who rely on metrics and models that use shot location data in their calculations.
Josh and Luke announced the discovery in a series of tweets from their Twitter account, @EvolvingWild.
This was just the tip of the iceberg, one of many examples of shots being recorded farther from the net than they actually were. The twins note that the NHL listed this Anthony Mantha goal as scored from 6 feet out, when we can see him poke the puck into the net as it sits loose by Anton Khudobin’s pads.
This Ryan Johansen goal is another example where he pokes in the puck as it sits almost on the goal line. The NHL recorded this as a shot from 9 feet out.
The inaccuracy is clear in these two examples and also when we step back and look at the data league-wide.
In the graphic above, we can see a big difference in the shots mapped around the crease. In the visualizations for the two previous seasons, many shots crowd the crease and some are even in the blue paint, while shot locations from this season appear to be pushed back away from the crease.
From this visualization created by Bryan Bastin (@projpatsummitt), we can see the overwhelming number of teams that are outperforming their xG, including the Canucks. In total, 24 of 31 teams are scoring more goals than expected, according to Evolving-Hockey.
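For readers curious how a count like that is tallied, here is a minimal Python sketch. The team names and totals are placeholders made up for illustration, not real figures from Evolving-Hockey, and the function name `goals_above_expected` is hypothetical.

```python
# Placeholder numbers for illustration only; these are not real team totals.
team_totals = {
    "Team A": {"goals": 30, "xg": 24.5},
    "Team B": {"goals": 22, "xg": 23.1},
    "Team C": {"goals": 27, "xg": 21.8},
}

def goals_above_expected(totals):
    """Return goals minus expected goals (xG) for each team."""
    return {team: t["goals"] - t["xg"] for team, t in totals.items()}

diffs = goals_above_expected(team_totals)

# A team is "overperforming" when it has scored more goals than its xG total.
overperformers = [team for team, d in diffs.items() if d > 0]
print(f"{len(overperformers)} of {len(diffs)} teams are above their xG")
```

In a normal season you would expect the league to split more evenly between teams above and below their xG, which is why a lopsided count hints at a problem with the inputs.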
When we combine the two video examples with this league-wide data, the problem is clear.
Why does this matter?
The inaccuracy is clear, but you may be left wondering what the big deal is. It matters because shot location data is integral to hockey’s growing analytics community, made up of fans and public analysts like the Evolving-Hockey twins, who use shot locations to produce xG, shot maps, high-danger chances and other visualizations.
I know that for myself and other writers on this site, xG (expected goals) plays a large role in analyzing team and player performance during a game or over the course of the season. While shot locations have never been perfectly recorded, this level of inaccuracy is unfamiliar and significantly hurts those evaluations.
Besides members of the public, hockey teams are very likely to be using this inaccurate data as well. This is concerning because unlike us fans and armchair GMs, there are consequences to decisions made based on this data. These decisions may be related to ice time, team transactions and even future contract negotiations.
This may sound a touch overdramatic to some who aren’t as familiar with the role of shot locations in evaluation throughout the league, but I ask that you consider reading this Hockey Graphs article if you’re curious about the importance of metrics like xG that rely on accurate shot location data.
For those unfamiliar, xG has been found to be the best predictor of future goal scoring, which is why it matters so much. While Corsi plays a role in evaluation, xG is viewed by some, including myself, as the more telling metric for performance.
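To see why recorded distance matters so much, here is a toy Python sketch of a distance-only xG curve. This is not how Evolving-Hockey or any other public model works; real models use many more inputs, and the coefficients below are assumptions picked purely for illustration.

```python
import math

# Hypothetical coefficients chosen only for illustration; real public xG
# models use many more features and are fit to historical shot data.
INTERCEPT = -1.0
DISTANCE_COEF = -0.05  # change in log-odds per foot of distance

def toy_xg(distance_ft: float) -> float:
    """Goal probability from a one-feature logistic curve on shot distance."""
    log_odds = INTERCEPT + DISTANCE_COEF * distance_ft
    return 1.0 / (1.0 + math.exp(-log_odds))

at_crease = toy_xg(1.0)    # puck poked in near the goal line
as_recorded = toy_xg(9.0)  # the same shot if it is logged as 9 feet out
print(f"xG at 1 ft: {at_crease:.3f}, xG at 9 ft: {as_recorded:.3f}")
```

Even in this simplified version, the same shot is worth less xG when it is logged farther from the net, so a league-wide push-back in recorded distance would leave most teams scoring more than "expected."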
Why is this happening?
As of now, there isn’t a definitive answer. There have been many theories as to why the locations are being pushed back, but at this point they are only speculation. It’s unclear whether the league even knew about the problem before Monday night, and it’s important to note that this change in shot location tracking was never announced by the league.
The league has since acknowledged the problem and is likely looking into the API and software involved in tracking shot locations.
What will be done to fix it?
In a perfect world, the NHL would go back to tracking shot locations the way it did in previous seasons and correct the shot locations from games already played this season. Otherwise, websites that run public models, like Evolving-Hockey, HockeyViz, NaturalStatTrick, and MoneyPuck, which give us xG and other tools, may have to adjust their models to somehow account for the increased distances.
In the coming days, it will be interesting to see how the NHL, Evolving-Hockey and the other public hockey analytic websites react to this discovery. On Tuesday night, ESPN’s Greg Wyshynski said, “The NHL tells ESPN that it’s investigating an apparent change in the way shot distance is being measured in its game summaries. This week, the hockey analytics community noted a dramatic shift in shot distance, including video evidence that shots in the crease were being measured as having been several feet away. This could corrupt advanced stats like expected goals, and goalie analytics like high-danger saves. After observing the videos, the NHL is looking at potential catalysts.”
Since that comment on Tuesday, Josh and Luke have spotted what they say is the first correctly recorded shot location of the year, in Wednesday night’s Toronto – Washington game.
According to them, the problem may be fixed, but we have yet to hear from the league, which will surely comment on the matter once a solution has been reached.