This is adapted from a post over at Lowetide.ca a couple of days ago, but now that I can polish and archive it here, I will.
Here’s a first look at the shot quality heatmaps I’m generating.
The methodology for generating the heatmaps is this:
- Divide the rink into 2×2 ft buckets (the league location data is 1×1 ft, but that is too fine a mesh IMO because each bucket ends up with too little data in it. 4×4 would put more data in each bucket but is too coarse, so 2×2 is an OK compromise).
- Count every shot (by shot type) in each of those buckets. Data is for the last three seasons, 140K+ shots in all. The league captures five types of shots: slap, snap, wrist, backhand, and tip-in.
- Calculate the sh% from each of the buckets.
- Ignore buckets with < 30 shots.
- Calculate the mean, median, and stdev of the sh% for the buckets.
- Define ‘high danger’ locations as being any bucket where the sh% is > mean + 1 stdev.
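The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the code behind the charts: the input layout (tuples of `x`, `y`, `is_goal` in feet), the function names, and the use of the population standard deviation are all my assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

BUCKET_FT = 2    # bucket edge length in feet (2x2 buckets)
MIN_SHOTS = 30   # buckets with fewer shots than this are ignored


def bucket_of(x, y, size=BUCKET_FT):
    """Map a shot location to the lower-left corner of its size x size bucket."""
    return (int(x // size) * size, int(y // size) * size)


def bucket_sh_pct(shots, min_shots=MIN_SHOTS):
    """shots: iterable of (x, y, is_goal) tuples (hypothetical layout).

    Returns {bucket: sh%} for buckets with at least min_shots shots.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [shots, goals]
    for x, y, is_goal in shots:
        tally = counts[bucket_of(x, y)]
        tally[0] += 1
        tally[1] += int(is_goal)
    return {b: 100.0 * g / n for b, (n, g) in counts.items() if n >= min_shots}


def high_danger(sh_pct):
    """Return (buckets above mean + 1 stdev of bucket sh%, the threshold).

    Assumption: population stdev (pstdev); the sample stdev would also be a
    reasonable reading of the method.
    """
    values = list(sh_pct.values())
    threshold = mean(values) + pstdev(values)
    return {b for b, v in sh_pct.items() if v > threshold}, threshold
```

To split by shot type, one option is to add the shot type to the bucket key, so each (bucket, type) pair gets its own count and sh%.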
First, a heatmap showing the sh% for all locations on the ice (buckets with at least 30 shots).
Let’s contrast this with the low/medium/high danger areas that war-on-ice uses for its scoring chance metric. The key thing to notice here is that the war-on-ice definition of ‘scoring chance’ does capture most, but not all, of the high danger shooting areas. It also captures a lot of low danger shots.
[In fairness, the war-on-ice metric does include temporal information, rush shots and rebounds in particular, that makes it quite a bit more sophisticated than just a location based method. My main point here is not that their methodology is bad. It is that it could be improved by adding shot type and finer grained shot location data to the analysis.]
By getting to this finer grain of information, you can see where the real danger spots are. What you should really notice is how noisy it is. The true scoring chance information in the resulting analysis is buried in an awful lot of noise. And that’s with a massive dataset (140K+ shots) behind it.
This should also make you wonder about previous ‘shot quality’ correlations with shot metrics data. The established thought process is that scoring chances and shot metrics align so closely (e.g. read this article by Eric Tulsky) that you don’t need to look at shot quality separately.
But there’s a problem.
If you take a small dataset (say, just a few thousand shots as e.g. Eric T, Vic F, and Tom Awad have done when they looked at the correlation between shot data and scoring chance data), you get even more noise than what I’m showing, way more noise than signal.
So when you take ordinary shot data, then layer in a highly noisy dataset, then run the correlation – surprise, you’ll see a high correlation. The same would be true of any dataset built by perturbing the original dataset with noise.
In other words, previous analyses that have shown a high correlation between shots and scoring chances may well have concluded the wrong thing. They saw a high correlation, but the correct answer was not “therefore shot data captures scoring chance data”.
The correct answer perhaps was “the scoring chance data was so noisy that it did not add to the shot metric data”.
Same result. Two vastly different conclusions.
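To make the noise argument concrete, here is a toy simulation. It is a sketch under made-up assumptions, not a rerun of anyone’s analysis: each game gets a random shot count, and every shot is then labelled a ‘scoring chance’ by a coin flip, so the label carries no shot quality information at all. The function names, the 15 to 45 shot range, and the 40% chance rate are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def shots_vs_noise_chances(n_games=2000, chance_rate=0.4, seed=1):
    """Correlate per-game shot counts with purely random 'chance' counts."""
    rng = random.Random(seed)
    shots, chances = [], []
    for _ in range(n_games):
        s = rng.randint(15, 45)  # hypothetical shots in one game
        # Each shot is tagged a 'chance' at random, independent of location
        # or shot type, so the tag is pure noise layered on the shot count.
        c = sum(rng.random() < chance_rate for _ in range(s))
        shots.append(s)
        chances.append(c)
    return pearson(shots, chances)


if __name__ == "__main__":
    print(round(shots_vs_noise_chances(), 2))
```

With these parameters the correlation comes out high (about 0.8 in expectation) even though the ‘chance’ label is noise by construction. A high correlation on its own cannot tell you whether the chance data is capturing something real.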
Now look at chart 2, refined to include only the high danger locations.
This is the same dataset, but now I’ve applied a simple filter: keep only buckets that have a high sh% (which, as noted earlier, I’ve defined as one stdev above the mean sh% for all the buckets from graph 1, which puts the threshold at 10.3%).
This shape is quite a bit more constrained than “home plate plus arrow”. (It also highlights the fact that even with this much data, you’ll inevitably get data artefacts, like the three buckets sitting out in nowhereland, no doubt a result of low shot count and a few fluky goals).
Flukes excepted, you can now make a strong argument that shots from these locations ARE the shots that are truly the ‘dangerous chances’.
I would guess (and this is the hypothesis I will be testing) that if you only include this higher grade data in your scoring chances, the effect on shot metrics will be more significant and pronounced than when Tulsky or Awad ran their analyses. Each of those used just a few thousand shots, which is an almost trivial amount of data.
I won’t post the rest of the individual shot type charts again for the moment, but my intent is to go one step beyond this chart, because the picture becomes even more variegated and interesting when you look at the high danger locations by shot type.
As you’d expect, a slap shot from 25 feet is more dangerous than a backhand from 20 feet, but you can’t make that distinction from this chart.
But I can and I will!