Here’s what I’m working on:
– Data acquisition: converting, QAing, and publishing the entire NHL Play by Play as a CSV, with shot locations rolled in from the NHL RTSS feed. Do all leagues publish their data across multiple pages with no simple key (e.g. eventid) for combining them, or is it only the NHL?
– Temporal Corsi … analysis based on the time characteristics of shot data. My goal is to create a metric that combines rush shots, rebounds, and cycles into a single measurement that summarizes the timing information behind a team's shot metrics.
Since rushes and rebounds are currently being used in shot quality calculations, this naturally led to:
– Shot quality analysis, using location, shot type, and temporal Corsi measures. Shot quality, or rather, not accounting for shot quality, has always been a controversial aspect of shot metrics. It’s “unintuitive” to most hockey observers to say that shot quality doesn’t matter. There is work being done to measure and include shot quality. However, most current (e.g. war-on-ice) shot quality measures use location and time, but not shot type. This seems like a fatal flaw to me, and my data analysis will show why.
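On the data-acquisition item above: when two pages describe the same events but share no event ID, one workable approach is to match on a composite key built from fields both pages carry. The sketch below is only an illustration. The field names (`period`, `seconds`, `team`, `type`, `x`, `y`) are hypothetical and are not the actual NHL column names, and it assumes the two sources agree on clock time and list same-second events in the same order.

```python
from collections import defaultdict


def composite_key(event):
    """Stand-in for a missing event ID (hypothetical field names)."""
    return (event["period"], event["seconds"], event["team"], event["type"])


def attach_locations(plays, shots):
    """Attach x/y from `shots` to matching rows in `plays`.

    Returns (merged_rows, leftover_location_rows). Leftovers are location
    rows that matched no play and are worth a manual QA pass.
    """
    # Group location rows by key, preserving file order within each group
    # so that two events sharing a key are matched first-come, first-served.
    pending = defaultdict(list)
    for shot in shots:
        pending[composite_key(shot)].append(shot)

    merged = []
    for play in plays:
        queue = pending.get(composite_key(play))
        if queue:
            loc = queue.pop(0)
            merged.append({**play, "x": loc["x"], "y": loc["y"]})
        else:
            # No location available (e.g. a faceoff or an unmatched shot).
            merged.append({**play, "x": None, "y": None})

    leftovers = [row for queue in pending.values() for row in queue]
    return merged, leftovers
```

Returning the leftovers instead of silently dropping them keeps the QA step honest: any location row that fails to match points to a clock or team mismatch between the two pages.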
And a couple of mental exercises:
1 – Though I’m not a gambler at all, the data above makes it easy to create a shot-based simulation of every game and produce probabilities of certain outcomes.
2 – A Bayesian estimator of goalie quality (this has been done before, but it’s a way of getting my head wrapped around Bayesian statistics).
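For the goalie exercise, the simplest Bayesian starting point I know of is a beta-binomial model: treat each shot as a save-or-goal trial, put a Beta prior on the goalie's true save percentage, and update it with observed saves and goals. This is a minimal sketch, not a finished estimator. The prior values in the usage note are illustrative assumptions, not numbers fitted to league data, and the model ignores shot quality entirely.

```python
from dataclasses import dataclass


@dataclass
class BetaPrior:
    alpha: float  # pseudo-count of prior saves
    beta: float   # pseudo-count of prior goals against


def posterior_save_pct(saves, shots, prior):
    """Posterior mean save percentage under a Beta prior.

    With a Beta(alpha, beta) prior and binomial data, the posterior is
    Beta(alpha + saves, beta + goals), whose mean is returned here.
    """
    if saves < 0 or shots < saves:
        raise ValueError("need 0 <= saves <= shots")
    goals = shots - saves
    a = prior.alpha + saves
    b = prior.beta + goals
    return a / (a + b)
```

As an example, `posterior_save_pct(19, 20, BetaPrior(915, 85))` takes a goalie who stopped 19 of 20 shots and pulls the raw .950 back toward the assumed prior mean of .915, because 20 shots carry little weight against a prior worth 1,000 shots.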