Goalies are not voodoo – if they make it to 60 or so NHL games!

This is the third in a series of related posts about goaltenders.

The motivation behind these posts is the recent interest the Oilers purportedly have in Cam Talbot – a goalie with a brilliant but short track record.

The question at hand: how risky a choice is he?

Scored on the First Shot?

To recap some of the previous looks on this topic, the first post looked at a handful of mostly successful goalies to ascertain how long it took for their career sv% to stabilize and normalize.  The conclusion from that study was that, although goalie sv% remains fairly volatile until typically 100 to 200 games into their career, the sv% of goalies at a point in their career comparable to Cam Talbot (about 50 games) actually provided a surprisingly good baseline for future performance.

This was not a rigorous study by any means, but it did provide a small measure of comfort that Cam Talbot was the real deal.  That said, the high incidence of volatility and ‘dips’ also suggested that a bad (Scrivens- or Dubnyk-like) year from him next year would not be a surprise.

So if he’s the guy the Oilers go with, perhaps having him as a tandem with the so far rock-steady Michal Neuvirth might be a good idea.

WheatNOil is Clutch

The second post arose from the first post, as reader/guest columnist WheatNOil figured it would be wise to explore the flip side of the dataset, which is to look at goalies that made it to 50 games in the NHL but washed out.

And so came the first surprise: the number of goalies who make it to 50 games in the NHL and then wash out before hitting 200 games is an exceedingly small sample!  So much so that Wheat, who had originally planned to look only at cap-era goaltending from 2005 onward, had to extend his search all the way back to 1998 just to find 17 goalies in his sample set.   Seventeen!  In almost twenty years!

So we’ve already come to a surprisingly rigorous result: by the time a goalie plays 50 games in the NHL, it appears he’s already achieved something difficult and rare, and that the likelihood of subsequent long-term failure is low.

Again, the lesson regarding Talbot should be clear.

The other data point is that, of those 17 goalies, 15 had relatively lousy sv% (average .907) at the 50 game mark.  Only two looked good at the 50 game mark, and one of those left the NHL because of a contract dispute and went on to have a fine KHL career.  So just one real surprise out of the 17.

This one will make Oiler fans particularly happy (and by happy I mean sad and frustrated): that one failure was Jussi “Rebound” Markkanen!  He started at a solid .923 sv%, but then shortly after that went off a cliff.

Nonetheless, I think that this is a very surprising result.  Given the myriad of ways in which goalies, or any player really, can fail (injury, personal issues, a misleading hot start), I did not expect the dataset to be so uniform.  I would have thought the 50 game sv%s would be all over the map.  They weren’t.  They were consistent and they were low.

So here’s our other conclusion: of the goalies who make it all the way to that difficult 50 game barrier and then fail, most of them telegraphed their failure.

NHL GMs reading this blog (all zero of you): don’t say you haven’t been warned.

And Now For Something Completely the Same

Now this third post is going to look at the broadest set of goalies comparable to Cam Talbot that we can reasonably procure: all goalies who started their careers after 2005 and made it at least to a point in their careers comparable to Cam Talbot.  We’ll look at what happened to them thereafter and learn what we can learn.

The underlying data for this work comes once again from WheatNOil, who synthesized the list of goalies, then pulled the war-on-ice data for them at 1,300 even strength shots (he calls this the “Talbot Criteria”, as this is approximately where Talbot is) and then for their careers.  Since the data source is war-on-ice, he also pulled their adjusted sv%, which gives a slightly fairer basis for comparison of these goalies.  This work could not have been done without his data-wrangling blood, sweat and tears.

Getting to 1,300 is tough

So the first thing we’ll take note of is that since 2005, WheatNOil found that only 43 goalies have started their careers in the NHL and made it to 1,300 EV saves.   That number is somewhere between 55 and 70 games for a typical NHL goalie.  The full list is at the end of the post.

(As an aside, let me reiterate that I cannot say enough nice things about a guy like Wheat, who read my original post, had some questions, and then went out and did a bunch of work to further the analysis.  Just outstanding).

I went back and pulled all the goalies (heh heh) who started their careers and played at least one game during that same time period, and there were 145. Most of those played a few games and disappeared forever.  Of those that didn’t, at least a couple (e.g. Hammond, Markstrom) started their careers recently and/or haven’t played that much yet, so haven’t had a chance to hit 1,300 EV saves.   So for the sake of argument, let’s reduce our comparative number to 140, to act as a best guess for correction.

So here’s your first data point: only about 30% (43 of about 140) of goalies made it to the 1,300 even strength shot mark.  The rest failed out before that point.

To reiterate the learning: the data you have about a goalie at about 60 games in the NHL gives you much more information than you might expect, if for no other reason than he’s already established that he’s in pretty elite company.

From 1,300 to a career

Now let’s dig a little deeper into what the 1,300 mark tells us about the rest of a goalie’s career.

First, the failure rate of our set of 43 goalies that have started since 2005 and made it to 1,300 EV saves: it numbers … one.  And to the continued delight of Oiler fans, that goalie is … Jeff Drouin-Deslauriers.

Other than that, every one of the remaining goalies is either still active in the NHL, or in a few other cases, made a decent career of it before packing it in.  And given what we know about the Oilers’ desperation for goaltenders at the time, it is highly unlikely that JDD would have made it that far with almost any other team.

I don’t know about you, but to me, that is an amazing result.  In the last decade, one goalie out of 43 made it to 1,300 EV saves and then failed out shortly thereafter.  Even that result shouldn’t really be a surprise – JDD’s sv% at that point was 6th worst on the list.  His adjusted sv% put him at third worst, followed only by our man Viktor Fasth, and career backup Curtis McElhinney.   JDD only got another 161 EV saves, maybe another 8 games, and he was gone.

What it tells me is this: if a goalie makes it to 1,300 EV saves in his NHL career and he’s still employed, he’s already proven, almost without exception, that he’s capable of being at least a backup as an NHL goalie.  If a goalie isn’t good enough, he’s going to fail before he even hits that benchmark.

This makes sense if you think about it. Sixty games is very close to a full season for a starting goalie.  How many NHL teams can afford to give a lousy goalie a season’s worth of starts to prove himself?  As the Oilers well know, even half a season of terrible goalkeeping is enough to torpedo playoff hopes.

The flip side to that is that, if you have a good sv% at 1,300 EV saves, the likelihood of failure seems small indeed.

To get to 1,300 saves, to reiterate, tells you an awful lot about a goalie because it already means he is likely the real deal.

Now take that with a caveat.  There are likely exceptions to this rule that will be showing up quite soon.  Jacob Markstrom, for example, is not on our list (hasn’t hit 1,300 EV saves), but he’s not looking good.  But he’s young, maybe he’ll rebound in a big way.  Conversely, it seems highly doubtful that Viktor Fasth will return.  So maybe I’m underestimating the failure rate by saying it’s just one.   Could be two!

That said, I’d still peg those guys as outliers – the issue is not that the 1,300 mark is guiding us wrongly, it’s simply that as with any arbitrarily placed benchmark, there will always be some unusual border condition data points that make it just over that benchmark that shouldn’t have.  And in Fasth’s case, on a comparative basis his adjusted sv% at 1,300 stinks.  So he was already telegraphing what looks like his impending dropout.

The flip side is that if you look at where the goalies who are good to elite were at the 1,300 EV save mark – they’re all solid, especially when using adjusted sv%.  Talbot’s adjusted sv% of .937 puts him on par with Crawford and Rinne.

And of the top 10 goalies by this metric, the ones that form what is in effect Talbot’s peer group – Jonas Hiller, James Reimer, Tuukka Rask, Cory Schneider, Corey Crawford, Pekka Rinne, Braden Holtby, Semyon Varlamov, Carey Price – how many of them seem like they’ll be failures to you?  How many strike you as elite?

The Numbers – Crunchy and Delicious

At long last, here is the part my fellow nerds (both of them) have no doubt been impatiently waiting for: the statistics.  Woo!

How well does sv% at 1,300 shots predict the rest of a goalie’s career?

The first thing I did to give some reasonable validity to the dataset is to look at just the goalies who’ve made it to at least 3,000 EV saves (around 120 games, give or take).  We already know the failure rate is extremely low, so to be clear, we’re not filtering this dataset based on success, just setting a standard so that we are looking at goalies with enough track record to draw some reasonable conclusions.

That brings the sample set down to 29 goalies – a little on the modest side, but also bear in mind that this sample size of 29 comes from a population of either 43 or 145 depending on how you look at it.  That’s a healthy sample.

I also had to do a little bit of jiggery pokery to generate the data set I’m going to look at.  WheatNOil pulled the sv% and adjusted sv% for the first 1,300 saves, and for the career.   But that’s a bit of an incomplete dataset, because when you’re comparing the two, you’re basically comparing “A” (first 1,300 saves) with “A+B” (full career, which includes the first 1,300 saves), and that will push the correlations to be artificially high.

But manually extracting the data for those 29 goalies for rest of career proved to be overly time-consuming, so I faked it instead!  I calculated “B” (sv% for rest of career) arithmetically.  Because the exact number of saves is not 1,300 but at least 1,300, this calculated sv% is not exact.  But it is very close, close enough for our purposes.
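For anyone who wants to check the arithmetic, here is a minimal sketch of that back-calculation in Python.  The function name and the fixed 1,300-shot cutoff are illustrative assumptions of mine; since the real cutoff for each goalie is “at least 1,300”, treat the output as an approximation.

```python
def rest_of_career_sv(sv_at_mark, career_sv, career_shots, mark_shots=1300):
    """Back out "B" (rest-of-career sv%) from "A" and "A+B".

    Assumes the early sample is exactly mark_shots shots, which is only
    approximately true for the real data.
    """
    if career_shots <= mark_shots:
        raise ValueError("career must extend past the benchmark")
    career_saves = career_sv * career_shots   # saves in A+B
    early_saves = sv_at_mark * mark_shots     # saves in A
    return (career_saves - early_saves) / (career_shots - mark_shots)


# James Reimer, from the data table at the end of the post:
# .936 at 1,300, .923 career, 4,333 career shots.
reimer_rest = rest_of_career_sv(0.936, 0.923, 4333)
print(round(reimer_rest, 3))  # 0.917, matching the table's rest-of-career value
```

The subtraction is done on saves rather than on the percentages themselves, because the two samples are very different sizes and a straight difference of percentages would weight them equally.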

First up, how does the sv% change from the first 1,300 saves to the rest of the goalie’s career?  Surprisingly little.  If we look at the delta from first 1,300 to rest of career, here’s what the numbers look like:

At 1,300 saves:

Mean Sv% | Median Sv% | Stdev Sv%
0.9222 | 0.9220 | 0.0084

For rest of career (average total career shots for this sample set are 6,920):

Mean Sv% | Median Sv% | Stdev Sv%
0.9239 | 0.9254 | 0.0056

Bear in mind, these are even strength save percentages, so they look on the high side compared to the normally published save percentages.  But look!  The sv% of these goalies at 1,300 saves is remarkably similar to the sv% of the rest of their career.  I think it’s still fair to say that goalies are voodoo before they start in the NHL.  But once they’ve faced 1,300 even strength NHL shots, they really aren’t voodoo anymore.  You pretty much know what you’ve got.

Of course, you could argue that this change might average out to zero while varying so widely across the sample set that the zero is misleading.  That isn’t the case either.  If we look at just the sv% deltas (that is, the difference between pre- and post-1,300 sv% for each goalie), what you see is:

Average change in sv% +0.00177
Median change in sv% +0.00259
Stdev of change in sv% 0.00781
  • Biggest increase in career: +0.0169 (Cam Ward, .902 to .919)
  • Biggest decrease in career: -0.0187 (James Reimer, .936 to .917)

It should also be noted that James Reimer was the only significant decrease in the set, which makes sense since his initial sv% was a high .936 (the highest in the unadjusted dataset).

Across the whole set, Reimer was the only goalie who fell by more than 0.01 from 1,300 to rest of career, while four jumped by more than 0.01.  Ten of the goalies fell, 19 increased.  By a two to one margin, sv% at 1,300 EV saves underestimates a goalie’s rest-of-career sv%.
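The summary figures above can be rechecked from the published table.  This is a minimal sketch that takes the unadjusted “Rest Delta” column from the 29-goalie table at the end of the post and recomputes the mean, median and spread; the variable names are my own, and the inputs are the rounded table values rather than the raw data.

```python
import statistics

# Unadjusted "Rest Delta" column from the 29-goalie table, in thousandths
# of sv% (so -18.6 means -0.0186). Values are rounded as published.
rest_deltas = [
    -8.2, -18.6, 2.5, 4.2, -3.6, 0.0, -5.5, -6.2, 1.1, -8.8,
    1.2, -4.8, 8.0, -1.2, 3.5, 6.6, 5.2, 14.2, 3.0, -1.5,
    7.2, -8.6, 4.5, 5.8, 8.4, 16.9, 2.6, 11.8, 11.7,
]

mean_delta = statistics.mean(rest_deltas)
median_delta = statistics.median(rest_deltas)
stdev_delta = statistics.stdev(rest_deltas)  # sample standard deviation

fell = sum(1 for d in rest_deltas if d < 0)
fell_over_10 = sum(1 for d in rest_deltas if d < -10)
rose_over_10 = sum(1 for d in rest_deltas if d > 10)

print(f"mean {mean_delta:+.2f}, median {median_delta:+.2f}, stdev {stdev_delta:.2f}")
print(f"fell: {fell}, fell by >10: {fell_over_10}, rose by >10: {rose_over_10}")
```

Because the table is rounded, one goalie (Rinne) shows as exactly 0.0 there, so a count of risers taken from it can come out one lower than the 19 quoted above.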

You can validate the remarkable stability of this dataset by looking at the histogram:

[Figure: sv% delta, first 1,300 vs rest of career]

War-on-ice also adjusts sv% to compensate for changes in difficulty of shots faced, which would equalize for goalies who started life behind, say, the Edmonton Oilers “defense” vs the LA Kings DEFENSE.

The same data as above is shown, but this time for adjusted sv%:

At 1,300 saves:

Mean Sv% | Median Sv% | Stdev Sv%
0.9301 | 0.9290 | 0.0063

For rest of career:

Mean Sv% | Median Sv% | Stdev Sv%
0.9302 | 0.9312 | 0.0050

[Figure: adjusted sv% delta, first 1,300 vs rest of career]

(This converts to a narrower range but a more distinctly positive bias relative to the unadjusted sv%, plus there’s an interesting bifurcation that probably bears further investigation.  Volunteers?)

Average change in sv% +0.00007
Median change in sv% 0
Stdev of change in sv% 0.00611
  • Biggest increase in career: +0.0087 (Henrik Lundqvist, .928 to .937)
  • Biggest decrease in career: -0.0143 (James Reimer, .941 to .927)

You can see that using adjusted sv% actually stabilizes the dataset further.  The main difference here is that most of the goalies who looked lousy initially but had good careers turned out to have been playing behind tire fires.  Presumably this is why they got the chance to further their careers despite their initially pedestrian numbers.

And last, just for gits and shiggles, here are the correlations between sv% and rest-of-career sv%:

Measure | Adj Sv% | Sv%
Correlation | 0.43 | 0.44
p-value | 0.021 | 0.018

To put that in context, these correlations are roughly on par with the predicted correlation you see with Corsi 5×5, which in a hockey context is quite high.  In layman’s terms, sv% at 1,300 tracks the rest of the goalie’s career sv% with a correlation of about 0.44 (the share of variance it “explains” is the square of that, about 20%, which the addendum below digs into). And when you consider how little these move overall, there’s not a lot of splainin’ required.

ADDENDUM: I got into a Twitter discussion with a fellow who took issue with the data above because of the correlation.

In technical terms, the correlation above is “Pearson r” (product-moment correlation), which measures how strongly career sv% moves in step with sv% at 1,300 saves, on a scale from -1 to 1.  In this case, it’s about 0.43.

The related calculation is the square of this number, typically written as r^2 (coefficient of determination), which in this case is about 20%.  You could say then that sv% at 1,300 only accounts for about 20% of the career, and therefore the rest of it is random, and the dataset is therefore not useful.
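To make the two numbers concrete, here is a minimal sketch that recomputes r and r^2 for adjusted sv% from the 29-goalie table at the end of the post.  The `pearson_r` helper is my own, and rest-of-career values are rebuilt as “adj sv% at 1,300 plus rest adj delta” from the rounded table, so expect the result to land close to the 0.43 reported above rather than match it digit for digit.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# (adj sv% at 1,300, rest-of-career adj delta), both in thousandths,
# copied row by row from the 29-goalie table.
rows = [
    (943, -9.4), (941, -14.3), (939, 0.0), (938, 0.0), (937, -6.0),
    (937, -5.8), (935, -1.4), (935, -1.2), (934, 0.0), (933, -7.0),
    (932, 0.0), (932, -8.4), (930, 2.7), (929, -2.4), (929, 2.3),
    (929, 1.3), (928, 1.7), (928, 8.7), (927, 6.0), (927, -1.5),
    (927, 6.0), (927, -10.3), (926, 9.0), (924, 8.1), (923, 4.8),
    (923, 4.5), (921, 3.9), (921, 8.2), (919, 2.6),
]
adj_at_1300 = [start for start, _ in rows]
adj_rest = [start + delta for start, delta in rows]

r = pearson_r(adj_at_1300, adj_rest)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

Squaring a correlation in the 0.4s always gives something near 0.2, which is where the “about 20%” figure comes from.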

This is wrong.  This is a classic example of why you always dig into your data, why you always visualize your data, why you always try to understand your data and don’t just rely on a single measure.

In this case, I’ve referred to this data set as “remarkably stable”.  What I mean by that is that when you break the data up into groups, the membership in the groups doesn’t change much.  For example, if you broke this up by the median, only four goalies switch groups (and one of them goes from the bottom of the top to the top of the bottom).

If you divide them into tertiles (call them elite, average, backup), then again you find that very few goalies switch tertiles.  NONE move two tertiles.  In other words, elites can become average or vice versa, backups can become average or vice versa, but in this group, an elite never becomes a backup, and vice versa.
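The tertile check can be sketched in a few lines.  This is an illustration under my own assumptions: the function names are made up, ties are broken by input order, and the six-goalie sample is hypothetical, not the real dataset.

```python
def tertile_labels(values):
    """Label each value 0 (top third), 1 (middle) or 2 (bottom) by rank.

    Ties are broken by input order, which is an arbitrary choice.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 3 // n
    return labels


def tertile_moves(early, later):
    """Count goalies who stayed put, moved one tertile, or moved two."""
    steps = [abs(a - b) for a, b in zip(tertile_labels(early), tertile_labels(later))]
    return {
        "stayed": steps.count(0),
        "moved_one": steps.count(1),
        "moved_two": steps.count(2),
    }


# Hypothetical six-goalie example (NOT the real dataset).
early = [0.940, 0.935, 0.930, 0.925, 0.920, 0.915]
later = [0.934, 0.928, 0.926, 0.930, 0.921, 0.918]
print(tertile_moves(early, later))  # {'stayed': 4, 'moved_one': 2, 'moved_two': 0}
```

A “remarkably stable” dataset in the sense used here is one where `moved_two` is zero and `moved_one` is small.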

You can create a dataset where the tertile, or quartile, or whatever, is a perfect predictor of membership in a following quartile, but has zero correlation.

Zero correlation but a stable predictive dataset nonetheless.   In my Twitter discussion, I pointed this out: the reasonable correlation and high significance are reflecting the stability of the dataset.  The r^2 basically is telling you that the precision of the mapping is not as high as you’d like it to be.  It’s there, but it’s in wide bands.  Hence … tertiles.
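One way to build such a dataset, as a toy sketch only: make the second value the square of the first.  The numbers below are invented for illustration and have nothing to do with goalies.  The group a point starts in tells you exactly which band it ends up in, yet the linear correlation is zero because the relationship is symmetric.

```python
# Toy data: x runs symmetrically around zero and y is x squared.
xs = list(range(-4, 5))
ys = [x * x for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
# Numerator of Pearson r; if this is zero, r is zero.
covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def x_group(x):
    """Tertile of x in this toy sample: three lowest, three middle, three highest."""
    return "low" if x <= -2 else "high" if x >= 2 else "mid"


def y_band(y):
    """Split y into a low band (0 or 1) and a high band (4 and up)."""
    return "high" if y >= 4 else "low"


bands_by_group = {}
for x, y in zip(xs, ys):
    bands_by_group.setdefault(x_group(x), set()).add(y_band(y))

print(abs(covariance_sum) < 1e-9)  # True: no linear correlation
print(bands_by_group)  # {'low': {'high'}, 'mid': {'low'}, 'high': {'high'}}
```

Each starting group maps to exactly one band, so group membership is a perfect predictor even though r is zero.  The goalie data is the milder version of this: a real but loose correlation, with stable groups.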

That’s why you take the time to understand your data.  That’s why you don’t rely on a single measure!

You can read more about the relevance (and misuse) of r^2 here: http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-do-i-interpret-r-squared-and-assess-the-goodness-of-fit

Conclusion

The conventional wisdom, that you need hundreds of games before you know what a goalie is going to be, is wrong.

By the time you have 1,300 even strength NHL shots worth of data on a goaltender, you have a surprisingly reliable predictor of the rest of that goalie’s career.  Especially if you use adjusted sv%.

This is not a result I was expecting when I first started investigating!

A very large part of that is simply that getting to 1,300 shots is an achievement unto itself.  About 70% of goalies who started in the NHL since 2005 failed to reach that mark.

One thing you’re probably wondering about is … what about Ben?  Well, by this measure, he was a pretty darn good goalie early in his career.  He fought the puck a lot and had a terrible year last year – but that’s not unusual based on the volatility in the early career of a number of the goalies that I looked at.  A ‘dip’ was not uncommon.  Maybe the pressure of being a starting goalie for the first time?  Ben’s a smart guy, a good guy, a fighter, and wants to win.  I’d bet on him having a rebound next year.

Unless of course the Markkanen and JDD experience, both Oilers and the two glaring exceptions to some very strong rules, represent some sort of Oiler curse.  Nah, can’t be.  If there was a curse, the draft lottery proved it’s gone, right?  I’d say bet on a Scrivens rebound.

And as for Talbot?  He’s achieved a sv% of .933 and an adjusted sv% of .937 in a little over 1,300 EV shots.  He’s tracking to be elite.  Out of goalies who’ve made it as far as he has since 2005, and given his peer group in this dataset, even for him to turn out to be “just” an average NHL goalie would make him a significant low outlier.

As long as the price isn’t too high, getting Talbot is a smart bet. Eddie Lack too.

One (Non-)Caveat

By the way, in a recent Lowetide thread on the applicability of this data to a guy like Cam Talbot, the issue was raised as to whether Talbot’s age was a concern.  I do not think it is.  By eye, the starting age for most of the goalies in the sample set was mid-twenties.  Goalies who started earlier were not that numerous.  Furthermore, those who started early did not typically play very much as youngsters, and often reached the 1,300 save benchmark around the age of 26 or 27 – very much the same as Talbot.

So I’d say that this dataset is quite representative and applicable to Talbot, and actually may fit less well with younger goalies.  Though if someone wants to actually go in and do the work of running the numbers on age to either confirm or disconfirm the eyeball test, I would certainly encourage and welcome that.

Data Tables

The full set of 43 goalies (plus Cam Talbot), with sv% and adjusted sv% at (at least) 1,300 EV shots, sorted by adjusted sv%.  Values are in thousandths, so 935 means .935.

Name | Sv% @ 1300 | Adj Sv% @ 1300
Jonas Hiller | 935 | 943
James Reimer | 936 | 941
Anton Khudobin | 934 | 940
Tuukka Rask | 932 | 939
Cory Schneider | 927 | 938
Cam Talbot | 933 | 937
Corey Crawford | 929 | 937
Pekka Rinne | 928 | 937
Braden Holtby | 935 | 935
Semyon Varlamov | 931 | 935
Carey Price | 927 | 934
Jhonas Enroth | 928 | 933
Thomas Greiss | 928 | 932
Jimmy Howard | 925 | 932
Steve Mason | 925 | 932
Ben Scrivens | 927 | 930
Sergei Bobrovsky | 922 | 930
Niklas Backstrom | 923 | 929
Mike Smith | 920 | 929
Devan Dubnyk | 917 | 929
Josh Harding | 922 | 928
Eddie Lack | 921 | 928
Henrik Lundqvist | 916 | 928
Al Montoya | 914 | 928
Jonathan Bernier | 922 | 927
Michal Neuvirth | 921 | 927
Jaroslav Halak | 919 | 927
Jonas Gustavsson | 916 | 927
Ben Bishop | 921 | 926
Anders Lindback | 917 | 926
Justin Peters | 911 | 926
Jonathan Quick | 920 | 924
Robin Lehner | 919 | 923
Ondrej Pavelec | 911 | 923
Cam Ward | 902 | 923
Brian Elliott | 915 | 921
Antti Niemi | 914 | 921
Darcy Kuemper | 919 | 920
Karri Ramo | 911 | 919
Joey MacDonald | 906 | 919
Peter Budaj | 904 | 919
Jeff Deslauriers | 910 | 918
Viktor Fasth | 909 | 917
Curtis McElhinney | 908 | 917

The 29 goalies (>3,000 career shots), with calculated remainder of career and deltas, changes greater than 0.01 highlighted:

Name | Adj Sv% @ 1300 | Adj Career Sv% | Adj Sv% Delta | Rest of Career Adj Sv% | Rest Adj Delta | Career Shots | Sv% @ 1300 | Career Sv% | Sv% Delta | Rest of Career Sv% | Rest Delta
Jonas Hiller | 943 | 935 | -8 | 934 | -9.4 | 8730 | 935 | 928 | -7 | 927 | -8.2
James Reimer | 941 | 931 | -10 | 927 | -14.3 | 4333 | 936 | 923 | -13 | 917 | -18.6
Tuukka Rask | 939 | 939 | 0 | 939 | 0.0 | 7077 | 932 | 934 | 2 | 934 | 2.5
Cory Schneider | 938 | 938 | 0 | 938 | 0.0 | 4543 | 927 | 930 | 3 | 931 | 4.2
Corey Crawford | 937 | 932 | -5 | 931 | -6.0 | 7495 | 929 | 926 | -3 | 925 | -3.6
Pekka Rinne | 937 | 932 | -5 | 931 | -5.8 | 9256 | 928 | 928 | 0 | 928 | 0.0
Braden Holtby | 935 | 934 | -1 | 934 | -1.4 | 4782 | 935 | 931 | -4 | 930 | -5.5
Semyon Varlamov | 935 | 934 | -1 | 934 | -1.2 | 6764 | 931 | 926 | -5 | 925 | -6.2
Carey Price | 934 | 934 | 0 | 934 | 0.0 | 10932 | 927 | 928 | 1 | 928 | 1.1
Jhonas Enroth | 933 | 929 | -4 | 926 | -7.0 | 3017 | 928 | 923 | -5 | 919 | -8.8
Jimmy Howard | 932 | 932 | 0 | 932 | 0.0 | 8160 | 925 | 926 | 1 | 926 | 1.2
Steve Mason | 932 | 925 | -7 | 924 | -8.4 | 7783 | 925 | 921 | -4 | 920 | -4.8
Sergei Bobrovsky | 930 | 932 | 2 | 933 | 2.7 | 5287 | 922 | 928 | 6 | 930 | 8.0
Niklas Backstrom | 929 | 927 | -2 | 927 | -2.4 | 8642 | 923 | 922 | -1 | 922 | -1.2
Mike Smith | 929 | 931 | 2 | 931 | 2.3 | 8874 | 920 | 923 | 3 | 924 | 3.5
Devan Dubnyk | 929 | 930 | 1 | 930 | 1.3 | 5376 | 917 | 922 | 5 | 924 | 6.6
Josh Harding | 928 | 929 | 1 | 930 | 1.7 | 3107 | 922 | 925 | 3 | 927 | 5.2
Henrik Lundqvist | 928 | 936 | 8 | 937 | 8.7 | 15212 | 916 | 929 | 13 | 930 | 14.2
Jonathan Bernier | 927 | 931 | 4 | 933 | 6.0 | 3848 | 922 | 924 | 2 | 925 | 3.0
Michal Neuvirth | 927 | 926 | -1 | 925 | -1.5 | 3777 | 921 | 920 | -1 | 919 | -1.5
Jaroslav Halak | 927 | 932 | 5 | 933 | 6.0 | 7615 | 919 | 925 | 6 | 926 | 7.2
Jonas Gustavsson | 927 | 921 | -6 | 917 | -10.3 | 3101 | 916 | 911 | -5 | 907 | -8.6
Ben Bishop | 926 | 932 | 6 | 935 | 9.0 | 3943 | 921 | 924 | 3 | 925 | 4.5
Jonathan Quick | 924 | 931 | 7 | 932 | 8.1 | 9685 | 920 | 925 | 5 | 926 | 5.8
Ondrej Pavelec | 923 | 927 | 4 | 928 | 4.8 | 7654 | 911 | 918 | 7 | 919 | 8.4
Cam Ward | 923 | 927 | 4 | 928 | 4.5 | 11637 | 902 | 917 | 15 | 919 | 16.9
Brian Elliott | 921 | 924 | 3 | 925 | 3.9 | 5704 | 915 | 917 | 2 | 918 | 2.6
Antti Niemi | 921 | 928 | 7 | 929 | 8.2 | 8669 | 914 | 924 | 10 | 926 | 11.8
Peter Budaj | 919 | 921 | 2 | 922 | 2.6 | 5680 | 904 | 913 | 9 | 916 | 11.7
15 thoughts on “Goalies are not voodoo – if they make it to 60 or so NHL games!”

  1. You should put Talbot into that first data table. It really highlights how good he’s been in comparison to the whole data sample!

  2. Excellent series of posts.

    One small concern I would have with Talbot: looking at the rest of career adj sv% delta in the last table there seems to be a bit of regression towards the mean happening. For the most part the guys with the highest adj sv% after 1300 shots are the one who drop a bit over their career and the guys on the lower end after 1300 see their sv% improve more. There are a few exceptions (Rask and Schneider on the high end and Gustavsson on the low end) but I wouldn’t be surprised to see Talbot’s adj sv% drop a few points and definitely wouldn’t count on it going up.
    That being said even if it does drop I think he is still a good bet and will be pretty excited if the oilers do puck him up.

    1. Yes, that’s a fair point. That said, one of the things that I looked at (small sample) in the first post was the career trajectory of sv%. It was actually pretty common for goalies to start really high, correct downward in a big way, and then improve. I believe that’s why you see the 50 game sv% as slightly underestimating the career sv%. But in the end, we’re still talking probabilities, not certainties.

      1. The presence of what may be a somewhat predictable dip after the first 50 games is intriguing and, to me at least, somewhat surprising. It may be worthwhile to see if that same pattern holds up with a larger sample, or see if there are any other development patterns to be found by looking at sv% by games played.
        I’ve seen studies looking at aging curves but I wonder if a gp curve would be more telling for goalies. As most goalies come into the league a little later than players(at least I think they do) they are more likely to be near their physical peak and therefore less likely to see improvements from getting bigger or stronger. The improvements then would come from experience and the mental side which would be more affected by games played (or shots faced) than by age.
        Just a theory in my head right now but I may run some numbers this week to see if there is anything there. I suspect games played to be a bigger factor early in the career and age to be a bigger factor in the decline.
        Anyways this isn’t directly related to the excellent work you and wheatoil have done but one of the concerns I’ve seen about Talbot is that he is older so his numbers may not be as reliable and this may assuage (or intensify) those concerns.

    2. Further to G Money’s response, a good reference point is an even-strength save percentage of around .922 – .924. If you’re above that range, you’ve an above average starter in the NHL. So Talbot has some buffer room if he does drop a bit.

      That said, nothing is certain, of course. It’s always a bet.

    1. He didn’t make the cut. So far he’s only faced about 500 shots (all situations) in his career, so he’s got a ways to go to make the benchmark that we used of 1,300 even strength shots. Given his results (.915), I’d certainly say it looks more favourable than not, but early days for sure.

  3. Weighing in on the stats debate, iirc it is correct that the relationship only explains 20% of the variance in sv%. If one wanted to know the precise value, this would not be a reliable predictor. However as you point out, setting up three groups provides further insight – quite interesting I might add – but that is a separate analysis from the relationship itself, another processing step, if you will. The relationship really does not apply in that situation.

    1. Precision is the correct issue at hand. When you have a low r^2, it is less an indication of the model, and more an indicator of the *precision* of the model’s predictions. Significant p means the model’s relationship is strong, but low r^2 means the prediction is not at all precise.

      But that’s right on point here – at no point am I trying to use the 1,300 sv% to predict a value for the career sv%. The r^2 doesn’t support that.

      Rather, the only predictions I’m making here are categorical – e.g. if a goalie is elite/top tertile at 1,300, then it is highly likely he’ll be top tertile i.e. elite in his career. In essence, a very broad prediction – but one fully supported by both the correlation and the significance of the correlation of the relationship.

      Hopefully, that makes sense.

      Equally to the point, hopefully it makes sense in the context of my earlier addendum, which was to emphasize the importance of looking at p, r, r^2, and visualizing the data as well – not just relying on one number. (Strictly speaking, I suppose I should have visualized the residuals too!)

      1. You were clear in outlining your approach. Looking beyond the equation is always preferred. One needs to understand what is driving any relationship. You demonstrated that by parsing the data and unearthing some strong evidence on goalie performance over time. Great stuff.

        Conversely, great stats can arise from a questionable equation. The classic example is the one where the old USSR economy was highly correlated to the Black Sea water level. It followed that if the water level was manipulated lower it would trigger an economic collapse. Sometimes stats are voodoo and those that accept (or reject) them at face value are foolish.

        Liked the idea of statistically examining a specific question. Looking forward to more.
