Don’t use EVGA for evaluating defensemen. Just don’t!

There are a few folks out there (you might know who you are, depending on hibernation status) who keep insisting that EVGA/60 (even-strength goals against per 60 minutes) is a reasonable way to measure the effectiveness of a defender.

To which I say: it’s not. So stop doing that!

Those who fail to learn from history are doomed to repeat it

From the deep dark history of fancystats, we know that goal rates in general are a problematic statistic.

There aren’t very many goals in a game relatively speaking, so it’s generally a small sample size, even when you’re looking at an entire season.

Goals also happen (or don’t) for all kinds of chaotic reasons, and their incidence is dominated by goaltending effects (for good or for bad). So that also makes it a noisy dataset.

Small and noisy is not what you want in your statistical data. (Probably not your kids either, but you’re kind of stuck that way)

In fact, that’s a huge part of the reason why the idea of using shot metrics instead of goals to evaluate teams and players was developed in the first place. Shots and shot attempts turn out to occur way more often (= larger sample size) and are less subject to random variation (= less noisy).

Large and quiet are what you want in your statistical data. (Probably your dog too, but I’d bet you’re not that lucky)

EVGA is no different. At the individual player level, it is a small – tiny, really – sample dataset. It is very noisy, dominated by chaotic effects and goalies.  So it should be clear … that makes it a bad metric for evaluating defensemen.
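To put rough numbers on "small and noisy", here is a minimal sketch. It assumes events arrive roughly like a Poisson process (a simplification, not something the data proves), in which case the relative standard error of a count is about 1/sqrt(count). The 1,200 EV minutes and both rates are illustrative assumptions, not figures pulled from the data set.

```python
import math


def relative_standard_error(rate_per_60: float, minutes: float) -> float:
    """Approximate relative standard error of a rate, treating events as Poisson."""
    expected_events = rate_per_60 * minutes / 60.0
    return 1.0 / math.sqrt(expected_events)


# Hypothetical inputs for one defenseman-season (assumptions, for illustration only)
EV_MINUTES = 1200.0
GOALS_AGAINST_PER_60 = 2.3
SHOT_ATTEMPTS_AGAINST_PER_60 = 59.0

goal_rse = relative_standard_error(GOALS_AGAINST_PER_60, EV_MINUTES)
corsi_rse = relative_standard_error(SHOT_ATTEMPTS_AGAINST_PER_60, EV_MINUTES)

print(f"goals: +/-{goal_rse:.1%}, shot attempts: +/-{corsi_rse:.1%}")
# goals: +/-14.7%, shot attempts: +/-2.9%
```

Under those assumptions, the goal rate carries roughly five times the relative uncertainty of the shot attempt rate over the same ice time, and that is before any goalie effect gets layered on top.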

Now … how can we demonstrate that?

The method in the madness

Here’s what I did:

  • I looked for two goalies on the same team that had both played a reasonable number of games (>30), but with significantly different EV sv%
  • I looked for one or more defensemen that had played in front of those two goalies for most of the season
  • I calculated a number of defensive stats for one of those defensemen, including EVGA, and compared how they changed when the goalie changed, to get a sense of just how volatile and goaltending-dominated a statistic EVGA is.
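The steps above can be sketched in a few lines. This is only an outline of the idea, not the actual spreadsheet work: the game log rows, the goalie labels, and the `per_game_averages` helper are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-game rows for one defenseman: (goalie, CA60, EVGA/60).
# These numbers are invented; they are not from war-on-ice.
game_log = [
    ("Goalie A", 58.0, 0.0),
    ("Goalie A", 63.0, 3.1),
    ("Goalie A", 61.0, 2.9),
    ("Goalie B", 55.0, 4.0),
    ("Goalie B", 59.0, 2.0),
]


def per_game_averages(rows):
    """Average each stat over games, grouped by goalie (every game counts equally)."""
    by_goalie = defaultdict(list)
    for goalie, ca60, evga60 in rows:
        by_goalie[goalie].append((ca60, evga60))
    return {
        goalie: {
            "games": len(games),
            "CA60": mean(g[0] for g in games),
            "EVGA60": mean(g[1] for g in games),
        }
        for goalie, games in by_goalie.items()
    }


splits = per_game_averages(game_log)
for goalie, stats in splits.items():
    print(goalie, stats)
```

Comparing the two rows of `splits` stat by stat is the whole experiment: same defenseman, same team, different goalie.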

Take a quick look at highly distinct goalie tandems this season, and a couple stand out: Mason/Emery in PHI, and Kuemper/Dubnyk in MIN. Since MIN changed goalies midway, there are going to be some natural biasing effects over the season (Kuemper played early, Dubnyk late) that I figured I might want to avoid, so I went with the Mason/Emery duo. Mark Streit played almost the entire season, so he seemed like a good candidate for this case study.

For the record, Ray Emery’s 2014-15 EV sv% was 91.27, and Steve Mason’s was 94.37.

All the data for this study came from the awesome folks at war-on-ice.

Numbers: crunched and delicious

So how does that look? Mark Streit played 67 games last season in front of either Emery (24 games) or Mason (43 games). These were full games only; I dropped the partials to avoid having to deconstruct individual games.

Here are Streit’s averaged EV stats for those 67 games:

| | CF% | SCF% | SCA60 | SA60 | CA60 |
|---|---|---|---|---|---|
| Total | 47.6 | 49.2 | 28.21 | 33.06 | 59.37 |

Mark Streit overall gives up about 59.4 / 33.1 / 28.2 shot attempts, shots, and scoring chances against per 60 at even strength. (Or more correctly, his team does while he’s on the ice).

He comes up shy of sawing off on Corsi (47.6%) and better at scoring chances (49.2%). The EVGA/60 corresponding to this even steven performance is 2.33.

(NB: these stats are not cumulative; they are per-game averages. I wanted a measure that weights each game equally when comparing the two goalies, rather than a cumulative one. If you use full-season numbers, you’ll see something somewhat, but probably not very, different, because those weight each shot rather than each game equally.)
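Here is a small sketch of the difference between the two kinds of averaging. The two games and their ice times are hypothetical, chosen only to make the gap visible.

```python
# Two hypothetical games: (EV minutes on ice, shot attempts against while on ice)
games = [(20.0, 15), (10.0, 12)]


def per_60(events: float, minutes: float) -> float:
    """Convert a raw count over some minutes into a per-60 rate."""
    return events * 60.0 / minutes


# Per-game average: compute each game's rate, then average the rates,
# so every game counts the same regardless of ice time.
game_weighted = sum(per_60(e, m) for m, e in games) / len(games)

# Cumulative: pool all events and all minutes first, then compute one rate,
# so games with more ice time (and more events) count for more.
pooled = per_60(sum(e for _, e in games), sum(m for m, _ in games))

print(game_weighted, pooled)  # 58.5 54.0
```

The short, busy second game pulls the per-game average up more than it pulls the pooled rate, which is the "somewhat, but not very, different" effect in miniature.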

Now, let’s look at those same stats for Streit in front of the PHI starting goalie, Steve Mason:

| | CF% | SCF% | SCA60 | SA60 | CA60 |
|---|---|---|---|---|---|
| Mason | 47.1 | 48.6 | 29.15 | 34.22 | 60.82 |

So … Streit when playing in front of Mason gives up a bit more in the way of EV shot attempts, shots, and scoring chances, and has a weaker CF% and SCF%. This makes sense if you consider that Mason, as the undisputed starting goalie, is more likely to face the tougher opposition.

But wait! EVGA tells us this:

EVGA/60 = 1.94

Despite looking slightly worse (but still pretty consistent) by every other metric, Streit has miraculously improved as a defenseman! He’s now giving up goals at a rate 16% less than before. Wow!

Now how about in front of Emery?

| | CF% | SCF% | SCA60 | SA60 | CA60 |
|---|---|---|---|---|---|
| Emery | 48.5 | 50.2 | 26.52 | 30.99 | 56.77 |

Everything is better under Emery. Small sample warning, wide error bars, etc. but better. Not radically better, but better. Fewer chances of every type against, and the puck is moving in the right direction to a greater extent.  Probably played all the Buffalo games!

What does EVGA say? (Ring-ding-ding-ding-dingeringeding!)

EVGA/60 = 3.04

Yikes! Despite doing better by every other metric, Streit has somehow managed to become a horribly poor defenseman, giving up fully 30% more goals per 60 than his average, and about 57% more than in front of Mason (put the other way, his rate in front of Mason is 36% lower)!

Damn.
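As a sanity check on the splits, the overall numbers should be nothing more than a games-weighted blend of the Mason and Emery numbers, since every game counts equally. A quick sketch using the rounded figures quoted above (the helper name is mine):

```python
def games_weighted_mean(splits):
    """Blend per-goalie averages back into an overall per-game average.

    `splits` is a list of (games_played, per_game_average) pairs.
    """
    total_games = sum(games for games, _ in splits)
    return sum(games * value for games, value in splits) / total_games


# (games, value) for Mason then Emery, taken from the figures in this post
evga60_overall = games_weighted_mean([(43, 1.94), (24, 3.04)])
ca60_overall = games_weighted_mean([(43, 60.82), (24, 56.77)])

print(round(evga60_overall, 2), round(ca60_overall, 2))  # 2.33 59.37
```

Both land right on the overall figures, so the 2.33 "average" is just 43 games of one goalie and 24 games of another stirred together.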

Here’s what the stats look like overall when charted (I’ve adjusted the scale so that they are all similar, but the ratios are unchanged). The key thing to notice is that ALL of the other metrics are quite stable … except EVGA, which changes dramatically depending on goalie.

| | CF% | SCF% | SCA60 | SA60 | CA60 | EVGA/60 |
|---|---|---|---|---|---|---|
| Diff (Mason vs. Emery) | -2.8% | -3.1% | +9.9% | +10.4% | +7.1% | -36.3% |
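For anyone who wants to check the Diff row, it is the percentage change of each Mason number relative to the corresponding Emery number. A minimal sketch using the rounded values from this post; the results match the row to within about a tenth of a point, and I am assuming the small gaps come from rounding of the inputs.

```python
mason = {"CF%": 47.1, "SCF%": 48.6, "SCA60": 29.15,
         "SA60": 34.22, "CA60": 60.82, "EVGA/60": 1.94}
emery = {"CF%": 48.5, "SCF%": 50.2, "SCA60": 26.52,
         "SA60": 30.99, "CA60": 56.77, "EVGA/60": 3.04}


def pct_change(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# Mason relative to Emery, stat by stat
diff = {stat: pct_change(mason[stat], emery[stat]) for stat in mason}

# The stat that moves the most when the goalie changes
most_volatile = max(diff, key=lambda stat: abs(diff[stat]))

for stat, change in diff.items():
    print(f"{stat}: {change:+.1f}%")
print("most volatile:", most_volatile)  # most volatile: EVGA/60
```

The swing in EVGA/60 is more than three times the size of the next largest swing, in a stat the defenseman supposedly controls.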

What are we to conclude from this?

Either:

  • We assume that EVGA/60 is a valid measure of a defenseman. In that case, Mark Streit is an incredible defenseman in front of Steve Mason, but a terrible one in front of Ray Emery, and he manages to accomplish this manic depressive performance in a single season, with the same team, while giving up more shots and chances in front of Mason.

OR

  • We accept that EVGA/60 doesn’t measure defensemen, really at all.  Goalies, and a bunch of other chaos – yes. Defensemen? No. Unequivocally no.

Shall I drop my mike?

Note that this example was not cherry-picked. It came to the top of the table for the reasons I noted above, so I ran the numbers and they are what they are. Just for completeness, here are the same numbers and charts for Suter/Kuemper/Dubnyk. You’ll note that CF%/SCF% remain relatively stable. The Wild did really tighten down their game later in the season, with a big drop in chances against, though that applied both for and against, as you can tell from the CF% and SCF%.

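| | CF% | SCF% | SCA60 | SA60 | CA60 | EVGA/60 |
|---|---|---|---|---|---|---|
| All | 50.0 | 50.8 | 24.2 | 28.9 | 55.7 | 1.91 |
| Dubnyk | 49.4 | 51.1 | 22.8 | 27.1 | 55.2 | 1.60 |
| Kuemper | 50.8 | 50.6 | 26.0 | 31.2 | 56.2 | 2.30 |
| Diff (Dubnyk vs. Kuemper) | -2.8% | +1.0% | -12.3% | -13.1% | -1.8% | -30.4% |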

Either way, you can see that EVGA remains far and away the most volatile metric, hugely affected by goalies, and almost certainly a bunch of other stuff too.

So don’t use it for evaluating defensemen. Don’t do it!
