A Summary of #OTTanalytics

This past weekend, Carleton University played host to the biggest hockey analytics conference organized to date.  Michael Schuckers of St. Lawrence
University and Shirley Mills of Carleton University hosted the event, and it was
a great success, drawing talent from across the continent and across the eras of
hockey analytics.

Read on past the jump for a summary of all of the presentations.

Warm Up

Rob Vollman of Hockey Prospectus, author of
Hockey Abstract, opened the conference with a talk on the Past, Present and
Future of Hockey Analytics.  Vollman
covered the history of statistics within hockey, from the introduction of +/-
in the 1940s to the introduction of save percentage in the 1980s.  He then traced the history of advanced
statistics on the internet, from Iain Fyffe’s Puckerings in 2001 to the present, before
turning to the future.

First Period

The first panel in the morning focused on hockey web data
resources.

David Johnson presented “Hockey Statistics: From Raw
Data to WOWY,” talking about the process he uses to turn the raw RTSS data into
his stats.hockeyanalysis.com website.  He
discussed the issues he encounters and how he converts raw game sheets into over a billion
data points.  He then showed off
his With Or Without You (WOWY) stats, discussing how Jakub Voracek has always been very good.  He closed with the observation that these are not
advanced statistics; they are just being re-applied in new ways.
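For readers who have not seen WOWY before, here is a minimal sketch of the idea: split a team's ice time into segments, then compare shot-attempt share when two players are together against when each plays without the other.  This is not Johnson's actual pipeline; the `Segment` structure, the field names and the player labels are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of play (hypothetical structure, not the RTSS format)."""
    on_ice: frozenset      # skaters on the ice for the team of interest
    shots_for: int         # shot attempts for during the segment
    shots_against: int     # shot attempts against during the segment


def cf_pct(segments):
    """Corsi-For% over a list of segments, or None if no attempts occurred."""
    cf = sum(s.shots_for for s in segments)
    ca = sum(s.shots_against for s in segments)
    total = cf + ca
    return None if total == 0 else 100.0 * cf / total


def wowy(segments, player, teammate):
    """Split segments into together / player alone / teammate alone."""
    together = [s for s in segments
                if player in s.on_ice and teammate in s.on_ice]
    player_without = [s for s in segments
                      if player in s.on_ice and teammate not in s.on_ice]
    teammate_without = [s for s in segments
                        if teammate in s.on_ice and player not in s.on_ice]
    return {
        "together": cf_pct(together),
        "player_without": cf_pct(player_without),
        "teammate_without": cf_pct(teammate_without),
    }


# Made-up segments for two placeholder players, "A" and "B".
example_segments = [
    Segment(frozenset({"A", "B"}), shots_for=6, shots_against=4),
    Segment(frozenset({"A"}), shots_for=3, shots_against=7),
    Segment(frozenset({"B"}), shots_for=5, shots_against=5),
]
example_wowy = wowy(example_segments, "A", "B")
```

In this toy data, "A" does much better with "B" than without, while "B" holds up alone, which is the kind of pattern a WOWY table is meant to surface.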

A.C. Thomas of War-on-Ice talked about
how they build their website and how they go beyond the RTSS datasheets from
the NHL.  Thomas showed that they parse
data from ESPN and Sportsnet, which is likely being provided by
the NHL; from TSN, for transaction information on when a player is not on the
roster; and from USA Today, for contract information.

I (Josh Weissbock) presented web resources on hockey
players and teams in non-NHL Leagues (CHLStats.com).  This talk was a quick overview of where the
data comes from, issues with the data, and what statistics we can derive or
estimate from it.  I then talked about
some of the prospect projects that we at Canucks Army have undertaken using this data, such as
predicting player success rates.

Second Period

The second panel after lunch was focused on new advancements
in statistics within hockey.  This was
the most math-heavy segment.

Sam Ventura (the other half of War-on-Ice) talked about his
idea for Expected Goals.  With Corsi-For%
being the best predictor of Goals-For% for teams and Scoring-Chances For% being
the best predictor of Goals-For% for forwards, his idea was to weight shot
attempts with logistic regression to create an Expected Goals-For% for
teams.  The results seemed to work well
and so far pass the three rules of the hockey analytics “sniff test”:

  1. Buffalo has to be so bad they are off the chart
  2. Toronto has to be bad to draw negative reactions
  3. Detroit has to be somewhere in the top 5.
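To make the “weight shot attempts with logistic regression” idea concrete, here is a rough sketch.  It is not Ventura's model: the talk summary does not list his features, so this uses a single assumed feature (shot distance, in tens of feet), a hand-rolled gradient-descent fit, and made-up training data.  Each shot attempt gets a goal probability, and a team's Expected Goals-For% is its share of the summed probabilities.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(goal) = sigmoid(b0 + b1 * x) by batch gradient descent on log-loss."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y   # gradient of log-loss w.r.t. z
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def expected_goals(distances, model):
    """Sum of predicted goal probabilities over a list of shot attempts."""
    b0, b1 = model
    return sum(sigmoid(b0 + b1 * d) for d in distances)


def xgf_pct(dist_for, dist_against, model):
    """Expected Goals-For%: share of total expected goals belonging to the team."""
    xgf = expected_goals(dist_for, model)
    xga = expected_goals(dist_against, model)
    return 100.0 * xgf / (xgf + xga)


# Made-up training data: shot distance (tens of feet) and whether it scored.
TRAIN_DIST = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 1.0, 2.5]
TRAIN_GOAL = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
model = fit_logistic(TRAIN_DIST, TRAIN_GOAL)

# A team taking three close attempts while allowing three distant ones.
team_xgf_pct = xgf_pct([1.0, 1.5, 2.0], [3.0, 4.0, 5.0], model)
```

Both teams have three attempts here, so raw Corsi-For% would be 50%, but the weighted version credits the team taking the closer shots.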

Alex Diaz-Papkovich gave the presentation that drew the most positive reaction.  He is a Master’s
student at Carleton University and used a machine learning technique known as
“Association Rules” to look at the Corsi of individual players (for and against) and
estimate the likelihood that they are causing those results.  He applied it to the Toronto Maple Leafs and
demonstrated that Tyler Bozak was bad and likely causing the bad results on the
ice, while Morgan Rielly and Jake Gardiner were a good defensive pairing.
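For anyone unfamiliar with association rules, the standard measures are support, confidence and lift.  The sketch below shows those three on hockey-flavoured data; it is not Diaz-Papkovich's implementation.  The encoding of each shift as a set of items (players on the ice plus an "outshot" tag) is an assumption, and the player labels P1 to P3 and the shifts themselves are invented.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Made-up shifts: who was on the ice, plus "outshot" if the team
# lost the shot-attempt battle during that shift.
shifts = [
    {"P1", "P2", "outshot"},
    {"P1", "outshot"},
    {"P1", "P3", "outshot"},
    {"P2", "P3"},
    {"P2"},
    {"P1", "P3"},
    {"P3"},
    {"P2", "P3", "outshot"},
]
p1_rule = rule_metrics(shifts, {"P1"}, {"outshot"})
```

A lift above 1 means the team is outshot more often with P1 on the ice than it is overall; a lift below 1 means the opposite.  In this toy data the rule P1 -> outshot has a lift of 1.5.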

Matt Cane went beyond Corsi and examined Weighted
Shots.  Cane brought up the joke that
Corsi is dead … which has some truth to it, as we no longer look at raw Corsi
but rather variants such as relative Corsi or Score-Adjusted Corsi.  Cane started from Tom Tango’s weighted-Shots
For%, which gives more value to goals, and expanded upon it to create a
Score-Adjusted-weighted-Shots-For%, which seems to do even better at future
predictions.  The next step is applying SAwSF% to
players and incorporating Quality of Competition or Teammates.
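As a rough illustration of how a weighted-shots measure differs from raw Corsi, the sketch below counts goals more heavily than other shot attempts.  The default weights (1.0 for a goal, 0.2 for any other attempt) are an assumption for illustration rather than values confirmed in the talk, and the score adjustment that Cane layered on top is not implemented here.

```python
def weighted_shots(attempts, goals, goal_weight=1.0, attempt_weight=0.2):
    """Weighted shot total: goals count more than non-goal shot attempts.

    The default weights are illustrative assumptions.
    """
    return goal_weight * goals + attempt_weight * (attempts - goals)


def wsf_pct(att_for, goals_for, att_against, goals_against, **weights):
    """weighted-Shots-For%: the team's share of all weighted shots."""
    wf = weighted_shots(att_for, goals_for, **weights)
    wa = weighted_shots(att_against, goals_against, **weights)
    return 100.0 * wf / (wf + wa)


def corsi_for_pct(att_for, att_against):
    """Raw Corsi-For%: share of all shot attempts, goals not weighted."""
    return 100.0 * att_for / (att_for + att_against)


# Made-up totals: equal attempts, but one side scored twice as often.
example_corsi = corsi_for_pct(50, 50)
example_wsf = wsf_pct(50, 4, 50, 2)
```

With equal shot attempts, raw Corsi calls the game even, while the weighted version tilts toward the team that converted more of its attempts.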

Long Change

The third panel featured bloggers covering the three local NHL teams,
each presenting a recent project or story from their perspective.

Steve Burtch of Pension Plan Puppets looked at the effects
of mid-season coaching changes on teams.
With stable rosters, these teams become natural experiments.  Under Peter Horachek, Toronto is posting a much
higher possession rate, which looks to be a result of the coach finding more
optimal roles for his players.

Emmanuel Perry of The 6th Sens tracked an entire
season of blue line entry and exit data for the Ottawa Senators, trying to answer the Taylor Hall
question of “what makes your Corsi better.”  Perry tracked over 400 events in each Senators game and noticed some interesting links
between the eye test and a given player’s numbers.  He tried to explain why certain players can pass the eye test while still posting poor numbers, with his main focus being young defender Cody Ceci.

Andrew Berkshire of Habs Eye on the Prize had two main takeaways from his story on “P.K. Subban: Defensive Liability?”  Subban, like other offensive defencemen, has previously been classified as a defensive liability.  Berkshire theorized that this is usually a result of the
difficult job of colour commentators, who have to make split-second analyses
that will appeal to a wide audience.
They don’t see all the good that these defencemen do unless they are
paying close attention to the game all the time.  These broadcasters are quick to pick
up on that one bad turnover or bad play, which pushes common perception to fall in line with pre-existing tropes.  Berkshire then went deeper into
Subban’s statistics and showed that, despite the narrative that he’s a “defensive
liability,” he’s one of the best defensive defencemen in the league, and he is even being
used to prop up the Habs’ weaker players.
Berkshire ended by saying that good offence intrinsically makes you
defensively good.

Third Period

The last talks of the day were from some of the longest-tenured voices within the hockey analytics community, who spoke about their projects and views on the current state of affairs.

Tom Awad had a similar talk to Rob Vollman’s, looking back at what
we have done, how we are approaching the limits of what we can find with RTSS
data, and how we need to start answering the “whys” but cannot do so until we
have SportVU-type data.  Awad praised all tracking projects currently underway, as they are helping us get the data we need to dig deeper into analyzing hockey.

Timo Seppa talked about his recent experience in applying
tracking and microdata at the NCAA level.
He has been working with Quinnipiac, looking at their blue line entry
and exit data, and found some interesting results.  He found that Quinnipiac’s top two right-handed
defencemen looked best initially, but their success seemed to be a result of the systems they were employing.  His takeaway was that it
is important to make sure you frame and ask the right questions in analysis.  He also found it frustrating to see players
enter the zone with possession and then dump it.

The final talk came from Michael Schuckers, who looked at how
teams perform at drafting compared to “the market” (aka CSS rankings).  Schuckers found that teams can outperform
CSS rankings, but acknowledged the work that we do here at Canucks Army, where we
demonstrated that even scouting teams are not necessarily better than simple
statistics.  Schuckers emphasized that
you should be able to draft successfully in the first round, and that the value of strong scouting should come in the 2nd round or later.

Conclusion

It was a great conference for learning a lot of new things and
putting faces to the Twitter handles.  I want to thank Michael Schuckers and Shirley Mills for hosting the event and for inviting me to speak.  You will be able to see the PowerPoint slides for all of these talks here.  The next hockey analytics conference sounds
like it will be in Washington DC in April, and then there might be one at the entry draft in July.

  • Ruprecht

    This is great, thanks for the summary. Pretty cool they have the presentations up there too. I have a terrible mind for stats unfortunately, but I love this stuff, so it’s good when people smarter than me translate the numbers to words. Very cool that you got to go and speak at this, nice work!