Photo Credit: David Banks - USA TODAY Sports

Prospect Statistics Have Moved Far Beyond Looking at Point Totals

I, like many hockey nuts across Canada, have a subscription to The Athletic. It’s one of the few sports publications I’ve been willing to pay for, and that has a lot to do with their lineup of writers. Analytically oriented writers Tyler Dellow and Dom Luszczyszyn played a major role in that, not to mention that the addition of Corey Pronman allowed me to discontinue my ESPN Insider subscription. Another major factor was the hiring of Justin Bourne, whose most recent gig was as a video coach for the AHL’s Toronto Marlies.

What got me excited about the Bourne addition is that his focus area has previously been on scouting and systems analysis, which is an area that I’d like to improve in. I was genuinely obsessed with the systems articles that he used to produce for The Score before taking the job with the Marlies in 2015.

While I’m still looking forward to what Bourne has to say about hockey strategies and video analysis for both NHL and non-NHL players, one of his recent articles for The Athletic suggests to me that his knowledge of prospect statistics and analytics is lagging a bit behind. The piece, titled “When Evaluating Players, Gauging Opportunity They’ve Been Given is Crucial,” laments the lack of available statistics for prospects beyond raw points. (Paywall, obviously.)

What you’ll almost always look at first, is points. As you move down the ladder away from the NHL, it makes the most sense to start there, as they’re basically all we have to look at. For better or worse, we’re stuck with points when it comes to player evaluation.

As someone who has spent the last couple of years developing statistical models for prospects, and as someone who learned from predecessors on this site who were doing it years before me, this strikes a chord with me. Essentially, it feels like I’m paying for an article to tell me that the work that I’ve been doing for the past two years doesn’t exist, which, needless to say, is an odd feeling.

Without a doubt, prospect statistics have moved far beyond looking at mere point totals, and they did so quite a while ago. Today I’ll be looking at the history of prospect analytics: how far they’ve come, and where they’re going. I realize that the regular readership at Canucks Army probably doesn’t need any convincing of this, as they’ve seen endless posts on prospect data already. But apparently that hasn’t extended all across the league yet. I don’t know how many people in the industry are on the same page as Justin Bourne, but for all those folks in the hockey community who aren’t aware, it’s time to bring you up to date on the world of prospect analytics.

The Foundation

For those, like myself, that came onto the hockey analytics scene long after it started, it seems like a great many of the early advances were achieved by a small number of people. Names like Benjamin Wendorf, Rob Vollman, Iain Fyffe, and Gabriel Desjardins dominate the oldest references, and prospect analysis is no different.

One of the earliest forms of statistical analysis of prospects is The Projectinator, a model created by Iain Fyffe for Hockey Prospectus way back in 2009. In the original post, Fyffe began to explore the idea of era-adjusting Canadian Major Junior statistics. Era adjustments had been around for quite a while. Every hockey fan at some point gets into a debate about how many points Wayne Gretzky would score nowadays, or how many points Sidney Crosby would have scored in the mid-1980s. Era adjustments – performed in their simplest manner by dividing the current goals per game rate by that of any desired era, and applying the resulting coefficient to the player in question’s point rate – allowed us to compare the 1980s and the 2010s objectively. The same can be done for prospects, so long as you can capture the data.
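To make the simplest version of an era adjustment concrete, here is a minimal sketch in Python. The function name and the goals-per-game figures are made up for illustration; they are not Fyffe's numbers, and real adjustments involve more than a single ratio.

```python
def era_adjust(points_per_game, source_era_gpg, target_era_gpg):
    """Scale a point rate from one scoring environment into another.

    The coefficient is the target era's league goals-per-game rate
    divided by the rate of the era the player actually played in.
    """
    coefficient = target_era_gpg / source_era_gpg
    return points_per_game * coefficient


# Hypothetical league scoring rates (goals per game), for illustration only.
GPG_MID_1980S = 8.0
GPG_TODAY = 5.5

# A 2.0 points-per-game season in the high-scoring era, expressed in
# today's scoring environment.
adjusted = era_adjust(2.0, source_era_gpg=GPG_MID_1980S, target_era_gpg=GPG_TODAY)
print(round(adjusted, 3))
```

Swapping the two rates runs the adjustment the other way, which is how a modern season would be restated in a 1980s scoring environment.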

Up and Coming became a regular series on Hockey Prospectus, and Fyffe continued to evolve his ideas. Over the course of that first year, Fyffe explored concepts that would eventually become integral to prospect analysis, such as height bias, within-year age, and using comparables, as well as taking numerous cracks at projecting players, and even goaltenders. The Projectinator has reappeared numerous times over the years, and had its very own chapter in Rob Vollman’s guide to hockey analytics, Stat Shot (which you absolutely should own a copy of).

A couple of years later, Gabe Desjardins, then of Arctic Ice Hockey, used a cohort of OHL players born between 1965 and 1985 to estimate approximate likelihoods of success for Mark Scheifele and Sean Couturier. He also brought visualization to the table with a wonderful chart.

The Legacy of Canucks Army

Canucks Army has been at the forefront of prospect analysis for quite some time, going back long before I began to write here. Much of the original work was done by former editor Rhys Jessop (now employed by the Florida Panthers). His work was built upon by former Jets Nation site editor Garret Hohl (now of HockeyData Inc.), and their work became the foundation for PCS, the brainchild of Cam Lawrence and Josh Weissbock (now both of the Florida Panthers), and the predecessor of our current cohort model, pGPS.

The 2014 draft was huge for the Canucks, who were picking 6th overall, their highest draft slot since selecting Cody Hodgson 10th overall in 2008. There was a lot on the line for the Canucks, and the members of Canucks Army at the time were highly involved in their own search for the best prospect available at that position. It was at this time that Jessop applied some fairly basic statistical principles to publicly available data and ventured into the realm of adjustments.

Moreover, Jessop introduced age adjustments as an attempt to account for the apparent discrepancy in ability between the average October-born player (very early for their draft year) and August-born player (very late for their draft year). In leagues like the CHL, which span approximately five years of ages and include major development years, the differences within a single calendar year can be large. They can also be quantified.

After exploring the extreme inefficiency of the Canucks drafting under Ron Delorme, and after demonstrating that a made-up summer intern constrained by the simplest of rules could, in theory, outperform many of the NHL’s scouting departments, Jessop kicked things up a notch, following Fyffe on the path of assessing players based on what similar players had achieved in the past – but with a new degree of accuracy.

Cohorts are used in all manner of social sciences, predicting future behaviour based on studies of what has happened before. Applying such a concept to sports seems like a no-brainer now, but at the time it was truly groundbreaking. Cam Lawrence, along with enigmatic programmer and prospect junkie Josh Weissbock, turned Jessop’s experiments into a full-scale statistical model, complete with Euclidean math and testable results. The system, in addition to the promise of what it could become in the future, opened the door for them to work for an actual NHL front office, and go from writing about prospects to actually playing a role in drafting some of them.

Modern Prospect Analytics

It’s been a couple of years since we lost PCS, and I’ve done my best to fill that void with my own prospect cohort model. pGPS has been used hundreds of times on this website since being unveiled in the Spring of 2016, including in the Nation Networks prospect rankings for both the 2016 and 2017 drafts, and for the vast majority of prospect articles published here since then.

pGPS barely resembles the last public incarnation of PCS, or the first public version of pGPS for that matter, apart from the very foundation of the concept and the use of Euclidean distance. There is nothing in the formula that hasn’t had adjustments applied to it, and what used to be a three-factor formula now contains nearly three times that many, all with various weights applied to them.
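For readers curious what a cohort model built on Euclidean distance looks like in principle, here is a rough Python sketch. It is not the pGPS or PCS formula: the three features (age, height, era-adjusted points per game), the weights, the distance cutoff, and the toy history are all assumptions chosen purely for illustration.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature tuples."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def cohort_success_rate(prospect, history, weights, max_distance):
    """Share of sufficiently similar historical players who 'made it'.

    history is a list of (features, became_nhl_regular) pairs.
    Returns (success_rate, cohort_size); the rate is None for an empty cohort.
    """
    cohort = [
        made_it
        for features, made_it in history
        if weighted_distance(prospect, features, weights) <= max_distance
    ]
    if not cohort:
        return None, 0
    return sum(cohort) / len(cohort), len(cohort)


# Hypothetical features: (age, height in cm, era-adjusted points per game).
history = [
    ((17.5, 183.0, 1.10), True),
    ((17.6, 185.0, 1.05), False),
    ((17.4, 180.0, 1.20), True),
    ((18.3, 190.0, 0.40), False),
]
# Hypothetical weights that put each feature on a comparable scale.
weights = (4.0, 0.01, 25.0)

rate, size = cohort_success_rate(
    (17.5, 183.0, 1.15), history, weights, max_distance=0.6
)
print(size, rate)
```

The weights do double duty in a sketch like this: they rescale features measured in different units and they express how much each one should matter when deciding who counts as a comparable.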

But I think the biggest advancement I’ve made has been visual. The bubble charts generated by the pGPS program have proven quite popular, as have others such as year-to-year progression charts and cohort line distribution graphs.

Of course, pGPS is just one way we look at prospects here. Our arsenal of prospect statistical models now also contains SEAL adjusted statistics (adapted and improved from Garret Hohl’s original concept, with his permission and guidance), and the use of game sheets for a variety of purposes, including on-ice data, ice time estimation, rate stat estimation, teammate adjustments, quality of competition, and more.

And that’s just us at Canucks Army. We can’t talk about modern prospect analytics without making reference to prospect-stats.com, currently the most comprehensive free database for prospects from the CHL, the USHL, and the AHL. Then there are those who are out there performing research on prospects en masse, like Namita from Hockey Graphs, or on a more individualistic basis, like Josh Khalfin from Blue Seat Blogs.

Assessing and Accounting for Opportunity

While it should be incredibly clear at this point that prospect statistics have evolved far beyond mere points, we haven’t yet addressed the other point of Bourne’s article: the need for assessing and accounting for opportunity.

I agree with everything that Bourne says about the nuances of opportunity and how they relate to fluctuating point totals, which include but are not limited to:

  • Ice time,
  • Quality of teammates,
  • Power play time,
  • Luck.

While he’s also right about some non-NHL stats being notoriously unreliable (or outright absent), we do have the ability to estimate each of the above factors, again using pretty simple statistical techniques.

Game Sheet Analytics, which I touched on above, are something that Dylan Kirkby (our resident programmer) and I were able to get working just before the NHL draft this year, but others have been using them for quite a while. Prospect-stats.com is entirely based on them, as was the now-defunct chl-stats.com.

The value of game sheets is in the fact that, for many leagues, they list the players who were on the ice for each goal that was scored. In small samples, this is incredibly volatile and largely influenced by extraneous variables, but over the course of a season, you can get some pretty reliable conclusions.

The distribution of situational scoring is one of the main factors included in SEAL adjustments (the ‘S’ is for situational), and the system has built-in adjustment coefficients for each type of point (from 5-on-5 goal to 5-on-4 secondary assist) that are based on which rates correlate strongly with eventual NHL production.

Ice time may be estimated by determining the percentages of team events (goals for plus goals against) a player was on the ice for, and multiplying that by the amount of available ice time. This can be done for 5-on-5 time, as well as for special teams.
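The estimate described above boils down to one line of arithmetic. Here is a minimal Python sketch; the function name and the sample numbers (400 team goal events, 120 of them with the player on the ice, 48 available 5-on-5 minutes per game) are invented for illustration.

```python
def estimate_ice_time(on_ice_events, team_events, available_minutes):
    """Estimate a player's ice time from goal-event participation.

    on_ice_events: goals for plus goals against with the player on the ice.
    team_events: the team's total goals for plus goals against in that situation.
    available_minutes: ice time available in that situation (per game or per season).
    """
    if team_events == 0:
        return 0.0  # no events recorded, so there is nothing to estimate from
    return on_ice_events / team_events * available_minutes


# Hypothetical 5-on-5 numbers over a season.
estimated = estimate_ice_time(
    on_ice_events=120, team_events=400, available_minutes=48.0
)
print(round(estimated, 1))
```

The same function works for special teams by passing power play or penalty kill events and the minutes available in those situations.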

Frequent linemates can also be found by keeping track of how often two players are on the ice for events together. We can also then determine things like goal shares and point rates for when two players are together and apart. The resulting WOWY charts have seen plenty of use on this site, including during our recent Top 20 Prospect Rankings series.
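A with-or-without-you (WOWY) split can be sketched from the same kind of goal-event data. The Python below assumes a simplified game sheet format, a list of (goal_for, players_on_ice) pairs for one team, which is my own invention for the example rather than the layout any league actually publishes.

```python
from collections import defaultdict


def wowy_goal_shares(goal_events, player_a, player_b):
    """Goal share (GF / (GF + GA)) for two players together and apart.

    goal_events: iterable of (goal_for, on_ice) pairs, where goal_for is True
    for a goal scored by the players' team and on_ice is a set of that
    team's players on the ice at the time.
    """
    tallies = defaultdict(lambda: [0, 0])  # state -> [goals_for, goals_against]
    for goal_for, on_ice in goal_events:
        a_on, b_on = player_a in on_ice, player_b in on_ice
        if a_on and b_on:
            state = "together"
        elif a_on:
            state = "a_without_b"
        elif b_on:
            state = "b_without_a"
        else:
            continue  # neither player involved in this event
        tallies[state][0 if goal_for else 1] += 1
    return {state: gf / (gf + ga) for state, (gf, ga) in tallies.items()}


# Hypothetical goal events for one team.
events = [
    (True, {"A", "B", "C"}),
    (True, {"A", "B"}),
    (False, {"A", "B"}),
    (True, {"A", "C"}),
    (False, {"A", "C"}),
    (False, {"B", "D"}),
    (True, {"C", "D"}),
]
shares = wowy_goal_shares(events, "A", "B")
print(shares)
```

With only a handful of events the shares swing wildly, which is the small-sample volatility mentioned earlier; the numbers only start to mean something over a full season.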

It doesn’t stop there though, as game sheet data can be manipulated further to estimate quality of teammates, quality of competition, rate stats, and more.

But if you’re the type that simply can’t shake the skepticism involved in estimating stats like this, there’s a solution for that too.

HockeyData Inc. and the Future of Prospect Analysis

Enter HockeyData Inc., the creation of the aforementioned Garret Hohl and business partner Cole Gawenda. The two have created an analytics company that prominently features prospect stats, with a team of employees who watch endless amounts of hockey games and track all manner of minutiae. That includes actual time on ice, on-ice shot differentials (Corsi), and shot and pass location data that allows for expected goal and assist statistics that go far beyond what is publicly available for the NHL, let alone leagues below it.

HockeyData’s collection of statistics is unfortunately unavailable for public consumption. So time-consuming is the work involved that it only makes sense to charge money for it, and so valuable is the output that plenty of teams are willing to pay. News of a contract with the Washington Capitals made the rounds shortly before the summer, but the company had plenty of NHL clients before then, and they’ll continue to pick more up as time goes on.

Because HockeyData’s information isn’t freely available, perhaps it doesn’t really fall into the category that Bourne was discussing in his article – hockey fans googling players that their teams have picked up, or simply shown interest in. Even without proprietary information like that though, the research and data that is currently out there for non-NHL prospects is pretty astounding, and it’s not something that you should turn a blind eye to. While some may not be satisfied by its lack of certainty, the amount of context that it can provide is undeniable, and it helps to answer a lot of questions and concerns that Bourne put forth. It’s time to get on board with this stuff, because it’s come a very long way, and it’s only getting better year after year.

  • Ranger2k2

    Great article Jeremy. I originally thought maybe Justin has a gag order from the Maple Leafs not to discuss how they track amateur players but after reading the article I don’t think he was criticizing or belittling the work that Canucks Army and others have done in this field but singling out ice time as a huge unknown.

    I can’t understand why any NHL team wouldn’t be looking at the type of models that Canucks Army and others have put together to help them understand and draft players better. This isn’t meant to be a shot at analytics or the old “hockey is played on the ice not in Excel” but what drives coaches to play players in certain situations can only be understood watching and following the teams. I think that the cream usually rises to the top but for instance Bo Horvat was considered a penalty killing ace and a two-way player coming out of junior. If you look at his London Knights team from his final year of junior you can see that the points put him as the 3rd line center. In the NHL Bo has been a way better offensive player than defensive player and it goes against how he was used in junior. So were the Knights putting Bo in the best position to succeed or what was best for the team at that time? I’d say the Knights used Bo in the position that was best for the team and not best for his draft ranking. Maybe Jeremy can correct me on this but that is how I read Bourne’s article.

    • I don’t pretend to know Justin’s mindset at all, but I didn’t take his article as criticizing or belittling Canucks Army at all; I took it as he just wasn’t aware that some of these statistics and models existed, which is fine! I tried to stay away from criticizing him for that, and instead tried to stay on the side of being informative. I hope it came across that way.

      My experience with the Hunters and their approach to running the London Knights is that they always do what’s best for the team. It’s not their job to inflate the numbers of individual prospects. One of the things that makes them so successful (and, in my opinion, makes their players develop so well) is that they get their players to buy into that and accept roles, even if they are considered stars. We saw that with Horvat, and we saw it again with Olli Juolevi.

    • rakish

      I toy around with this myself. The problem with applying it to previous drafts is that you end up manipulating the model to get really good results, so I don’t do that. I’ve kept track the last 4 years, mostly drafting against Buffalo, and have found that CSS and McKenzie predictions are terrible (scouting has really bad results), Button had a great draft in 2014 (he still has 5 prospects from Buffalo’s draft), but not so much since, Pronman has really bad results. I’m really happy with my results (outside of 2015 anyway), but you got to take my draftyear plus 1 valuations as fact, which you won’t do for a few years. Canucks army played this season (the downside of publishing a large enough draft list is that I will keep track of your results), so we’ll see how their projections age. All the gory details are at 45b.us

  • copey

    It is interesting to look back at the stats for the 2011 draft: Scheifele vs. Couturier. Absolutely everyone denigrates the Jets picking Scheifele over Couturier, with the stats backing them up.

    But guess what!

    Couturier: NHL Totals 416Gm 70G 121A 191P 0.46P/Gm
    Scheifele: NHL Totals 306Gm 90G 137A 227P 0.74P/Gm

    Scheifele has clearly been the better player on the ice, even with fewer games.
    So the eye test wins that one, though deployment and linemates need to be figured in.

  • EddyC

    Don’t you think that when a prospect enters a team like the Canucks and is asked to work on a specific part of their game that their stats could very easily suffer for a period of time until he becomes better? When I was younger I became a really good snooker player and as I became better and added new refinements my game always dropped until I could master what I was trying; same with my golf game. It is no different for these kids. Juolevi has been asked to modify his game or body and until he figures it out his stats might suffer. Juolevi didn’t become a crappy player over the summer.

    • copey

      yeah; I bet the 20+ pounds of growth has wrecked his mobility for a while, until he gets used to it; and yes, he got turnstiled bad many times where he looks dead in his tracks; since he can’t play in the AHL, he’s way better off in Finland, hanging with Salo; d-men take time….

      • EddyC

        I totally agree with him going with Salo, when Salo was out of our lineup we hardly won any games. His slap shot was key to our success on the PP. If Salo can get his slapper close to his we are going to be very happy with Juolevi

  • TrueBlue

    Cool synopsis. And with regards to the outside view of where prospects analytics are at, take it as a compliment that you’re ahead of the curve on the bleeding edge! 🙂

  • NeverWas

    So are these models just done via excel data analysis? (multiple regression, non-linear trendlines, weighted averages, probability, etc)

    Super interested in the actual process !