Draft Analytics: Unveiling the Prospect Cohort Success Model

Josh W
May 26 2015 11:00AM

Prospects and amateur scouting are a new frontier within hockey analytics. Numbers can highlight some large market inefficiencies that teams could exploit if and when they start to advance their knowledge in this niche.  The numbers on prospects, especially when combined with the knowledge of scouts, can tell you a lot more than some of the largest mainstream scouting outlets.

In Money Puck's recent series of posts, he has touched on a proposed "re-tooling on the fly" method to help the Canucks sell off veterans for picks, pick up free agents to replace them, and then use those picks on prospects who could in turn help the Canucks in the future.  

You might have noticed him talking about PCS, or "Prospect Cohort Success." In this post, we will take a closer look at this technique for analyzing prospects.

Translating Performance

One of the most popular tools for analyzing prospects is NHL Equivalency (NHLE) numbers, which convert a prospect's junior Points-Per-Game number to what they would be expected to score if they were to jump to the NHL the next year.  These translation numbers change all the time, are updated by hockey stats pioneers like Rob Vollman, and have been cited by nearly every hockey stats blogger at one time or another.
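The NHLE idea can be sketched in a few lines. This is only an illustration of the general approach described above (junior points per game, scaled by a league translation factor, projected over an NHL season); the function name, the 82-game season length, and the factor of 0.30 are assumptions made for the example, not published values.

```python
def nhle_points(points, games_played, translation_factor, nhl_games=82):
    """Project a prospect's scoring over an NHL season.

    translation_factor is the league-specific multiplier; the value used
    below is made up for illustration and is not a published NHLE factor.
    """
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    points_per_game = points / games_played
    return points_per_game * translation_factor * nhl_games


# Hypothetical prospect: 90 points in 60 junior games, hypothetical factor 0.30.
projected = nhle_points(90, 60, translation_factor=0.30)
print(f"Projected NHL points: {projected:.1f}")
```

Note that the whole projection hinges on a single multiplier per league, which is where several of the limitations discussed next come from.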

Having worked with prospect analytics for the past two years, I have found a number of limitations in NHLEs.  For example:

  • NHLEs assume the player will make the jump to the NHL, but that is not often the case.  Prospects often play through intermediate leagues, where development and opportunities can shape their future as players.
  • NHLEs are based on a linear regression method, which pulls every player's projection towards the mean; actual player development rarely works that way.
  • NHLEs have to be updated as scoring and talent within leagues changes. This means the NHLE translation factors you would use today for the OHL do not necessarily reflect the same league a decade ago.
  • The issues in the translation factors are even more exaggerated when looking at leagues that are affected by smaller sample sizes.  
  • Seventeen year olds who play in professional leagues in Europe have different roles than top 17 year olds in North America.  They are often depth players without special teams time, which lowers their scoring relative to their NA peers.  But being able to play in a pro league is itself a strong signal that the player is likely to become a regular NHLer.  This will not be reflected in NHLEs, as they are not designed to consider talented under-age players in smaller roles.
  • Last, but not least, not all leagues have a translation factor.  Sure, ECHLers are unlikely to play in the NHL, but it is still useful to know their probability of success, as some of them have gone on to the big leagues.

Prospect Cohort Success

To address many of these issues, I've been working on a tool called "Prospect Cohort Success" (PCS) with many of my peers here at the Nations Network (thanks to Money Puck, Garret Hohl, and Rhys Jessop for the work they have done so far).  While the model is complete, a publicly available and viewable tool is far from finished and is still in Beta.  When it is working in an acceptable fashion it may be opened to the public, similar to the Hockey Projection Project.

For now, we have had many questions about the PCS numbers we've cited in past articles, and our use of PCS is only going to increase as the draft gets closer. So it's important to address the common questions: what is PCS, and what does it mean?

The idea behind PCS is that you can take a player and generate a list of comparable players (aka: "cohorts").  Knowing these comparable players, we can look at their success in other leagues beyond junior to estimate the likelihood of a current prospect becoming an NHLer, and what kind of player they could become.  

We are currently using three main attributes which we have found to be predictive of a prospect's success: age, scoring rate, and height.  Height is a bit of a chicken-and-egg scenario: bigger prospects are typically given more chances to play in the NHL and attract more attention at the draft, but they also have higher success rates.  If size bias exists in amateur scouting, it likely persists equally among pro scouts, general managers, and coaches.  Weight has not been found useful, most likely because it can change drastically up or down over a prospect's development.
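To make the cohort idea concrete, here is a minimal sketch of selecting comparable players on those three attributes. The article does not describe the actual matching rules, so the tolerance windows, the same-league restriction, the `PlayerSeason` type, and the `find_cohort` function are all assumptions for illustration, and the player data is invented.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    name: str
    league: str
    age: float              # age during the season, in years
    points_per_game: float
    height_cm: float


def find_cohort(target, history, age_tol=0.5, ppg_tol=0.10, height_tol=3.0):
    """Return historical player-seasons that resemble `target`.

    The tolerance windows are illustrative assumptions, not PCS's real values.
    """
    return [
        p for p in history
        if p.league == target.league
        and abs(p.age - target.age) <= age_tol
        and abs(p.points_per_game - target.points_per_game) <= ppg_tol
        and abs(p.height_cm - target.height_cm) <= height_tol
    ]


# Invented data, for illustration only.
history = [
    PlayerSeason("Player A", "OHL", 19.1, 1.30, 183),
    PlayerSeason("Player B", "OHL", 19.0, 0.70, 183),  # scoring rate too low
    PlayerSeason("Player C", "OHL", 17.2, 1.32, 182),  # too young
    PlayerSeason("Player D", "WHL", 19.0, 1.31, 183),  # different league
    PlayerSeason("Player E", "OHL", 18.8, 1.25, 185),
]
target = PlayerSeason("Prospect X", "OHL", 19.0, 1.28, 183)
cohort = find_cohort(target, history)
print([p.name for p in cohort])
```

Fixed windows are the simplest possible choice; a distance-based similarity score would be another reasonable way to do the same job.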

While the PCS method is league agnostic and could be used for European or AHL depth players, future success is currently defined as playing a minimum of 200 games in the NHL. The 200 game mark is used as the minimum for "NHL success" based on previous work by TSN's Scott Cullen.

The number of players in the cohort who have reached the 200 game threshold determines the percentage of cohorts who have succeeded, or the Prospect Cohort Success percentage (PCS%) of the player.  Additionally, looking at the NHL points per game average of these cohorts will give you their PCS points per game - a ballpark estimate of how effective comparable players became at the NHL level.
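As a rough sketch of that arithmetic: given a cohort, PCS% is the share of members who reached 200 NHL games. For PCS points per game, the sketch below assumes the average is taken over the members who reached the threshold; the article does not spell this out, so treat it as one possible reading. The function name and the cohort data are invented, with numbers chosen only so the arithmetic is easy to follow.

```python
NHL_SUCCESS_GAMES = 200  # success threshold described in the article


def pcs_summary(cohort):
    """Compute (PCS%, PCS points per game) for a cohort.

    cohort: list of (nhl_games, nhl_points) tuples, one per comparable player.
    Assumption: points per game is averaged over successful members only.
    """
    if not cohort:
        return None, None
    successes = [(gp, pts) for gp, pts in cohort if gp >= NHL_SUCCESS_GAMES]
    pcs_pct = 100.0 * len(successes) / len(cohort)
    if not successes:
        return pcs_pct, None
    pcs_ppg = sum(pts / gp for gp, pts in successes) / len(successes)
    return pcs_pct, pcs_ppg


# Invented cohort of eight players: (NHL games played, NHL points).
cohort = [(400, 200), (250, 100), (150, 30), (0, 0),
          (10, 1), (0, 0), (0, 0), (35, 5)]
pct, ppg = pcs_summary(cohort)
print(f"PCS% = {pct:.1f}, PCS Pts/GP = {ppg:.2f}")
```

Here two of the eight comparables cleared 200 games, so one in four "succeeded," and those two averaged 0.45 points per game.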

What these values tell us is that we can look at a prospect of any age and in any league, and estimate their likelihood of graduating to the NHL as well as their most likely level of production as a player. 

As an example, here is the PCS% and PCS Pts/GP for Cole Cassels in his 19 year-old OHL season. Given his age, size, and production, he currently has a PCS% of 23.2 percent and a PCS points per game value of 0.45.  This means that players who have performed like Cassels as a 19 year old OHLer have had NHL success in roughly 1 of every 4 cases, and have managed an average of 0.45 NHL Pts/GP.  These numbers are actually quite high for a player still in the CHL.  The list of Cassels' cohorts includes:

[Image: list of Cole Cassels' cohorts]

    Application 

    One of the main applications of PCS is to look at the PCS% of draft eligible players to see how likely they are to succeed based on how their cohorts have succeeded.  This can be viewed as a good gauge of how "safe" a draft selection is.  Another use is to track how a prospect's development is progressing.  

    Cole Cassels

    Cole Cassels is a player that Canucks fans are currently quite excited about due to his legendary 2015 playoff run.  Rhys Jessop completed some analysis on him, and we can extend that analysis to see how his development has gone over the years:

    [Chart: Cole Cassels]

    (Credit MoneyPuck)

    Cole Cassels' development is moving in a positive direction, as he has legitimately become better each season.  Next year in the AHL will be important, as he will have to continue to have success in a higher league.

    Connor McDavid

    [Chart: Connor McDavid]

    Connor McDavid is a special case because he is legitimately a generational talent.  He has been so good at such a young age it becomes difficult to find talent similar to him.  Those who are similar have all gone on to play in the NHL and you have definitely heard of them.  It's also good to see that his ceiling continues to grow.

    Tyler Johnson

    [Chart: Tyler Johnson]

    Tyler Johnson is a player who has been much talked about these playoffs.  He is playing a key role with the Tampa Bay Lightning and scoring at nearly a point a game, yet, as you may have heard, he went undrafted. 

    When looking back to his WHL career (ages 17-20), it is clear that Johnson was not statistically likely to become an NHL regular due to the lack of scoring and his size.  The numbers agreed with the scouts - but both were wrong as he was a clear outlier of player development.  Tyler Johnson shows how development and luck/opportunities are important in making the final jump to the big leagues.

    An additional point to note: every time the PCS percentage jumps, we call that a "graduation bump." When a prospect moves to a tougher league (e.g. from the CHL to the AHL), they are more likely to become an NHL regular, as professional hockey management and coaches have determined they are worthy of a roster spot.  Scoring only adds to their likelihood of graduating.

    Conclusions

      This has been your introduction to Prospect Cohort Success - a tool we are still working on to help identify prospect success rates and ceilings.  There is still much work to do with the large amount of data we have.  

      Our near-term future work is to look at era-adjusting every league and compare the adjusted numbers against raw scoring rates.  Additionally, we are going to include more features, such as Quality of Teammates (i.e. with Goals Created), and maybe others such as PIMs (which Iain Fyffe found to be a predictor of success).
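For readers unfamiliar with era adjustment, one common and simple form scales a player's scoring rate by the ratio of a baseline scoring level to the scoring level of the league-season he played in. The sketch below shows that general idea only; it is not the adjustment PCS will use, and the function name and all numbers are hypothetical.

```python
def era_adjusted_ppg(player_ppg, league_goals_per_game, baseline_goals_per_game):
    """Scale a scoring rate to a common baseline scoring environment.

    league_goals_per_game: average total goals per game in the player's
    league-season. baseline_goals_per_game: the reference level that all
    seasons are scaled to. Both inputs here are hypothetical.
    """
    if league_goals_per_game <= 0:
        raise ValueError("league_goals_per_game must be positive")
    return player_ppg * baseline_goals_per_game / league_goals_per_game


# Hypothetical: 1.20 Pts/GP in a season averaging 8.0 goals per game,
# rescaled to a baseline of 6.0 goals per game.
adjusted = era_adjusted_ppg(1.20, league_goals_per_game=8.0,
                            baseline_goals_per_game=6.0)
print(f"Era-adjusted Pts/GP: {adjusted:.2f}")
```

With an adjustment like this, a cohort could be matched on adjusted rather than raw scoring rates, so that players from high-scoring and low-scoring seasons are compared on the same scale.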

      If this area interests you, we welcome and encourage any feedback!

        I'm just a hockey statistics analyst with a focus on hockey prospects. I currently live in Victoria, BC. In my free time I am either programming, writing, working on CHLStats.com or working out. While I spend too much free time in front of a computer, I enjoy rowing, running and weight lifting.
        #1 pigeonbutt
        May 26 2015, 02:03PM

        Ambitious, meaningful off-season post. Appreciate how you threw Johnson in there to demonstrate that exceptions exist with every model. Good stuff.

        #2 PB
        May 26 2015, 02:36PM

        This is the best and most convincing explanation I've seen so far of the use of the size variable -- here you make it clear why even if success may be intensified by the bias towards taller players by the hockey establishment, it doesn't diminish the overall point that it remains a predictor of success at the benchmark of games played (not necessarily quality of that play). The cohort model you've developed seems like a much more convincing and rigorous version of something that we already tend to hear anecdotally in scouting reports, that prospect x resembles NHL player y and yet that is often based on little more than gut feelings. Good luck with this and looking forward to seeing the full working model.

        #3 akidd
        May 26 2015, 06:05PM

        i'm not a stat guy by any means but this looks good. really good. i predict you will all become rich and famous.

        i'd be curious to see a comparative chart without the height factor.

        #4 wjohn8
        May 26 2015, 06:52PM

        Nice work on this post. Clear and informative for us non-stats guys.

        #5 sebguru
        May 27 2015, 12:34AM

        This is a more balanced post regarding statistical analysis, as there are factors that are impossible to model (at least for now) such as luck, development support, etc. Kudos guys.

        As you pointed out there are always outliers from both sides of the spectrum. The Tyler Johnsons and the Connor McDavids. But there are others at the other end too such as the Alexandre Daigles of the world. I know that we always talk about sample sizes, but sometimes they also bring these outliers into the mix that skew the average. Ideally they should balance and cancel each other out, but reality sometimes doesn't follow logic.

        You are moving in the right direction, acknowledging the shortcomings, and going back again to revisit the assumptions, etc.

        #6 Spiel
        May 27 2015, 10:02AM

        Good work!

        Something that jumps out to me is that all of the cohorts you have listed for Cassels are from 2004 or earlier.

        For me that raises the question of comparing across eras. Is the PPG adjusted for the era? Otherwise, doesn't your analysis fall victim to the same problem you stated for NHLE analysis that:

        "NHLEs have to be updated as scoring and talent within leagues changes. This means the NHLE translation factors you would use today for the OHL do not necessarily reflect the same league a decade ago."

        Maybe I am missing something obvious, but all the cohorts you have listed for Cassels are at least a decade old, so why is it valid to use pts/gm across eras in this case but not in the NHLE case?

        #7 Reuben
        May 27 2015, 05:44PM

        Really cool stuff guys, with each article I become more worried that you guys are going to get snapped up by a pro scouting department before you finish this series of articles.

        Seriously though, I'm enjoying this next level look at prospect potential, it must be a ton of work, well done!

        #8 The Last Big Bear
        May 27 2015, 11:29PM

        This methodology seems somewhat similar to the Viper(?... Vapor? something like that?) method.

        Which is a reasonable approach, but I'd expect this model to fall victim to the same kind of long-term trends that scoring-based methods do.

        For example, if more young high-end prospects choose the college route, or if young guys start getting fewer and fewer opportunities in the European elite leagues, it would still skew the probability of making the NHL for a kid playing in those leagues today. Just as surely as an increase or a decrease in scoring in those leagues would skew a point-production-based model.

        Just an observation.

        Keep up the good work!
