On Prospect Evaluation and Uncertainty

I had a conversation with Harrison Mooney last week. Well, it was more of a one-way discussion. I smiled and nodded as Harrison talked at me about how he thought I evaluated prospects, and the weaknesses he thought my approach had.

Despite being drunk at the time (it was his birthday), Harrison raised some really valid points and criticisms – points and criticisms that I thought were relevant and understandable enough that I needed to address them. To paraphrase him: “Rhys, I like your stuff. But when I read it, you just seem so sure on these kids. Like you’re certain you know how they’re going to turn out. But you don’t know, you can’t know. They’re so young!”

The thing is, he’s pretty much completely right. We can’t know how these kids are going to turn out, so making statements like “Bo Horvat is destined to be a 3rd line centre” is asinine and disingenuous. This is also why I try to avoid making statements like “Bo Horvat is destined to be a 3rd line centre.”

“But wait, Rhys. Didn’t you–” Yes and no. We’ll get to that past the jump.

When we speak in the language of prospects, we speak in the language of uncertainty and probability, of grey areas and likelihoods. Player development is such a nebulous concept with so many unknown variables that it’s impossible to project exactly what a given player will become while they’re still junior-aged. As this is the case, the best we can do is look for trends, comparisons, and similarities and base our projections off of what the historical data tells us is both possible and likely.

I like to look at player development as if it’s a bell curve. A given player will reach his statistically most probable talent level most of the time, but sometimes a guy will come out of nowhere and develop in a tremendously improbable way – think of guys like Milan Lucic or Patrice Bergeron. Conversely, guys with track records that are just as good won’t develop and, also improbably, end up as busts, like Angelo Esposito, Gilbert Brule, or Marek Zagrapan. Drafting Bergerons won’t find you Patrice Bergeron most of the time, just as drafting Brules won’t find you Gilbert Brule most of the time.

This thought process is probably more easily conveyed through visuals. Here’s a diagram to illustrate this point:


If we had a million Bo Horvats (or more precisely, Bo Horvat-like players) and simulated a million unique careers, I suspect the frequency of each career outcome for each Bo Horvat would look similar to the above diagram. Most of the time, he’ll develop into the player predicted by his talent level observed in junior and what his counting stats predict. Some things will go well, some things will result in setbacks – nothing’s perfect in life. In rarer cases, he’ll be drafted and developed in an extremely favourable situation where he gets the perfect breaks and ideal resources he needs every step of the way to become Patrice Bergeron. In other equally rare cases, he’ll blow out a knee in his draft+2 season and become a total bust, or at the very least fail to reach his most probable potential.
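
The thought experiment above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the 0-100 "talent" scale, the expected level of 50, the spread of 15, and the choice of a normal curve are all made up for the example, and it runs 100,000 careers rather than a million to keep things quick.

```python
import random
from collections import Counter

# Hypothetical 0-100 "NHL talent" scale. Every number below is an
# illustrative assumption, not an estimate for any real prospect.
EXPECTED_LEVEL = 50.0   # where the prospect's "red line" sits
SPREAD = 15.0           # standard deviation of career outcomes
N_CAREERS = 100_000     # scaled down from "a million" to keep the run quick


def simulate_careers(expected_level, spread, n_careers, seed=0):
    """Draw one career outcome per simulated prospect from a normal curve."""
    rng = random.Random(seed)
    return [rng.gauss(expected_level, spread) for _ in range(n_careers)]


def label_outcome(level, expected_level=EXPECTED_LEVEL, spread=SPREAD):
    """Bucket an outcome by how far it landed from the expected level."""
    if level < expected_level - spread:
        return "fell short"
    if level > expected_level + spread:
        return "exceeded"
    return "roughly as projected"


careers = simulate_careers(EXPECTED_LEVEL, SPREAD, N_CAREERS)
counts = Counter(label_outcome(level) for level in careers)
shares = {name: count / N_CAREERS for name, count in counts.items()}

# Print the buckets from most to least common.
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:>22}: {share:.1%}")
```

Under these assumptions roughly two thirds of the simulated careers land within one spread of the projection, with the remainder split between the pleasant and unpleasant surprises.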

Of course, the vertical “observed junior talent level” line is different for each individual player. Some players are Nathan MacKinnon and score boatloads of points at an absurdly young age, so their red line sits further to the right, closer to “superstar” than most players. As a result, we can predict that their most likely career outcome will be closer to “superstar” than guys that don’t score as much in the CHL. We can represent this on a type of NHL talent “spectrum” as follows:


The taller and skinnier the curve, the more certain we are that players will reach their average observed talent level (which, remember, is basically the mid-point of the curve). We’re quite certain that Nathan MacKinnon will be a top-end player with a really good chance of developing into a legitimate NHL superstar. We’re less certain that Nikolaj Ehlers will reach the level projected by his Stamkos and Seguin-like draft year boxcars simply because we know a lot less about him. We only have one year of CHL data on Ehlers, while MacKinnon has kept some pretty elite NHL company so far.
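
To put a rough number on "taller and skinnier," here is a small sketch using Python's standard-library `statistics.NormalDist`. The talent scale, the expected level of 80, the two spreads, and the 10-point band are hypothetical values chosen for illustration, and treating a development curve as a normal distribution is itself an assumption.

```python
from statistics import NormalDist


def chance_near_projection(expected_level, spread, band=10.0):
    """Probability that an outcome lands within +/- band of the expected level.

    Assumes outcomes follow a normal curve centred on expected_level,
    with `spread` as its standard deviation.
    """
    curve = NormalDist(mu=expected_level, sigma=spread)
    return curve.cdf(expected_level + band) - curve.cdf(expected_level - band)


# Same hypothetical "red line", different amounts of certainty.
well_known = chance_near_projection(expected_level=80.0, spread=8.0)
less_known = chance_near_projection(expected_level=80.0, spread=20.0)

print(f"narrow curve: {well_known:.1%} chance of landing near the projection")
print(f"wide curve:   {less_known:.1%} chance of landing near the projection")
```

The two hypothetical prospects share the same most likely outcome, but the one with the narrower curve is far more likely to actually land near it.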

While Ottawa Senators prospect Curtis Lazar is likely a less effective NHLer than Ehlers, this doesn’t mean he’s destined to be. In fact, there’s a not-insignificant chance that Lazar becomes a better NHL player than Ehlers. Still, this shouldn’t change what we believe is most likely to happen, nor should it change what we can reasonably expect from each prospect going forward. As friend-of-the-blog Garret Hohl likes to say:

The questions to ask about prospects (and the ones we need to answer in the future) are two-fold:

  1. Where on this spectrum does each individual player’s “red line” sit?
  2. Based on all comparable players to be drafted into the NHL, what is the exact shape of each individual player’s development curve?

If you’re an NHL organization with millions of dollars’ worth of scouting and analytics resources at your disposal, it should be possible to model all of this through the marriage of “hard” numbers and “soft” information. Sports Illustrated’s recent cover article on the Houston Astros describes how their management team, led by a former rocket scientist and an ex-entrepreneur, is attempting to do just that:

To that end [Astros’ Director of Decision Sciences] Sig Mejdal and his analytics team…created an evaluation system that boils down every piece of information the Astros have about prospects and players into a single language. The inputs include not only statistics but also information—much of it collected and evaluated by scouts—about a player’s health and family history, his pitching mechanics or the shape of his swing, his personality. The system then runs regressions against a database that stretches back to at least 1997, when statistics for college players had just begun to be digitized. If scouts perceived past players to possess attributes similar to a current prospect, how did that prospect turn out? If a young pitcher’s trunk rotates a bit earlier than is ideal, how likely were past pitchers with similar motions to get hurt?

Side note: you should absolutely read that article, but do so knowing what happened next. Here’s the link again. SI’s look at the Astros draft tells the story of how they came to choose left-handed high school pitcher Brady Aiken 1st overall this past season after Sig Mejdal’s hybrid stats-and-scouts model deemed him the best player available. The Astros would sour on Aiken in a matter of days after the draft though, as an MRI of his pitching elbow revealed something concerning enough to let him walk away from the organization. That story is also well worth reading, and is here. The whole mess reminds me of the famous quote widely attributed to economist John Maynard Keynes: “when my information changes, I alter my conclusions. What do you do, sir?” But, I digress.

Unfortunately, we as fans and disconnected observers are limited in the information we have access to. We don’t have the resources to travel to watch Swift Current play in Moose Jaw on a miserable Tuesday night in late October to see Canucks Army favourite and future Dallas Star Julius Honka play against Canucks Army favourite and future Tampa Bay Lightning forward Brayden Point. But we have numbers. We don’t have great numbers – we’re limited to boxcars that we know are highly variable and some height/weight/birthdate data – but we can still make pretty good estimations and predictions. Gabe Desjardins has shown that exceptionally strong boxcars are, in general, a good predictor of future NHL success, and a simple model based off of just boxcar stats can perform at a level comparable to a good number of NHL teams.

The adoption of rigorous statistical analysis isn’t about being right all the time. It’s about being right more often than you currently are, and limiting the number of times when you’re wrong. We have enough information at our disposal to be right more often than we’re wrong, and the track records of NHL teams at the draft table show that certain NHL franchises (that may rely primarily on the eye test) are not necessarily significantly better at projecting players than a bunch of guys with access to basic statistics, computers, and working brains.

In an ideal world, the NHL teams would be far better. They should be able to identify where a player’s “red line” sits on the career spectrum with more accuracy, and should be able to observe taller and thinner development curves since better and more thorough information should lead to less variance in projections. The world is far from ideal though, and it is more than fair to be critical of prospects given what we know. Do the Canucks know more about each kid than fans do? Yes they do. Do the Canucks know more relevant information about each kid? Are they able to synthesize this information into something better than what we can? One would like to think so, but the evidence we have suggests that neither of these latter questions can be answered with a resounding “yes.”

To conclude, our Canucks Army Prospect Profiles series is starting soon. I’m sure we’ll all be asked to project each player somewhat, so I want to make clear the fact that there’s a difference between a likely outcome and an optimal outcome. Saying stuff like “Bo Horvat is likely a future 3rd line centre” doesn’t mean that Bo Horvat has to be a 3rd line centre, nor does it mean that there’s not a significant chance that Bo Horvat develops into a 2nd line centre, nor does it mean there’s a 0% chance that Bo Horvat becomes The Next Milan Lucic (or rather, Ryan O’Reilly). All it means is that the coin we’re flipping is weighted in a certain direction, so a rational person should expect the outcome the coin is weighted towards most of the time.
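
The weighted coin can be sketched the same way. The outcome labels and weights below are invented purely for illustration and are not a projection for Horvat or anyone else. The point is only that the most heavily weighted outcome shows up most often while the others still happen regularly.

```python
import random
from collections import Counter

# Hypothetical outcome weights for a single made-up prospect.
# They are assumptions for this example, not estimates for a real player.
OUTCOME_WEIGHTS = {
    "bust": 0.20,
    "3rd line centre": 0.45,
    "2nd line centre": 0.28,
    "star": 0.07,
}


def simulate_outcomes(weights, n_trials, seed=0):
    """Flip the weighted 'coin' n_trials times and tally each outcome."""
    rng = random.Random(seed)
    outcomes = list(weights)
    draws = rng.choices(outcomes, weights=list(weights.values()), k=n_trials)
    return Counter(draws)


tally = simulate_outcomes(OUTCOME_WEIGHTS, n_trials=10_000)
most_common_outcome, _ = tally.most_common(1)[0]

print(f"most frequent outcome: {most_common_outcome}")
for outcome in OUTCOME_WEIGHTS:
    print(f"{outcome:>16}: {tally[outcome] / 10_000:.1%}")
```

With these made-up weights the "likely" outcome is the single most common result, yet it still happens less than half the time, which is the gap between a likely outcome and a certain one.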

  • McRib

    “We have enough information at our disposal to be right more often than we’re wrong, and the track records of NHL teams at the draft table shows that certain NHL franchises (that may rely primarily on the eye test) are not necessarily significantly better at projecting players than a bunch of guys with access to basic statistics, computers, and working brains.”

    That’s actually just the nature of the draft when 31 parties (30 NHL teams and Sham Sharron) are competing over the same talent pool.

    But, please, sell us another black box…

  • “The adoption of rigorous statistical analysis isn’t about being right all the time. It’s about being right more often than you currently are, and limiting the number of times when you’re wrong.”

    If there’s one thing I wish people could understand about stats generally, it’s this. Good ol’ Steve Simmons posted another daft article the other day saying BUT NEW JERSEY as if that meant that Corsi was useless. But Steve, the *six other teams* with elite CF% all made the playoffs, as did 10 of the 13 teams with 50%+ on the year.

    No sane person claims that good data and analysis will let you perfectly predict the future – just that it will let you more accurately predict the future than if you didn’t have that data and analysis.

    Really good article Rhys! And be careful of that Mooney character – he’s a bad influence.

  • McRib

    I think part of the issue is perception – in pushing back against the gritty truculent Simmons of the world, we often hear predictions that seem just as definitive (and just as difficult to prove) as theirs. What you’ve posted here is a well-reasoned and rational approach to how one might use trends and probabilities to help predict future performance. What we sometimes get from CA often sounds more like high school popularity contests that speak with precisely the certainty that you’re cautioning against. You can be as revisionist as you like, but the majority of posts here have a tone that relegates Horvat, Gaunce, Shinkaruk, McCann, and Virtanen to no better than third line status as a (proven) starting point in many discussions and arguments. It would probably be good for all your CA souls to give this article you’ve posted a closer and sober read. How ironic that it took a drunken Mooney to inspire it…

  • Dimitri Filipovic

    I think any intelligent fan over the age of 30, or so, has already figured a lot of this out. You mention Horvat and have already pegged his future. B.S. You have no idea what he’ll become. He has a body of work and is still very young. There are so many factors that come into play that you cannot tell. Work ethic; dedication; injuries; influences; training and so on. That is why it’s pretty tough to take many of the articles here too seriously – they just don’t get it and think they know it all. Trying to claim they have it all figured out is a testament to their level of ignorance.

    Anyway, there is some reason to be optimistic about our prospects. However, until they crack an NHL roster, it’s all speculation or educated guess-work.

  • Cokebadger

    Ted, try rereading paragraph #3. It addresses your concern about pegging Horvat’s future.

    Put enough money and resources behind the number guys, and they will figure out how to get an edge. Alien abduction insurance exists, and multiple claims AND premiums have been paid. I submit this as proof that we should all bow down to the number crunching NERDS.

    At the very least, this type of analysis will give us a one-day reprieve from the ever-popular “he wanted it more” debate.

  • JCDavies


    I think many Canucks Army readers understand probabilities and bell curves (after all, they are introductory concepts taught in first-year stats courses – either in High School or University) and many of the readers that don’t can probably work their way through and understand a well written article. And I obviously don’t know what was said in your conversation with Mooney but I have read his work and I have a hard time believing that he does not understand probabilities either. So if your takeaway from the Mooney conversation is that an article explaining bell curve distributions is what’s needed, I think you’re probably missing the point. With the amount of criticism that Horvat piece received, it seems a little ridiculous that your first thought is that your readers don’t understand simple statistical concepts and not that you should re-evaluate how you use probabilistic concepts in your writing.