
Fun with Graphs

David Moyes ponders Everton's chances of winning given a one-goal lead at home in the 58th minute. (hint: it's 78.7%)

A two-goal lead is the most dangerous lead in football. We hear it all the time, but is it really true? What if you could know the exact odds of your favorite team (Everton, let's say, since this is an Everton blog after all) winning, losing, or drawing at any given moment in a match?

Well guess what? Now you can!


I've always been fascinated with statistics, especially if the statistics are sports-related. Now I'm certainly no mathematician, but with just a basic knowledge of the principles of statistics it becomes possible to look at sports in different and interesting ways. One such way is the "win probability" graph, which is a visual representation of each team's chances of winning over the course of a game. Win probability is most closely associated with baseball, as the people at FanGraphs popularized the idea. There have been win probability models for other sports as well (including soccer), but I decided to create my own.

Now before I start boring all of you with a lot of math-speak, I'll give an example of what I'm talking about. Here's a graph from one of the most famous matches in Everton's history: the Great Escape of 1994. For those not familiar with the situation, I'll paint the picture: Everton entered the final match of the 1993-94 season in severe danger of being relegated out of the top division for the first time since 1954. As it turned out, only winning would have secured Everton's safety... and that's exactly what they did, defeating Wimbledon 3-2 in an incredible game at Goodison Park:

[Graph: Everton vs. Wimbledon, final match of the 1993-94 season]

So what do all those lines mean? As you can see, the graph shows the chances of an Everton win, a Wimbledon win, and a draw over the 90+ minutes of the match. With this information, we can see that at two different points in the game (around the 20th minute and the 67th minute), Everton essentially were 95% sure to be relegated, which goes to show just how amazing it was that Everton managed to avoid the drop. Cool, huh?

Win probability generally considers two possible results (a win or a loss), but in soccer there are of course three possible results (win, loss, or draw). For that reason I've decided to call my system "outcome expectancy" rather than "win probability," as I think this better describes the information the graphs are conveying. I really like these charts because they almost read like an "emotional barometer" over the course of the match, with the same peaks and valleys you experience during the game as a fan. Anyway, I hope you guys find these half as interesting as I do, and I'll definitely try to include them with my match reports going forward. Just for kicks, here's the visual representation of Sunday's match with West Brom:

[Graph: West Brom vs. Everton]

Now I would like to provide anyone who's interested with a little background on the mathematics that went into creating this model, though I can understand if most of you would prefer to skip the technical stuff, so fair warning before you read on...

To create the model, I used something called a Poisson distribution, which gives the probability that an event happens a given number of times, provided you first make a few assumptions. In order to use a Poisson distribution, you have to be dealing with a fixed interval of time (in soccer, that's the 90+ minutes of the match). Then, you have to assume there is a known average rate (in this case, how many goals each team can be expected to score on average) and that the events occur independently (in other words, a goal being scored does not affect the average rate at which future goals will be scored).
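For anyone who wants to poke at the numbers, a minimal Python sketch of the standard Poisson formula, P(k) = rate^k * e^(-rate) / k!, might look like this. The function name and the 1.5 goals-per-match rate are made up for illustration and are not taken from the model itself.

```python
import math

def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events in the interval, given the average rate."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

# Illustrative rate only: a hypothetical team averaging 1.5 goals per match.
for goals in range(5):
    print(goals, "goals:", round(poisson_pmf(goals, 1.5), 3))
```

The probabilities for 0, 1, 2, ... goals add up to 1, with most of the weight on low scores.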

Now if you've been paying attention, you've probably already spotted a problem. As anyone who has watched a lot of soccer can tell you, goals do not occur independently. For example, if the away team goes down 1-0, that generally opens up the game and increases the likelihood that more goals will be scored. This means that goal scoring in a soccer match does not truly follow a Poisson distribution, but for our purposes the numbers are close enough to continue on.

In order to create the model, the first thing we have to do is find the known average rate. How many goals should we expect a home and away team to score respectively? Using the last ten years of Premier League data, I determined that on average a home team scores 1.508 goals per match and an away team scores 1.098 goals per match. With those two rates, you can calculate the probability of a win, a draw, or a loss from any scoreline at any point in the game.
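As a rough illustration of how a single point on one of these graphs could be computed, the sketch below scales each team's average rate by the fraction of the match remaining and adds up the Poisson probabilities of every possible remaining scoreline. This is an assumption about the approach, not necessarily the exact method behind the graphs: `outcome_expectancy`, `MATCH_MINUTES`, and the 15-goal cutoff are hypothetical names and choices, and stoppage time is ignored.

```python
import math

HOME_RATE = 1.508    # average home goals per match, from the article
AWAY_RATE = 1.098    # average away goals per match, from the article
MATCH_MINUTES = 90   # assumption: stoppage time is ignored

def poisson_pmf(k, rate):
    """Probability of exactly k goals given the expected number of goals."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

def outcome_expectancy(minute, home_goals, away_goals, max_goals=15):
    """Return (home win, draw, away win) probabilities from the current state."""
    # Assumption: goals arrive at a constant rate, so the expected number of
    # remaining goals shrinks in proportion to the time left.
    remaining = max(MATCH_MINUTES - minute, 0) / MATCH_MINUTES
    home_rate = HOME_RATE * remaining
    away_rate = AWAY_RATE * remaining

    home_win = draw = away_win = 0.0
    # Walk through every combination of goals still to be scored.
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            diff = (home_goals + h) - (away_goals + a)
            if diff > 0:
                home_win += p
            elif diff == 0:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win

home, draw, away = outcome_expectancy(minute=58, home_goals=1, away_goals=0)
print(f"home {home:.1%}, draw {draw:.1%}, away {away:.1%}")
```

Under these assumptions, a one-goal home lead in the 58th minute comes out in the neighborhood of the 78.7% from the photo caption; the exact figure shifts depending on how stoppage time is treated.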

Another important thing to note is that these graphs assume that both teams are of equal skill level. This is a necessary assumption in order to make the statistics work properly. According to my calculations, on average the home team should win 46.8% of the time, the away team should win 27.5% of the time, and the teams should draw 25.7% of the time (last year's actual totals were 47.1% home wins, 23.7% away wins, and 29.2% draws). Now of course, if Everton are playing Manchester United at home they probably won't have a 46.8% chance of winning the match. Therefore, the best way to look at these graphs is to consider them in the context of the actual match-up on the field.
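The pre-match percentages can be sanity-checked with a few lines of Python. This is only a sketch under the article's stated assumptions (independent Poisson scoring, evenly matched teams); the variable names and the 15-goal cutoff are illustrative choices, not part of the original model.

```python
import math

HOME_RATE, AWAY_RATE = 1.508, 1.098  # per-match averages from the article
MAX_GOALS = 15  # assumption: scores beyond this are too unlikely to matter

def poisson_pmf(k, rate):
    """Probability of exactly k goals given the average rate."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

# Probability of each team scoring 0, 1, 2, ... goals over the full match.
home_pmf = [poisson_pmf(k, HOME_RATE) for k in range(MAX_GOALS + 1)]
away_pmf = [poisson_pmf(k, AWAY_RATE) for k in range(MAX_GOALS + 1)]

# Sum the joint probabilities of every final score in each category.
home_win = sum(home_pmf[h] * away_pmf[a]
               for h in range(MAX_GOALS + 1) for a in range(h))
draw = sum(home_pmf[k] * away_pmf[k] for k in range(MAX_GOALS + 1))
away_win = sum(home_pmf[h] * away_pmf[a]
               for a in range(MAX_GOALS + 1) for h in range(a))

print(f"home {home_win:.1%}, draw {draw:.1%}, away {away_win:.1%}")
```

The results should land close to the 46.8%, 25.7%, and 27.5% quoted above, with any small differences coming down to rounding.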

One more caveat... these models don't take into account the effect of a player being sent off. I'm currently working on adding that variable into the mix, so that should be in version 2.0.

Got all that? Whew. If you have any questions about anything related to this stuff feel free to comment below!


Comments


This is some great stuff

It’s really hard to incorporate statistics into a sport like football. If you’re a big math nerd like me, then baseball is heaven because of the sheer amount of statistics analysed.

I love the new win probability statistic you came up with for football, but I have to be honest… it’s not nearly as fun/informative (yes, it’s fun for me) as the baseball version just because of the overall lack of scoring. There is much more activity in other sports just because there are major momentum changes as well as scoring changes in a single game. In soccer it is much rarer, and you would need a match like yesterday’s Man U – Blackburn match in order to get that type of activity within the graph.

Either way, more graphs = more awesomeness.

@sibsinExile

by SibiGnana on Jan 2, 2012 11:46 AM GMT

People are certainly trying

A lot of clubs are snatching up baseball sabermetricians to try and give themselves an advantage. For instance, I know that Voros McCracken does work for a club which he won’t name. Football could get there. Stadiums would have to start installing tracking tools, like baseball has with PITCHf/x. Win Expectancy is more fun in baseball, though, because of how erratic the nature of the game is with runs scored and runners on base. This is good stuff though!

by Rebuilding Season on Jan 2, 2012 7:36 PM GMT

A lot of stats exist for soccer

The problem is they are proprietary stats developed by companies, and to use the data you have to purchase it. One of the local private high schools actually has a program that allows them to record every dribble, pass, and shot on a computer, as well as who did each action, but the problem is they do not have any real method of analyzing it.

by Brian_Goodison on Jan 2, 2012 10:08 PM GMT

Thanks for the feedback guys!

I do agree that win expectancy in football is not quite as interesting as in other sports that feature many changes in the status of the game, though there are certainly some matches that would make for some pretty wild graphs (the Blackpool match last year comes to mind). I’m actually glad for that though because it made the math simple enough that I could handle it!

It’s definitely a challenge to incorporate advanced statistics into football, and like you say, Brian, what is out there usually costs money to access (like matchanalysis.com, whose clients include the national teams of Germany and Mexico as well as several German, Mexican, and U.S. club teams). Hopefully in the future these stats will become more mainstream.

by seanathan on Jan 3, 2012 1:36 AM GMT

Excellent job Sean.. I’m a math geek myself and am always on the lookout to see where stats and numbers can be incorporated in sport.

Smile.. tomorrow will be worse.

by Calvin on Jan 4, 2012 2:02 AM GMT
