Can we predict who will win a football match?
For a long time, statisticians thought the answer was basically ‘no’
With the 2024 European Football Championship on the horizon, I thought I’d do a couple of posts on the quirks of football analysis, and what they can tell us about wider statistics and science, based on my research for The Perfect Bet.
It all started with an exam question. In the 1990s, statistics students at Sheffield University were asked to predict which team would win a hypothetical football match. It was something of a throwaway example, an illustrative problem to test students’ knowledge of processes that involve randomness.
Except it turned out to be much more than that. The idea contained within that question – and its solution – would end up challenging long-standing notions about the role of chance in the beautiful game. It would also lay the foundations for an entirely new industry of sports analysis in the UK.
Football predictions would never be the same again.
From seasons to single matches
During the 1970s and 1980s, it seemed there was an inherent limit to what football analysis could do. Although expert predictions about performance over a whole season tended to correlate with actual league positions, predicting individual matches was much harder.
Unlike the high-scoring outcomes in basketball, or the structured pitcher-vs-batter duels in baseball, football matches can hinge on a single event. Victory – or defeat – may come down to whether the ball hits the post, or deflects unpredictably. One 1971 paper even concluded that ‘chance dominates the game’.
If we want to predict an event – like the result of a football match – it helps to have three things. First, we need data on results in historical matches. Second, we need data on the factors that might influence the result of a match. And finally, we need a way to convert these factors into a useable prediction.
Which is where the Sheffield exam question came in. Rather than trying to predict the result outright, as previous studies had, it asked students to predict the number of goals each team would score. The different possible numbers of goals – and their accompanying probabilities – could then be converted into a prediction about the overall result.
It was quite a simplistic method, but statistician Mark Dixon saw something promising in there. Along with fellow researcher Stuart Coles, he would expand the ideas and eventually apply them to actual football league data. The resulting method – now known as the ‘Dixon-Coles model’ – would be published in 1997.
So, how did it work?
Measuring quality
Suppose we want to calculate the number of goals a team might score. One of the simplest ways to do this is to specify the average rate at which they score goals – perhaps based on some kind of ‘quality metric’ – then calculate the probability they will score 0, 1, 2, 3 etc. goals over the course of a game, based on this assumed rate.
Statistics fans will recognise that, if we assume goals occur independently of each other over time, we can calculate these probabilities using a Poisson distribution. For example, if a team scores 2 goals per game on average, then the Poisson distribution – which captures the randomness in when these goals occur over time – would predict the following probabilities of scoring different numbers of goals:
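As a rough sketch of where those numbers come from (plain Python, with an illustrative scoring rate of 2 goals per game):

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """Probability of scoring exactly k goals, given an average scoring rate per game."""
    return rate**k * exp(-rate) / factorial(k)

rate = 2.0  # a team that averages 2 goals per game
for goals in range(6):
    print(f"P({goals} goals) = {poisson_pmf(goals, rate):.3f}")
# Roughly: 0 -> 0.135, 1 -> 0.271, 2 -> 0.271, 3 -> 0.180, 4 -> 0.090, 5 -> 0.036
```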
But there’s a problem. Here we are capturing team quality in a single metric, whereas in reality several factors might influence how one team performs against another. This is why one analysis of the 2008 Euros found that the ranking of two teams was a very poor predictor of who would win a particular match. We’re just not capturing enough information about performance if we try to distil quality down into a single metric for each team.
The basic Dixon-Coles model ended up including two quality factors for each team (attacking strength and defence) as well as a factor to control for home advantage. Suppose team H, playing at home, face team A, playing away. The expected number of goals per match scored by team H against team A in the model was therefore given by:
Team H attacking strength x Team A defensive weakness x Home advantage
Likewise, the expected number of goals scored by team A would be:
Team A attacking strength x Team H defensive weakness
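To make that concrete, here’s a minimal sketch of how those expected-goal rates can be turned into match-result probabilities. The parameter values are made up purely for illustration, and this is the basic independent-Poisson version, before the corrections described below:

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    return rate**k * exp(-rate) / factorial(k)

def result_probabilities(attack_h, defence_a, attack_a, defence_h, home_adv, max_goals=10):
    """Turn the two expected-goal rates into home win / draw / away win probabilities,
    assuming (for now) the two scores are independent Poisson variables."""
    rate_h = attack_h * defence_a * home_adv   # expected goals for the home team
    rate_a = attack_a * defence_h              # expected goals for the away team
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, rate_h) * poisson_pmf(a, rate_a)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win

# Hypothetical parameter values, for illustration only
print(result_probabilities(attack_h=1.4, defence_a=0.9, attack_a=1.1, defence_h=0.8, home_adv=1.3))
```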
But Dixon and Coles soon noticed this basic approach had some flaws. First, the above assumes the two scores are independent: how many goals team H is expected to score has no effect on the probability of team A scoring a given number of goals. They found this assumption was reasonable for higher-scoring games, but it didn’t capture the number of low-scoring games (0-0, 1-0 etc.) in the data.
Another flaw is that attacking strength and defensive weakness won’t necessarily be the same over time. Some teams may get stronger or weaker over a season. So as well as adjusting the model to make low scores more likely than a simple Poisson process would predict, they allowed team quality to vary over time, so recent performances mattered more than those in the distant past.
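In code, the flavour of these two adjustments looks something like the sketch below: a correction factor that nudges the probabilities of the lowest scorelines up or down, and an exponential down-weighting of older results. Treat this as illustrative rather than a faithful reproduction of the 1997 paper; the decay rate in particular is a made-up value.

```python
from math import exp

def low_score_correction(home_goals, away_goals, rate_h, rate_a, rho):
    """Adjustment factor applied to the four lowest scorelines (0-0, 1-0, 0-1, 1-1),
    relative to independent Poissons. rho controls the strength of the dependence;
    rho = 0 recovers the basic independent model."""
    if home_goals == 0 and away_goals == 0:
        return 1 - rate_h * rate_a * rho
    if home_goals == 0 and away_goals == 1:
        return 1 + rate_h * rho
    if home_goals == 1 and away_goals == 0:
        return 1 + rate_a * rho
    if home_goals == 1 and away_goals == 1:
        return 1 - rho
    return 1.0

def time_weight(days_since_match, decay=0.005):
    """Exponentially down-weight older results, so recent form counts for more."""
    return exp(-decay * days_since_match)
```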
Beating the bookmaker
With these tweaks, the Dixon-Coles model was sufficiently good at predicting matches to outperform the results implied by bookmakers’ odds. In other words, it paved the way for a profitable betting strategy.
But what about tournaments like the Euros? In football leagues, teams play each other repeatedly, a bit like a research group repeating experiments under a range of conditions. In contrast, tournaments leave us with much less data to go on. Teams can play teams they’ve never played before, fielding players who may never have played together.
This data gap is illustrated by the variation in betting odds we can see between bookmakers during tournaments. One 2008 analysis found that ‘arbitrage opportunities’ – where odds on the same match differ so much between bookmakers that you can lock in a profit by backing different outcomes with different firms – were particularly common during the Euros.
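As a rough sketch of what spotting such an opportunity involves (the odds below are invented for illustration): take the best decimal odds on each outcome across bookmakers, and check whether the implied probabilities add up to less than one.

```python
def arbitrage_opportunity(odds_by_bookmaker):
    """Check whether backing each outcome at the best available decimal odds
    across bookmakers guarantees a profit. Illustrative sketch only."""
    best = {
        outcome: max(odds[outcome] for odds in odds_by_bookmaker)
        for outcome in ("home", "draw", "away")
    }
    total_implied = sum(1 / o for o in best.values())
    return total_implied < 1, best, total_implied

# Hypothetical odds from two bookmakers
books = [
    {"home": 2.10, "draw": 3.60, "away": 3.40},
    {"home": 1.95, "draw": 3.40, "away": 4.20},
]
print(arbitrage_opportunity(books))
# The implied probabilities sum to about 0.99, so backing all three outcomes
# at the best available odds would lock in a small profit.
```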
Researchers – and bettors – now have much more data available than in that 1997 Dixon-Coles paper, but the fundamental gap between predicting a league and a tournament remains. And yet, ultimately, this is what makes tournaments like the Euros so watchable. Assumptions about quality break down more easily. Expectations about score lines are more likely to be overturned. And, like frustrated students grappling with randomness amid the time pressure of an exam, we watch as pre-match predictions struggle against the chaos of chance upsets.
And if you’re interested in reading more, here’s a previous post on football analysis: