Looking At Expected Goals in Football: Can It Be Relied On?

After cooling off from the drama that took place in Inter and Atalanta’s weekend match, I took a look at the highly controversial expected goals (xG) outcome of the match to see if our eyes deceived us at all during the viewing. From the eye test, the match seemed even: both Inter and Atalanta had long spells of possession and dominance, and most would probably agree that both teams would be happy to walk away with a draw. However, xG tells a completely different story, a lopsided one in fact. Both fbref.com and understat.com suggest Inter was the much more dominant team, winning the expected goal battle 3.9 – 2.1 (via fbref.com) and 3.52 – 1.50 (via understat.com). Take away Dimarco’s missed penalty and the gap closes by 0.7 goals, but the point still stands: both sources clearly disagree with the eye test, and neither agrees that Atalanta deserved to draw the match.

In my view, football is far behind the curve when it comes to advanced analytics, and I personally think the team that embraces a hybrid analytical and classical approach will set itself up for success much faster than opponents who shy away from analytics. The reason football analytics lags behind, I think, is twofold.

1. The lack of good, freely available information for the public to consume. Fbref has led the way in bringing some Statsbomb data to the public, but it still falls short of providing all the information that can be compiled from a football match. Compared to baseball or American football, the sport is eons behind in providing data to its fans. Yes, some parts of football are more difficult to capture with a single number. But look at Major League Baseball: it provides a plethora of free information about every pitch and batted ball, and that data has opened significant doors to the game’s ever-evolving strategy. So much so that casual fans who turn into blog writers frequently go on to work in Major League front offices. The collective knowledge of the fanbase has made the game smarter, and the game is evolving at hyper-speed.

2. Football is so dynamic and feels much harder to quantify. Baseball is easier – there’s one explicit event with an outcome, and a whole slew of metrics can be registered and calculated from that one controllable event. American football is the same, though to a lesser degree than baseball. Football, on the other hand, can feel like one never-ending event with no logical start and end point. Add in constant positional shifting and a less defined structure, and it’s going to be harder to gather pristine data (or if it is being gathered, I’d like to see some of it!).

More coming on Malinovskyi’s weekend banger

Now, xG is currently the best aggregate number that football fans have to make a statistical argument about a game. But even with xG, it’s evident we’re not there yet. No knock on understat.com, which has made itself vulnerable by putting its aggregation of each game up for public viewing – but rather than panning the work being done, what can be done to improve upon it?

Handling Stretches of Play with Multiple Goal Scoring Opportunities

It’s no surprise to anyone that goal scoring is binary: either you score or you don’t, a 1 or a 0. xG’s aggregation of probabilities adds nuance to how dangerous a chance is, but when does it go too far? Imagine a sequence where an attack goes up against a goalkeeper who is playing out of his mind. The keeper stops three consecutive shots, each with an xG of 0.50, and then a striker finally taps in a fourth shot, also worth 0.50. The xG for this stretch of play would be 2.0, but should it be 2.0 or 1.0? A similar, yet less extreme, example happened to Atalanta. Check it out (apologies in advance for the choppy video – Paramount+ is horrible for recording material):


Federico Dimarco gets a nice shot off, good enough to warrant 0.34 xG. Juan Musso parries it away nicely, but unfortunately it falls right to Edin Dzeko for a tap-in, a chance worth 0.81 xG. This brief stretch is worth 1.15 xG, yet the maximum number of goals that could ever come out of the situation is one. So what gives? It’s important to capture the worth of each shot, but when the expected goals from a situation run far beyond the number of goals that can physically be scored in that stretch of play, some sort of cap has to be put in place. A simple fix could be a ‘stretch of play’ adjustment that is deducted at the end of the match, so that no team’s expected goal margin can be inflated by a bombardment of shots in quick succession (whether a goal is scored or not). In this example, the solution would be to subtract 0.15 goals from Inter’s final total.
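To make the idea concrete, here’s a minimal sketch of how such an adjustment could be computed. It assumes the shots have already been grouped into stretches of play (which is the genuinely hard part) and that each stretch can physically yield at most one goal; none of this reflects how understat.com or fbref.com actually calculate their numbers.

```python
# A minimal sketch of the 'stretch of play' cap proposed above, assuming shots
# have already been grouped into sequences. This is illustrative only, not how
# understat.com or fbref.com compute anything.

def capped_sequence_xg(shot_xgs, cap=1.0):
    """Total xG for one stretch of play, capped at the single goal that can
    physically be scored in it."""
    return min(sum(shot_xgs), cap)

def adjusted_match_xg(sequences):
    """Return the raw total, the capped total, and the end-of-match deduction."""
    raw = sum(sum(seq) for seq in sequences)
    capped = sum(capped_sequence_xg(seq) for seq in sequences)
    return raw, capped, raw - capped

# The hypothetical hot-goalkeeper sequence: four 0.50 shots -> 2.0 raw, 1.0 capped.
# The Dimarco/Dzeko sequence: 0.34 + 0.81 = 1.15 raw, 1.00 capped, 0.15 deducted.
for seq in ([0.50, 0.50, 0.50, 0.50], [0.34, 0.81]):
    raw, capped, deduction = adjusted_match_xg([seq])
    print(round(raw, 2), round(capped, 2), round(deduction, 2))
```

The cap value of 1.0 per sequence is the simplest possible choice; how you define where one stretch of play ends and the next begins would matter just as much as the cap itself.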

Adjustment for Skill, Distance, and a General Thought on Probability

Take a look at these three shots, and guess what the xG is for each of them:

Domenico Berardi’s goal against Atalanta
Ruslan Malinovskyi’s goal against Inter
Hakan Calhanoglu’s long range shot pushed wide

How close were you?

Berardi: 0.01 xG
Malinovskyi: 0.03 xG
Calhanoglu: 0.01 xG

Calhanoglu’s shot feels appropriately rated. It’s a long-range shot with a prayer attached, and on a given day you could convince me he’d hit it, especially with Musso scurrying back toward goal.

However, it’s astounding to me that Berardi and Malinovskyi’s goals were rated so low – and that Berardi’s and Calhanoglu’s shots were rated equally! It’s extremely difficult, and inevitably subjective, to account for skill in an objective metric, but skill has to be taken into account somehow. Berardi’s goal is a trademark cut-in for the lefty, and I seriously doubt that if he took this shot 100 times he would score only once.

It’s a similar story with Malinovskyi. He’s a bit far from goal, but with no defender closing him down and one of the best left feet in Italy, it seems much more likely he’d score this goal 10-15% of the time rather than 3% of the time.

But how is this rectified? Understat.com’s algorithm is based on a six-figure database of past shots, but something still feels off. It’s one thing if Jose Luis Palomino is taking the shot from Malinovskyi’s position, but a player far more likely to convert feels unfairly rated below his potential – especially when he has a clear path to the keeper from a shot outside the box.

I don’t know if there is an answer to this, so this is more of a call to use caution when checking the stats, and to consider the player taking the shot and the match situation around him when the shot is released. If Malinovskyi is given a clear shot at goal from 25 yards, I’d ask for that shot as often as the defense will give it to me.
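For what it’s worth, here is one purely illustrative way a shooter-skill adjustment could be bolted onto a baseline xG model: shrink the player’s career goals-versus-xG record toward the league average so small samples don’t dominate, then scale each shot’s baseline xG by that factor. The names, career numbers, and prior weight below are all assumptions for the sake of the sketch – no provider has said they work this way.

```python
# A purely illustrative sketch of a shooter-skill adjustment, not how Understat,
# StatsBomb, or anyone else actually models shots. All career numbers and the
# prior weight are made up for the example.

def finishing_multiplier(career_goals, career_xg, prior_xg=50.0):
    """Shrunken goals-per-xG ratio: prior_xg acts like 50 xG worth of
    league-average finishing, keeping small samples close to 1.0."""
    return (career_goals + prior_xg) / (career_xg + prior_xg)

def skill_adjusted_xg(base_xg, career_goals, career_xg):
    """Scale the model's baseline xG for a shot by the shooter's multiplier."""
    return base_xg * finishing_multiplier(career_goals, career_xg)

# A long-range specialist who has out-finished his career xG (made-up numbers)
# sees a 0.03 chance nudged up; a defender with no track record stays put.
print(round(skill_adjusted_xg(0.03, career_goals=30, career_xg=20.0), 3))  # ~0.034
print(round(skill_adjusted_xg(0.03, career_goals=2, career_xg=3.0), 3))    # ~0.029
```

Notice that even a generous multiplier like this wouldn’t move a 0.03 chance anywhere near 10-15%, which is part of the point: the rest of the gap would have to come from context the model isn’t seeing, like the lack of pressure and the clear path to goal.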

A Call For Critical Thinking

Again, I think expected goals are great, but they need to be used in the proper context. The metric still seems to be in its infancy, and even once it gets much closer to incorporating all the subtle nuances of a match, it will still need to be assessed within the flow of that match.

Football analytics is growing, and we have to make sure not to fall into the trap of relying purely on the statistics or purely on the eye test. Baseball already went through the headbutting between the stat nerds and the old-school purists before coming to some sort of coexistence. I envision football’s time will eventually come – it’s just not there yet.

There are murmurs that these conversations are already starting. Julian Nagelsmann recently posited that it would be great for players to wear earpieces similar to what’s seen in the NFL. His comments were met with both agreement and scorn, and they were a good test of where the sport stands when it comes to change. xG, even with its faults, may be the best we have right now, but a few changes and more openness about how the statistics are calculated could go a long way toward getting more fans on board with statistical advancement in the sport – making for more educated management as well as fans.

Nick