Chelsea Are Exceeding Expectations. The xG Stats Prove It.

This is more than simply looking at xG. Last season produced some odd statistics for the Chelsea squad. Scoring goals was never much of an issue; preventing them was. The squad posted very good expected goals (xG) numbers and followed up on those expectations with plenty of goals. The squad also posted decent expected goals allowed (xGA) numbers, but underperformed that metric by a wide margin.

The squad allowed 13 more goals (GA) than the xGA predicted. Some of this is likely down to Kepa being shot of all confidence and form last season, but it nonetheless represented an area of concern for the squad. With new attackers, new defenders, and a new goalkeeper, have the xG-to-G and xGA-to-GA ratios improved in the 2020/2021 season? With Mendy, Silva, Havertz, Werner, and Ziyech now in the squad, the numbers should improve in theory. But does that theory hold true in reality?

xGA to GA

Here, I correlated xGA with GA for every Champions League and Premier League match so far this season. Below is a simple scatterplot of this data.

GA to xGA

This plot shows a strong, statistically significant positive correlation (r = 0.682, t = 2.798, p = 0.021). This is not a surprising result, and it helps to illustrate the value of expected metrics. What sticks out, however, is how often the squad is beating its expected goals conceded. For example, look at the bottom right of the above plot. The squad has kept 6 clean sheets, yet across those 6 matches the cumulative xGA was 3.4. In other words, the squad conceded 0 goals when it was expected to concede 3.4. This is important, because it shows the squad is reversing the trend from last season. Instead of conceding drastically more than the xGA would predict, the squad is now clearly outperforming its xGA in the 2020/2021 season.
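For readers who want to check the reported statistics, the t value and p-value can be recovered from r and the number of matches. The following is a minimal sketch, assuming the standard test of a Pearson correlation against zero with n - 2 degrees of freedom and n = 11 matches; the function names are my own, and the p-value is approximated by numerical integration rather than taken from a statistics library.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even), then doubled by symmetry.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    return 1 - 2 * area_0_to_a

# Reported correlation for xGA to GA, with 11 matches in the sample.
n_matches = 11
t_stat = t_from_r(0.682, n_matches)
p_value = two_sided_p(t_stat, n_matches - 2)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

Run as written, this should land very close to the reported t = 2.798 and p = 0.021; small differences would come from r being rounded to three decimals.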

Another, more implicit takeaway is how often Chelsea have beaten their xGA this season. Out of the 11 matches analyzed here, Chelsea have conceded fewer goals than expected 8 times and exceeded their xGA only 3 times. This is an encouraging sign that the squad is growing and learning to defend more effectively.

xG to GF

Last season the squad had one of the better xG metrics in the league. To see whether this trend is continuing, I ran another scatterplot with correlation analysis. Below are the results.

G to xG

There is a very strong, statistically significant correlation between xG and GF (r = 0.823, t = 4.381, p = 0.0018). Once again, this shows that the expected metrics track the actual goals scored closely. The more implicit story in this plot, however, is how often the squad is exceeding its xG this season. The squad has scored 27 goals in the 8 matches where it was not shut out, while the cumulative xG from those 8 matches was only 16.9. The squad has exceeded its xG by about 10 goals in these 8 matches alone!

That is an incredible statistic. However, the squad has also gone 3 matches without scoring a single goal, and the cumulative xG in those 3 matches was 1.7. So, overall, in the 11 matches analyzed, the squad was expected to score 18.6 goals and smashed that by scoring 27! The offense is clearly a point of absolute strength so far, and as time, chemistry, and continuity build between the players and within the formation, this disparity will likely become even larger.
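The arithmetic behind that overall figure is simple enough to lay out. This short sketch uses only the totals quoted above; the variable names are my own.

```python
# Totals quoted in the article for the 11 matches analyzed.
goals_scored = 27
xg_in_scoring_matches = 16.9   # 8 matches with at least one goal
xg_in_blank_matches = 1.7      # 3 matches without a goal

total_xg = xg_in_scoring_matches + xg_in_blank_matches
goals_above_xg = goals_scored - total_xg
goals_per_xg = goals_scored / total_xg

print(f"Total xG: {total_xg:.1f}")
print(f"Goals above xG: {goals_above_xg:.1f}")
print(f"Goals per expected goal: {goals_per_xg:.2f}")
```

Across all 11 matches that works out to 8.4 goals more than expected, or roughly 1.45 goals scored for every expected goal.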

NOTE: Chelsea have either scored zero goals or 3+ goals in each UCL or Premier League match.

Possession Percentage to GA

It is hard to concede when a squad controls the ball and allows little possession to the opposition. Does this hold true for Chelsea this season? To analyze, another scatterplot was created.

GA to Poss

Here, the correlation is essentially zero and not statistically significant (r = -0.007, t = -0.022, p = 0.983). That is good reason to assert that possession percentage is not a good predictor of how many goals the squad will concede in a match. For example, when the squad had nearly 75% possession in one match, it still conceded 3 goals. One trend that does emerge is that the squad has been able to keep clean sheets both while dominating the ball and while sitting back off of possession. This is evident in the 6 clean sheets, in which the possession percentage varied from 44% to 75%. This suggests the squad is developing a strong defensive spine regardless of being on the ball or off of the ball.

Possession Percentage to GF

Just as having the ball should make a squad less likely to concede, it should conversely make a squad more likely to score. Simply put, if you have the ball more often, you will usually have more attacking chances and therefore more goals. We have seen the Chelsea squad keep possession in the past yet struggle at times to break down opponents, but has that trend started to reverse so far in the 2020/2021 season? Below is a plot of goals scored against possession percentage.

G to Poss

Here, there is a strong positive correlation that is statistically significant (r = 0.787, t = 3.826, p = 0.004). Chelsea have failed to score in 3 of the 4 matches in which they had less than 50% possession. Perhaps this is down to the game plan for each individual match, but it suggests this squad has difficulty sitting back, absorbing the opponent’s attack and possession, and hitting on the counter-attack.

Against possession-based, attacking squads, this trend shows cause for concern. Chelsea are not scoring goals when playing on the back foot, and that will have to change against the heavyweights they will face throughout the season. Squads such as Liverpool, Man City, and squads in the later stages of the Champions League will possess the ball more, and force Chelsea to score on the counter-attack. At a minimum, these squads will force Chelsea to create chances with limited possession, so this is a trend to watch the rest of the season.

Possession & xG

I would also argue from this plot that the squad is not built on a possession-based philosophy. Of the 11 matches, 6 had less than 60% possession and 5 had more than 60%. Therefore, I assert that the squad will have to develop a way to score goals when sitting back more often, as possession does not seem to be the overwhelming game plan for Lampard so far. Contrast these numbers with the last time Chelsea were a possession-based club (under Sarri, when the possession percentage was routinely above 60%).

NOTE: 2 matches this season have had a possession percentage of 69%, and each of these matches resulted in 4 goals scored. That is why only 10 dots appear in the plot above, but the analysis considers all 11 matches.
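Overlapping points like these are easy to miss in a scatterplot. One way to surface them is to count identical (possession, goals) pairs before plotting, so a repeated point can be drawn larger or labeled. The sketch below is illustrative only: the repeated (69, 4) pair comes from the note above, while the other two points are made-up placeholders, not actual match data.

```python
from collections import Counter

# (possession %, goals scored) per match.
# Only the repeated (69, 4) pair is from the article; the remaining
# points are placeholders used purely to exercise the counting logic.
matches = [(69, 4), (69, 4), (50, 0), (60, 3)]

# Count how many matches share each distinct point.
counts = Counter(matches)

# One entry per visible dot: (possession, goals, number of matches).
dots = [(poss, goals, n) for (poss, goals), n in counts.items()]

# Dots that hide more than one match behind them.
overlaps = [dot for dot in dots if dot[2] > 1]

print(f"{len(matches)} matches, {len(dots)} visible dots")
print("Overlapping dots:", overlaps)
```

The match count could then be passed to a plotting library as a marker size, so a doubled-up point stands out.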

Possession Percentage to xG

Similar to the previous plot, I wanted to see if the xG to possession percentage reveals any useful information. Below is the plot and corresponding statistics.

xG to Poss

This plot and analysis reveal a strong, positive correlation that is statistically significant (r = 0.685, t = 2.818, p = 0.020). This suggests that the more the squad has possessed the ball this season, the more expected goals it has generated. This should not be surprising: a squad that has the ball more is more likely to create attacking and goal-scoring chances, and therefore to increase its xG.

What is interesting, however, is the disparity in xG between matches with 50% or less possession and matches with more. With 50% or less possession, the squad has a cumulative xG of 2.9 across 4 matches. In contrast, with over 50% possession, the squad has a cumulative xG of 15.7 across 7 matches. Clearly, the squad is not creating nearly as many goal-scoring chances when sitting back off the ball and having to defend more, which agrees with my earlier assertions from the Possession Percentage to GF plot. The squad will have to increase its xG in matches where it has 50% or less possession to find success against stronger competition that plays an attacking, possession-based style.
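Because the two groups contain a different number of matches, the comparison is clearer on a per-match basis. This sketch uses only the cumulative figures quoted above; the variable names are my own.

```python
# Cumulative xG and match counts quoted in the article.
low_poss_xg, low_poss_matches = 2.9, 4     # 50% possession or less
high_poss_xg, high_poss_matches = 15.7, 7  # more than 50% possession

low_per_match = low_poss_xg / low_poss_matches
high_per_match = high_poss_xg / high_poss_matches
ratio = high_per_match / low_per_match

print(f"xG per match, 50% or less: {low_per_match:.2f}")
print(f"xG per match, over 50%:    {high_per_match:.2f}")
print(f"Ratio: {ratio:.1f}x")
```

Per match, that is roughly 0.7 xG when giving up the ball against roughly 2.2 xG when controlling it, about a threefold gap.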

Possession Percentage to xGA

Finally, the last plot for this overall analysis correlates possession percentage with xGA. It is expected that as the squad possesses the ball more often, it would be less likely to concede. But has this theory held true so far in the 2020/2021 season?

xGA to Poss

This plot reveals a moderate negative correlation that falls short of the conventional 0.05 threshold for statistical significance (r = -0.534, t = -1.893, p = 0.091). The direction is intuitive: if one possesses the ball more often, one is less likely to concede, and the xGA metrics lean toward that intuition. What stands out in this plot is that the highest xGA values come from the matches in which Chelsea had 50%, 47%, 57%, 53%, and 75% possession. With the exception of the 75% match, this suggests that Chelsea struggle most to limit xGA when possession is split roughly evenly with the opponent.

Again, this is consistent with previous analyses: Chelsea struggle to generate expected goals and to limit expected goals against when not controlling the possession of the match. This is why I consistently argue that developing a stronger counter-attack will be crucial for success later on in the season. This assertion has held up across several plots and corresponding analyses; therefore, I feel safe arguing it as an area for the squad to improve.
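With only 11 matches, a correlation has to be fairly large before it clears the usual 5% significance bar, which is why the possession to xGA result is best read as suggestive. The sketch below computes that bar under stated assumptions: a two-tailed test at the 5% level, 11 matches for every plot, and the standard table value of 2.262 for Student's t with 9 degrees of freedom. The labels and names are my own.

```python
import math

N_MATCHES = 11
DF = N_MATCHES - 2
# Two-tailed 5% critical value of Student's t for 9 degrees of freedom
# (a standard table value, hardcoded here as an assumption of the sketch).
T_CRIT = 2.262

def critical_r(t_crit, df):
    """Smallest |r| that reaches the given critical t value."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

R_CRIT = critical_r(T_CRIT, DF)

# Correlations reported in the article.
reported = {
    "xGA to GA": 0.682,
    "xG to GF": 0.823,
    "Possession to GA": -0.007,
    "Possession to GF": 0.787,
    "Possession to xG": 0.685,
    "Possession to xGA": -0.534,
}

# True where the reported |r| clears the 5% threshold.
significant = {name: abs(r) > R_CRIT for name, r in reported.items()}

print(f"Critical |r| for n = {N_MATCHES}: {R_CRIT:.3f}")
for name, is_sig in significant.items():
    print(f"{name}: r = {reported[name]:+.3f}, significant at 5%: {is_sig}")
```

Under these assumptions the threshold is about |r| = 0.60, so four of the six reported correlations clear it, while the two involving goals or expected goals against and possession do not.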

Written by Travis Flock @crossroads_cfc

Edited by Jai Mcintosh @jjmcintosh5
