In Search of the Greatest World Cup Goal Scorer – Part II: Adjusting for World Cup Goal Average

As mentioned in the previous post, goals scored are not entirely comparable. While today’s players get more games, the games played in the early years of the World Cup had more goals. So, in a way it was easier for Just Fontaine to rack up 13 goals in just 6 games, when in 1958 there was an average of 3.6 goals/game. Here is a quick visualization of the World Cup Goal Average (WGA) over the years:

Figure: World Cup Goal Average (WGA) per tournament

As we can see from the graph, there were a lot of goals before 1960, and since then there has been a small but steady decline. Maybe things are pointing up again after Brazil 2014, but I doubt it. The solid line is a LOESS trend line, which should smooth out most of the random variation. As we can see, the huge outlier of 1954 is treated as such, while most values are scattered closely around the line. Of note is also that, seen this way, the 1990 World Cup really looks like an abnormality. Whether this is actually the case is a good point for discussion: prior to 1992 a goalie could handle a back pass from his own players, which allowed a defense to manage a lead much better. But also, the weather in 1994 was a lot more demanding, which often leads to more goals. So, let’s see which is better, the raw data or the smoothed line.
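For readers who want to experiment with the smoothing themselves, here is a minimal pure-Python sketch of a LOESS-style fit (local linear regression with tricube weights). The post does not say which implementation or span produced the trend line, so the function name `loess` and the `span=0.5` setting are assumptions, and the output will not exactly match the smoothed values quoted in this post.

```python
# Raw World Cup goal averages (goals per game), as listed in this post.
YEARS = [1930, 1934, 1938, 1950, 1954, 1958, 1962, 1966, 1970, 1974,
         1978, 1982, 1986, 1990, 1994, 1998, 2002, 2006, 2010, 2014]
RAW_WGA = [3.889, 4.118, 4.667, 4.000, 5.385, 3.600, 2.781, 2.781, 2.969, 2.553,
           2.684, 2.808, 2.538, 2.212, 2.712, 2.672, 2.516, 2.297, 2.266, 2.672]


def loess(xs, ys, span=0.5):
    """Local linear regression with tricube weights, evaluated at every x.

    `span` is the fraction of points used in each local fit; 0.5 is an
    assumed value, not necessarily the one behind the post's trend line.
    """
    n = len(xs)
    k = max(2, min(n, round(span * n)))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        # Weighted least-squares line; fall back to the weighted mean
        # if all the weight sits on a single x value.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


smoothed = loess(YEARS, RAW_WGA)
for year, raw, smooth in zip(YEARS, RAW_WGA, smoothed):
    print(f"{year}  raw={raw:.3f}  loess={smooth:.3f}")
```

A larger span gives a flatter line that ignores outliers such as 1954 even more strongly, while a smaller span follows the raw values more closely.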

So, how do we adjust now? Well, first of all we need a reference point to compare the value of a goal from, say, 1950 to one from, say, 2010. To me it makes most sense to put everything in reference to the current (i.e. most recent) standing, and that would be the 2014 World Cup. Let’s stick with the raw data at first, which gives us a reference WGA of 2.67 goals/game. Now goals scored in tournaments with a higher WGA should count less, while goals scored in tournaments with a lower WGA should count more. To achieve this, we divide the reference WGA, WGAref, by the individual tournament WGAs, WGAt, and compute the tournament goal values as

GVt = WGAref/WGAt.
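As a quick sketch of this formula in code: the raw averages are the ones listed in this post, while the names `RAW_WGA` and `goal_values` are just illustrative choices.

```python
# Raw World Cup goal averages (goals per game), as listed in this post.
RAW_WGA = {
    1930: 3.889, 1934: 4.118, 1938: 4.667, 1950: 4.000, 1954: 5.385,
    1958: 3.600, 1962: 2.781, 1966: 2.781, 1970: 2.969, 1974: 2.553,
    1978: 2.684, 1982: 2.808, 1986: 2.538, 1990: 2.212, 1994: 2.712,
    1998: 2.672, 2002: 2.516, 2006: 2.297, 2010: 2.266, 2014: 2.672,
}


def goal_values(wga_by_year, reference_year=2014):
    """Return GV_t = WGA_ref / WGA_t for every tournament year."""
    wga_ref = wga_by_year[reference_year]
    return {year: wga_ref / wga for year, wga in wga_by_year.items()}


gv = goal_values(RAW_WGA)
print(round(gv[1958], 3))  # 0.742
print(round(gv[1990], 3))  # 1.208
```

Passing a different `reference_year` rescales every goal value to that tournament instead of 2014.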

Here is the full list of these goal values (based on raw data as well as LOESS smoothing) with reference 2014 (i.e. a goal at the 2014 World Cup has value 1.00):

Year  Raw WGA  Raw GV  LOESS WGA  LOESS GV
1930  3.889    0.687   4.049      0.618
1934  4.118    0.649   4.118      0.608
1938  4.667    0.573   4.156      0.602
1950  4.000    0.668   3.999      0.626
1954  5.385    0.496   3.724      0.672
1958  3.600    0.742   3.400      0.736
1962  2.781    0.961   3.107      0.806
1966  2.781    0.961   2.895      0.865
1970  2.969    0.900   2.768      0.904
1974  2.553    1.047   2.741      0.913
1978  2.684    0.995   2.695      0.929
1982  2.808    0.952   2.626      0.953
1986  2.538    1.053   2.565      0.976
1990  2.212    1.208   2.545      0.984
1994  2.712    0.985   2.547      0.983
1998  2.672    1.000   2.566      0.976
2002  2.516    1.062   2.494      1.004
2006  2.297    1.163   2.421      1.034
2010  2.266    1.179   2.454      1.020
2014  2.672    1.000   2.503      1.000

Now, how do we use these GVs? Well, let’s look at Miroslav Klose as an example. In 2002 and 2006 he scored 5 goals each, in 2010 it was 4 goals, and in 2014 he added 2 more. So, his time-adjusted goal tally using raw data is

Gadj= 5*1.062 + 5*1.163 + 4*1.179 + 2*1.000 = 17.844.
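The same sum can be sketched in a few lines of code. The goal values are the rounded raw GVs listed in this post; the function name `adjusted_goals` is only illustrative. With the rounded values the sum comes out at about 17.841, marginally below the figure quoted above, presumably because the quoted figure was computed from unrounded goal values (that is my assumption, the post does not say).

```python
# Rounded raw goal values (reference: 2014) for Klose's four tournaments.
RAW_GV = {2002: 1.062, 2006: 1.163, 2010: 1.179, 2014: 1.000}

# Klose's goals per World Cup, as given in the text.
KLOSE_GOALS = {2002: 5, 2006: 5, 2010: 4, 2014: 2}


def adjusted_goals(goals_by_year, gv_by_year):
    """Weight each tournament's goals by that tournament's goal value."""
    return sum(goals * gv_by_year[year] for year, goals in goals_by_year.items())


klose_adj = adjusted_goals(KLOSE_GOALS, RAW_GV)
print(f"{klose_adj:.3f}")  # 17.841 with the rounded goal values
```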

So, his 16 goals in 4 tournaments from 2002 to 2014 are the equivalent of 17.844 goals in 2014. Given that Klose scored most of his goals in tournaments with a low WGA, it was obvious that his score would get slightly inflated. We can of course do the same for the smoothed WGAs and arrive at a time-adjusted smoothed goal tally of 16.268 for Klose. With this method it is of course hard to see how Klose could be unseated from his top spot in the overall scorer standings:

Rank  Team  Name               Goals  Gadj    Gadj,LOESS  Rank (LOESS)
1     GER   Miroslav Klose     16     17.844  16.268      1
2     BRA   Ronaldo            15     15.987  15.033      2
3     GER   Gerd Müller        14     13.187  12.698      3
4     GER   Jürgen Klinsmann   11     11.551  10.791      4
5     ENG   Gary Lineker       10     11.148  9.790       7
6     GER   Thomas Müller      10     10.897  10.100      5
7     ESP   David Villa        9      10.386  9.202       11
8     POL   Grzegorz Lato      10     10.269  9.205       10
9     ARG   Gabriel Batistuta  10     10.004  9.812       6
10    BRA   Pelé               12     9.974   9.706       8
11    FRA   Just Fontaine      13     9.648   9.571       9

OK, that surely did not improve things compared to the unadjusted table. Adjusting for goal value demotes the players from yesteryear and promotes players that played more recently – especially in 1990. Fontaine’s incredible 13 goal tally is now worth a lot less, while Lineker and Klinsmann get quite the boost. If we use the smoothed version, the Top 4 remain the same, but Pelé and Fontaine do not drop as much. Kudos to Gerd Müller who as a more senior player steadfastly remains in the Top 3.

So, how about additionally adjusting for games played? Dividing the adjusted goal tally by the number of games played gives us the player’s adjusted goal average, GAadj:

Rank  Team  Name                 GA     GAadj  GAadj,LOESS  Rank (LOESS)
1     RUS   Oleg Salenko         2.000  1.971  1.965        1
2     FRA   Just Fontaine        2.167  1.608  1.595        2
3     POR   Eusébio              1.500  1.441  1.297        5
4     ARG   Guillermo Stábile    2.000  1.374  1.236        6
5     CZE   Tomáš Skuhravý       1.000  1.208  0.984        9
6     COL   James Rodriguez      1.200  1.200  1.200        7
7     HUN   Sándor Kocsis        2.200  1.092  1.479        3
8     ITA   Salvatore Schillaci  0.857  1.036  0.843        12
9     ITA   Christian Vieri      1.000  1.028  0.988        8
10    GER   Gerd Müller          1.077  1.014  0.977        10
11    SUI   Josef Hügi           2.000  0.992  1.344        4
12    BRA   Leônidas             1.600  0.931  0.965        11
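The adjusted goal averages in this table can be reproduced, up to rounding, by dividing the GV-weighted goal tally by the number of games played. The sketch below uses the rounded raw goal values from this post; the function name `adjusted_goal_average` is illustrative, and Skuhravý’s 5 games are inferred from his 5 goals and his GA of 1.000.

```python
# Rounded raw goal values (reference: 2014) for the tournaments used below.
RAW_GV = {1958: 0.742, 1990: 1.208, 1994: 0.985}


def adjusted_goal_average(goals_by_year, games, gv_by_year):
    """GV-weighted goal tally divided by the number of games played."""
    adjusted = sum(goals * gv_by_year[year] for year, goals in goals_by_year.items())
    return adjusted / games


# Goals and games per player; Skuhravý's game count is inferred (see above).
salenko = adjusted_goal_average({1994: 6}, 3, RAW_GV)
fontaine = adjusted_goal_average({1958: 13}, 6, RAW_GV)
skuhravy = adjusted_goal_average({1990: 5}, 5, RAW_GV)

for name, value in [("Salenko", salenko), ("Fontaine", fontaine), ("Skuhravý", skuhravy)]:
    print(f"{name}: {value:.3f}")
```

Small differences in the last digit against the table (for example 1.970 versus 1.971 for Salenko) come from using rounded goal values here.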

Hmmmmmmmm … There are elements of this list that I like (Fontaine, Eusébio and Stábile high up, a good mixture of older and newer players), but two things totally discredit it for me. First, the Bomber (Gerd Müller) is only in 10th place in both rankings. When it comes to pure goal scoring ability, I think one would be hard pressed to find any better player. So, a #10 ranking just does not look right. And then of course there is the player on top! Thinking about it, it was obvious that Oleg Salenko would rank very high: a high GA in only a single World Cup, which also has a relatively high GV. And while I could see an argument for Salenko having the best single-game goal scoring achievement with his 5-goal game against Cameroon (if we forget the shambolic defense and poor motivation of the Africans), I truly have a hard time crowning Salenko as the best World Cup goal scorer of all time. In addition to running up the score on an inferior opponent, these goals also came in a game with nothing to play for.

There are also smaller things that I do not like. In particular, Skuhravý is ranked higher than Schillaci. Both played in the same tournament (Italia 1990), but Skuhravý’s goals came in two games: 2 in a 5-1 rout of the USA and 3 in a 4-1 rout of Costa Rica in the second round. Compare that to Schillaci, who scored in 6 different games; 4 of these goals were game winners and 1 was the go-ahead goal in the semi-final! In every regard, Schillaci’s performance was more impressive.

Now, comparing the GAs based on raw data and LOESS smoothing, I do like the raw data better as it mixes the different World Cup periods slightly better. But again, overall neither adjusted list satisfies me.

Also, the problem of some players racking up goals against weak opponents while others consistently score important goals is present in the previous ranking of adjusted total goals as well. And while it is easy to knock Salenko, the same argument can be applied to a certified super striker: Gabriel Batistuta. 10 goals surely look impressive, but 3 of these came against a weak Greece team in 1994 and 3 more were add-on goals against a similarly out-of-sorts Jamaica. That leaves 4 more goals, two of which were penalties in the second round. Batigol was one of my favorite players of the late 90s, but this is not very impressive. Compare this to Eusébio’s 9 goals: Portugal had to play in the group of death with Brazil, Hungary and Bulgaria, Eusébio had to bring his team back from a 0-3 deficit against the pesky North Koreans, and he then led Portugal to a third-place finish with 2 more goals …

So, while I still think that adjusting for tournament GV is a good and necessary step, there is still some way to go to find a satisfying list taking care of the problems inherent in both of the above. So, to improve the listing, I come to three conclusions:

  1. We have to consider not only the fact that a goal was scored, but also how important that goal was. Scoring the third to fifth goals in a 5-0 rout is a nice feat, but all these goals were not important in securing the win. Conversely, scoring three game winning goals in three consecutive games is a highly impressive streak. Also, scoring the game winner in the World Cup final is more important/valuable than doing so in the group stage.
  2. Relative performance is a nice measure, but I do start to prefer the absolute performance. While on a very small scale, the Skuhravý vs. Schillaci comparison highlights the shortcomings of relative performance. If we also take importance into account, I think a player scoring many important goals over a larger number of games should be considered a greater goal scorer than one scoring two game winners in two games.
  3. The smoothed curve was a nice idea, but the raw data has so far led to more satisfying results. I will keep comparing the two adjustment methods, but as of now: advantage raw data.

Well, the journey has just begun and I hope to get you deeper into World Cup goal scoring history and some statistics in the next post.

In Search of the Greatest World Cup Goal Scorer – Part I: Using Raw Numbers

One of the most celebrated individual statistics at the World Cup is the scoring leader and, by extension, the overall scoring leader. And this overall scoring leader is what we want to look at here, although by way of getting there, we also have to look at the top scorer at each World Cup. So, let’s look at the all-time scoring list:

Rank  Team  Name               Goals  Games  Avg.  Tournaments
1     GER   Miroslav Klose     16     24     0.67  2002, 2006, 2010, 2014
2     BRA   Ronaldo            15     19     0.79  1994, 1998, 2002, 2006
3     GER   Gerd Müller        14     13     1.08  1970, 1974
4     FRA   Just Fontaine      13     6      2.17  1958
5     BRA   Pelé               12     14     0.86  1958, 1962, 1966, 1970
6     HUN   Sándor Kocsis      11     5      2.20  1954
6     GER   Jürgen Klinsmann   11     17     0.65  1990, 1994, 1998
8     GER   Helmut Rahn        10     10     1.00  1954, 1958
8     ENG   Gary Lineker       10     12     0.83  1986, 1990
8     ARG   Gabriel Batistuta  10     12     0.83  1994, 1998, 2002
8     PER   Teófilo Cubillas   10     13     0.77  1970, 1978, 1982
8     GER   Thomas Müller      10     13     0.77  2010, 2014
8     POL   Grzegorz Lato      10     20     0.50  1974, 1978, 1982

End of discussion? Miroslav Klose is the leading scorer of all time! Not so fast! This list is merely a starting point for me. 🙂 But before delving into further analysis, let’s just applaud Germany for having five scorers in the top 10 (which is actually a top 13 due to the tie in 8th). Brazil has two all-time greats high in the list, but not the density below. Also surprising is the fact that despite many big names, there really is no Italian listed here (Christian Vieri, Paolo Rossi and Roberto Baggio all have 9 goals). But then, you don’t think about goal scoring prowess when you think about Italy, do you?

Now, what is wrong with that list? As we will see, quite a few things. The most obvious one is that you are of course bound to feature high on the list if you play a lot of games at World Cups. And while I do not want to discredit Klose, his tally is mostly due to longevity. The same goes for Klinsmann and Lato. Compared to G. Müller, Fontaine and Kocsis, this seems to be a slightly unfair listing. And longevity is not the only factor here: due to the increase in teams at the World Cup, more games are played in general. The maximum number of games in 1954 was 6 if you had a qualifying playoff, but Hungary only needed 5 as clear winners of their group (more on that odd format here). From 1958 to 1970, the finalists all played 6 games, and only since then have finalists played 7 games. So, obviously a better measure would be a relative performance measure such as goals per game. Let’s see:

Rank  Team  Name               Goals  Games  Avg.  Tournaments
1     HUN   Sándor Kocsis      11     5      2.20  1954
2     FRA   Just Fontaine      13     6      2.17  1958
3     ARG   Guillermo Stábile  8      4      2.00  1930
3     SUI   Josef Hügi         6      3      2.00  1954
3     RUS   Oleg Salenko       6      3      2.00  1994
6     BRA   Leônidas           8      5      1.60  1934, 1938
7     POR   Eusébio            9      6      1.50  1966
8     BRA   Ademir             8      6      1.33  1950
9     URU   Pedro Cea          5      4      1.25  1930
9     ITA   Silvio Piola       5      4      1.25  1938
9     HUN   Gyula Zsengellér   5      4      1.25  1938

Disclaimer first, before we discuss this listing: only players with at least 5 overall goals and at least 3 games are considered. Why? Because I feel that to be considered the greatest World Cup goal scorer, you should both score a sufficient number of goals and have played in a few games. 5 and 3, respectively, seem good thresholds. Otherwise, just look at FIFA’s listing and you will see plenty of players from the 1930s with only one or two games showing up. This, in my opinion, taints the goal average.
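As a small sketch of how such a filtered goals-per-game ranking can be built: the goal and game counts below are taken from the two lists in this post, the last entry is a made-up player added only to show the thresholds at work, and the function name `goals_per_game_ranking` is illustrative.

```python
# (name, goals, games); the last entry is hypothetical, for illustration only.
PLAYERS = [
    ("Miroslav Klose", 16, 24),
    ("Gerd Müller", 14, 13),
    ("Just Fontaine", 13, 6),
    ("Sándor Kocsis", 11, 5),
    ("Oleg Salenko", 6, 3),
    ("Hypothetical one-game scorer", 4, 1),
]


def goals_per_game_ranking(players, min_goals=5, min_games=3):
    """Keep players who meet both thresholds, sorted by goals per game."""
    eligible = [
        (name, goals / games)
        for name, goals, games in players
        if goals >= min_goals and games >= min_games
    ]
    return sorted(eligible, key=lambda row: row[1], reverse=True)


ranking = goals_per_game_ranking(PLAYERS)
for name, average in ranking:
    print(f"{name}: {average:.2f}")
```

Without the thresholds, the hypothetical player with 4 goals in a single game would top the list with an average of 4.00, which is exactly the distortion the cut-offs are meant to avoid.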

What do we observe? Do we recognize any of these? Well, as a World Cup nut, none of these names is foreign to me, but looking at when these players achieved these scores, there is a clear bias towards the early days of the World Cup. The most recent one is Oleg Salenko, who achieved his incredible tally mostly in a 5-goal rout of a desolate Cameroon team in 1994. The other players mostly benefited from the fact that in the early days of the World Cup more goals were scored.

So, both are valid lists, but neither is perfect. The list of total goals scored hands an advantage to players who play more games and thus played more recently. The other not only benefits players who score a lot of goals in just a handful of games, but also hands an advantage to players from the early days. So, we need to find a balance, and that is what I will try to do over the next few posts.

All the footballers in this post are worth checking out and I have provided Wikipedia links. Especially the ones from the first 10 (or so) World Cups often come with great stories attached to them. One of my favorites is that the top scorer of the first World Cup in Uruguay, Guillermo Stábile, played for Argentina only at that tournament: never before, never after! And he scored in every single one of his appearances.