In the 1980 championship, UEFA introduced a group stage to the finals, with 8 teams split into 2 groups; the group winners qualified directly for the final and the runners-up played for third place. In 1984, UEFA introduced a semi-final round. This model continued until 1996, when the number of teams doubled and 16 teams were split into 4 groups. Furthermore, points for a win increased from 2 to 3. This model is still in use, but in France in 2016, 24 teams will enter the tournament in 6 groups, which means that the number of teams qualifying for the knockout phase will not be the same from every group.
A group of 4 teams is decided after only 6 games, with each team playing 3 matches, one against each of the other group members. The number of possible outcomes is therefore limited to 3^6 = 729. But in which of these outcomes does goal difference play a role? Whenever 2 (or 3) of the 3 teams with the most points in the group are level on points, and the points obtained in the matches played between the teams in question are also level, goal difference becomes important.
So to answer the question firmly, we would need one analysis for each of the models above. However, I will only analyze the current model, that is, the model introduced in 1996, which will be outdated by France 2016. I will simplify the analysis by assuming that all teams are perfectly matched, in the sense that the probability of each of 1, X and 2 in every match is 1/3. Furthermore, I will only investigate when goal difference matters for qualification to the knockout stage; that is, I will only look at whether goal difference decides who finishes in the top two of the group. The order of the top two teams is ignored (of course the order does matter, because a group winner won't be paired with another group winner in the first round of the knockout stage).
In the table below I have listed all final group standings (on points) in which goal difference can decide who finishes in the top two of the group.
Standing | Frequency | Occurrences |
9-4-4-0 | 12 | |
9-3-3-3 | 8 | |
9-2-2-2 | 4 | |
7-4-4-1 | 12 | 1996 grp A |
6-6-6-0 | 8 | |
6-4-4-3 | 36 | 2004 grp A |
5-5-5-0 | 4 | 2004 grp C |
5-3-3-2 | 12 | |
4-4-4-4 | 6 | |
4-4-4-3 | 8 | |
3-3-3-3 | 1 | |
The frequencies sum to 111, so under the assumption of perfectly matched teams, the probability of a final group standing in which goal difference is important is 111/729 = 0.1523. That is, in roughly one out of every 7 groups, goal difference will decide who qualifies for the knockout stage of the European Championship, if all teams are perfectly matched.
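The total of 111 can be checked by brute force, since there are only 729 outcomes to enumerate. The following is a minimal Python sketch, not the method used to build the table. It assumes my reading of the rule stated earlier: goal difference matters when the second and third placed teams are level on points and all teams on that points total also have equal points in the matches among themselves. The names `points` and `goal_difference_decides` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from itertools import combinations, product

TEAMS = (0, 1, 2, 3)
FIXTURES = list(combinations(TEAMS, 2))  # the 6 matches of a 4-team group


def points(results, teams):
    """Points per team, counting only the matches played between `teams`."""
    pts = {t: 0 for t in teams}
    for (home, away), result in zip(FIXTURES, results):
        if home in pts and away in pts:
            if result == "1":
                pts[home] += 3
            elif result == "2":
                pts[away] += 3
            else:  # "X", a draw
                pts[home] += 1
                pts[away] += 1
    return pts


def goal_difference_decides(results):
    """Assumed criterion: a points tie across the 2nd/3rd place boundary
    that head-to-head points cannot break."""
    pts = points(results, TEAMS)
    ranked = sorted(pts.values(), reverse=True)
    if ranked[1] > ranked[2]:
        return False  # second and third place are separated on points
    tied = [t for t in TEAMS if pts[t] == ranked[1]]
    head_to_head = points(results, tied)
    return len(set(head_to_head.values())) == 1


counts = Counter()
for results in product("1X2", repeat=len(FIXTURES)):
    if goal_difference_decides(results):
        ranked = sorted(points(results, TEAMS).values(), reverse=True)
        counts["-".join(str(p) for p in ranked)] += 1

for standing, frequency in sorted(counts.items(), reverse=True):
    print(standing, frequency)
print("total:", sum(counts.values()), "of", 3 ** len(FIXTURES))
```

Under this criterion the enumeration gives 11 standings with a total frequency of 111 out of 729. A different reading of the tie-break rule could of course give different counts.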
How does this agree with the empirical statistics so far? Since 1996, 16 groups have been played to completion, and 3 times (see the table above) goal difference has been decisive for second place in the group. That is, the empirical frequency (3/16 = 0.1875) is quite close to the theoretical probability under the assumption of perfectly matched teams (0.1523), which could perhaps indicate that this assumption is a reasonable simplification for this particular problem.
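To put "quite close" into numbers, one can treat the 16 groups as independent trials, each with probability 111/729. This is an extra assumption on top of the perfectly matched teams, and the sketch below is only a rough sanity check, not a formal test. The helper `binom_pmf` is a hypothetical name introduced for illustration.

```python
from math import comb

p = 111 / 729   # theoretical probability from the table
n, k = 16, 3    # groups played since 1996, and groups decided by goal difference


def binom_pmf(j, n, p):
    """Probability of exactly j 'goal difference' groups out of n (binomial model)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)


expected = n * p
prob_exactly_k = binom_pmf(k, n, p)
prob_at_least_k = sum(binom_pmf(j, n, p) for j in range(k, n + 1))

print(f"expected groups: {expected:.2f}")
print(f"P(exactly {k}) = {prob_exactly_k:.3f}")
print(f"P(at least {k}) = {prob_at_least_k:.3f}")
```

Under these assumptions about 2.4 of the 16 groups would be expected to hinge on goal difference, so 3 observed groups is in line with the model.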
Tonight, group A of this year's European Championship will be decided. There are 9 different outcomes of the 2 remaining games, and the only outcome that will make goal difference important is Greece-Russia 1 and Czech Republic-Poland X. The final standing would then be 4-4-4-3, where Poland has drawn against all opponents and Russia, Greece and the Czech Republic have won one match each.
On Sunday, group B will be decided, and two outcomes will make goal difference important: Portugal-Netherlands 1 and Denmark-Germany 1 (6-6-6-0), or Portugal-Netherlands 2 and Denmark-Germany 2 (9-3-3-3).
On Monday, group C will be decided, and the only outcome that will make goal difference important is Croatia-Spain X and Italy-Republic of Ireland 1 (5-5-5-0).
On Tuesday, group D will be decided, and the only outcome that will make goal difference important is England-Ukraine 2 and Sweden-France 1 (6-4-4-3).
The source for the Euro results is www.uefa.com. The frequencies in the table should be recalculated before they are relied on in any analysis of importance.
After the last 8 group matches, it turned out that goal difference was irrelevant for qualification in all 4 groups. That is, the empirical statistic is now 3/20 = 0.15.
As mentioned in the post, I wasn't completely sure about the frequencies in the table. Therefore, I have written a VBA program (in Excel) to simulate a group of 4 teams, with one game against each opponent. The result is quite satisfying.
I simulated the group fixtures 100,000 times and repeated the experiment 6 times. Here is the simulation with the overall count closest to the average:
Simulation 5 --- Count --- Count*729/100000
9-4-4-0 --- 1615 --- 11.77335
9-3-3-3 --- 1187 --- 8.65323
9-2-2-2 --- 521 --- 3.79809
7-4-4-1 --- 1570 --- 11.4453
6-6-6-0 --- 1143 --- 8.33247
6-4-4-3 --- 5004 --- 36.47916
5-5-5-0 --- 578 --- 4.21362
5-3-3-2 --- 1603 --- 11.68587
4-4-4-4 --- 805 --- 5.86845
4-4-4-3 --- 1071 --- 7.80759
3-3-3-3 --- 138 --- 1.00602
Sum(Count)/100000 --- 0.15235
The average of the standing counts, normalized to base 729, can be interpreted as a numerical confirmation of the frequencies in the table, for example after repeating the simulation 100 times.
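For readers without Excel, a simulation along the same lines can be sketched in Python. This is not the VBA program mentioned above, only an illustration of the idea. It assumes that each match ends 1, X or 2 with probability 1/3, and that goal difference matters when the teams level on points with the second placed team (with third place on the same total) also have equal head-to-head points. The names `simulate_group` and `run`, and the seed value, are arbitrary choices for this sketch.

```python
import random
from collections import Counter
from itertools import combinations

FIXTURES = list(combinations(range(4), 2))  # the 6 matches of a 4-team group


def award(pts, a, b, result):
    """Add the points for one match: 3 for a win, 1 each for a draw."""
    if result == "1":
        pts[a] += 3
    elif result == "2":
        pts[b] += 3
    else:
        pts[a] += 1
        pts[b] += 1


def simulate_group(rng):
    """Play one random group; return (standing, goal_difference_needed)."""
    results = [rng.choice("1X2") for _ in FIXTURES]
    pts = {t: 0 for t in range(4)}
    for (a, b), r in zip(FIXTURES, results):
        award(pts, a, b, r)
    ranked = sorted(pts.values(), reverse=True)
    standing = "-".join(str(p) for p in ranked)
    if ranked[1] > ranked[2]:
        return standing, False  # second and third separated on points
    # Head-to-head points among the teams level with second place.
    h2h = {t: 0 for t in pts if pts[t] == ranked[1]}
    for (a, b), r in zip(FIXTURES, results):
        if a in h2h and b in h2h:
            award(h2h, a, b, r)
    return standing, len(set(h2h.values())) == 1


def run(trials=100_000, seed=2012):
    """Count, per standing, the simulated groups that need goal difference."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        standing, needs_gd = simulate_group(rng)
        if needs_gd:
            counts[standing] += 1
    return counts


if __name__ == "__main__":
    trials = 100_000
    counts = run(trials)
    for standing, count in sorted(counts.items(), reverse=True):
        print(standing, count, round(count * 729 / trials, 2))
    print("share:", sum(counts.values()) / trials)
```

With enough trials the normalized counts should settle near the frequencies in the table and the overall share near 111/729, although any single run will deviate slightly, as the figures above do.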