World Cup Last 16: Prediction via Simulation

-waller-

Sep 12, 2017, 4:10 PM

Hi all,

So, a slightly different post for this blog. There's no chess analysis here! Instead, I'm attempting to dive into the world of chess prediction/modelling by coming up with some predictions for the World Cup. It's going to be pretty basic to start with; I'm hoping to build on this/modify it gradually for future tournaments, hopefully incorporating more information and becoming more accurate whilst still coming up with some fun predictions!

The World Cup 2017

As of the time of writing, there are 16 players still in contention in Tbilisi (ordered by rating):

2804 Maxime Vachier-Lagrave
2802 Levon Aronian
2792 Wesley So
2788 Alexander Grischuk
2777 Anish Giri
2771 Ding Liren
2756 Peter Svidler
2731 Vladimir Fedoseev
2727 Vassily Ivanchuk
2714 Bu Xiangzhi
2702 Baadur Jobava
2701 Wang Hao
2695 Maxim Rodshtein
2694 Evgeny Najer
2675 Richard Rapport
2666 Daniil Dubov

That's a nice number to get started with. I'm going to make percentage predictions for each of the remaining players to get through each of the 4 remaining rounds: Last 16, Quarter-Finals, Semi-Finals and Final. The tournament structure is also quite nice: a knockout based on mini-matches of 2 games, followed by a tiebreak if needed. This should lead to some interesting interactions between the stage of the competition and a player's chances of reaching it, e.g. a player may have a greater chance of reaching the final if his side of the draw is easier!

Methodology

For this initial foray, I'm going to build a simulation model, actually simulating the results of the games played in the classical portion. My intention is to use 3 variables for each player:

Elo rating (from September 2017 list)
Draw % as White (historical from chesstempo database)
Draw % as Black (same)

This is pretty much a minimal set of information for simulating game results. A theoretical expected score can be calculated from the rating difference between two players, but an expected score alone does not pin down the separate win, draw and loss probabilities for a single game!
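As a rough sketch of that first step, here is the standard logistic Elo formula in Python. I'm assuming this common form (with the usual 400-point scale) purely for illustration; the function name is made up for this example, and other variants, such as FIDE's lookup table, give slightly different numbers.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B.

    Assumption: the standard logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings taken from the list above: Vachier-Lagrave (2804) vs Grischuk (2788).
mvl = elo_expected_score(2804, 2788)
grischuk = elo_expected_score(2788, 2804)
print(f"Vachier-Lagrave: {mvl:.3f}, Grischuk: {grischuk:.3f}")
```

The two expected scores always add up to 1, which is exactly why a single number can't tell us how the points are split between wins and draws.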

By way of example, suppose a player has an expected score of 75% against his opponent. This might mean that he has a 75% chance of winning and a 25% chance of losing. No draws! Or it could mean a 50% chance of winning, a 50% chance of drawing, and a 0% chance of losing. Very different! Hence, some measure of drawishness is needed.
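The arithmetic behind that example is simple enough to check directly. This little sketch (the helper name is just for illustration) counts a win as 1 point and a draw as half a point:

```python
def expected_score(p_win: float, p_draw: float, p_loss: float) -> float:
    """Expected score of a single game: 1 for a win, 0.5 for a draw, 0 for a loss."""
    assert abs(p_win + p_draw + p_loss - 1.0) < 1e-9, "probabilities must sum to 1"
    return p_win + 0.5 * p_draw


# The two scenarios from the paragraph above.
no_draws = expected_score(p_win=0.75, p_draw=0.0, p_loss=0.25)
never_loses = expected_score(p_win=0.50, p_draw=0.50, p_loss=0.0)
print(no_draws, never_loses)
```

Both scenarios come out at the same expected score even though the games they describe look nothing alike.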

To simulate a game, I averaged the White draw % of the White player and the Black draw % of the Black player to get a draw probability. Then, using that, the Elo expected score formula and the ratings of the two players, I calculated the corresponding "White wins" and "Black wins" probabilities.
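Here is a minimal sketch of that calculation. It assumes the standard logistic Elo formula, and the draw rates in the example call are made-up placeholders rather than real chesstempo figures. The cap on the draw probability is also my own addition, to stop either win probability going negative when the rating gap is large.

```python
def game_probabilities(white_rating, black_rating, white_draw_rate, black_draw_rate):
    """Return (P(White wins), P(draw), P(Black wins)) for a single game.

    white_draw_rate: White player's historical draw fraction with White.
    black_draw_rate: Black player's historical draw fraction with Black.
    """
    # Draw probability: plain average of the two historical draw rates.
    p_draw = (white_draw_rate + black_draw_rate) / 2.0

    # White's expected score from the standard logistic Elo formula (assumption).
    e_white = 1.0 / (1.0 + 10.0 ** ((black_rating - white_rating) / 400.0))

    # Assumption, not from the post: cap the draw probability so that
    # neither win probability can drop below zero.
    p_draw = min(p_draw, 2.0 * e_white, 2.0 * (1.0 - e_white))

    # Expected score = P(win) + 0.5 * P(draw), so solve for the win probabilities.
    p_white = e_white - p_draw / 2.0
    p_black = (1.0 - e_white) - p_draw / 2.0
    return p_white, p_draw, p_black


# Ratings from the list above; the draw rates 0.60 and 0.65 are hypothetical.
p_white, p_draw, p_black = game_probabilities(2804, 2788, 0.60, 0.65)
print(f"White wins {p_white:.3f}, draw {p_draw:.3f}, Black wins {p_black:.3f}")
```

By construction the three probabilities sum to 1 and reproduce White's Elo expected score.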

That's pretty much it! I then simulated the entire rest of the World Cup 100,000 times. (For tiebreaks, I simply used pure expected score - rapid/blitz is of course a different game in reality.) Converting the simulations into percentage chances, here are the results!
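To give a flavour of what such a simulation can look like, here is a cut-down sketch, not my actual script. Several things in it are assumptions for illustration only: the 4-player bracket is a toy (not the real draw), the draw rates are invented, the first-named player takes White in game 1, and a tied mini-match is settled by a single weighted coin flip using the Elo expected score.

```python
import random


def elo_expected(rating_a, rating_b):
    """Standard logistic Elo expected score for player A (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def play_game(white, black, rng):
    """Simulate one classical game; return White's score (1.0, 0.5 or 0.0)."""
    e_white = elo_expected(white["elo"], black["elo"])
    p_draw = (white["draw_w"] + black["draw_b"]) / 2.0
    p_draw = min(p_draw, 2.0 * e_white, 2.0 * (1.0 - e_white))  # keep probabilities valid
    p_white = e_white - p_draw / 2.0
    roll = rng.random()
    if roll < p_white:
        return 1.0
    if roll < p_white + p_draw:
        return 0.5
    return 0.0


def play_match(a, b, rng):
    """Two-game mini-match with colours swapped; tiebreak by pure expected score."""
    score_a = play_game(a, b, rng) + (1.0 - play_game(b, a, rng))
    if score_a > 1.0:
        return a
    if score_a < 1.0:
        return b
    return a if rng.random() < elo_expected(a["elo"], b["elo"]) else b


def simulate(bracket, n_sims, seed=0):
    """Return, per player, the fraction of simulations in which they won each round."""
    rng = random.Random(seed)
    n_rounds = len(bracket).bit_length() - 1  # bracket size must be a power of 2
    wins = {p["name"]: [0] * n_rounds for p in bracket}
    for _ in range(n_sims):
        alive = list(bracket)
        for rnd in range(n_rounds):
            # Neighbours in the list are paired; winners keep their bracket order.
            alive = [play_match(alive[i], alive[i + 1], rng)
                     for i in range(0, len(alive), 2)]
            for p in alive:
                wins[p["name"]][rnd] += 1
    return {name: [c / n_sims for c in counts] for name, counts in wins.items()}


# Toy bracket: real ratings from the list above, HYPOTHETICAL draw rates and draw.
toy_bracket = [
    {"name": "Vachier-Lagrave", "elo": 2804, "draw_w": 0.55, "draw_b": 0.60},
    {"name": "Grischuk", "elo": 2788, "draw_w": 0.55, "draw_b": 0.60},
    {"name": "Najer", "elo": 2694, "draw_w": 0.50, "draw_b": 0.55},
    {"name": "Rapport", "elo": 2675, "draw_w": 0.45, "draw_b": 0.50},
]

results = simulate(toy_bracket, n_sims=20_000)
for name, (p_round1, p_champion) in results.items():
    print(f"{name:16s} wins first match: {p_round1:.1%}  wins bracket: {p_champion:.1%}")
```

The same loop scales to the full 16-player bracket by listing the players in draw order; bumping `n_sims` up to 100,000 just makes the percentages less noisy.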

Results

Below is a table of the chances I calculated for each player, ordered by their chances of becoming World Cup Champion:

We can see that there are some pretty interesting variations thrown up by the knockout structure of the event. Take Rapport, for instance, who by virtue of his relatively easy pairing against Najer has a 45.2% chance to get through to the next round, and compare this to Grischuk, who has a similar 45.7% chance of progressing against Vachier-Lagrave. Yet Grischuk is over 7 times more likely to become champion overall!

Discussion

Comments on my approach are welcome, especially criticisms. This is really just a first attempt which I hacked out in an evening! Particularly, I think experimenting with the methodology for predicting draws and even collecting the data in a different way might be important. It's interesting to note that because of the minimatch structure, the different drawing percentages with White and Black won't have much of an effect here; but in a round-robin type event, it'll be interesting to quantify the advantage conferred by an extra White game/White against specific opponents etc.

Thanks for reading!
