
Who Plays What: a Statistical Analysis of Chess.com Data (part 1)
Hi all,
I am back today with a somewhat different post where I analyze which openings are played by whom, and with what success rate, on chess.com.
Like most avid chess players, I tend to notice patterns in openings while playing: "I feel like I am seeing more Sicilians now that my elo improved," or "the London is more popular now, isn't it?"
But since I know human beings are prone to dreaming up patterns in noisy data, I don't trust myself. And, since I love to waste time and happen to have some coding skills, I decided to test some of my hypotheses.
As you will see below, I found some interesting and logical patterns which, I hope, you will find entertaining if nothing else.
Data and Methodology
To get a representative sample of data, I used the chess.com API to download a total of roughly 1,200,000 games from 800,000 different players.
First, I downloaded the participant lists from the last 80 10|0 arena tournaments on chess.com. I chose this tournament format to maximize my chances of finding rapid players, who interest me more than blitz players.
Next, I downloaded the games played by these players over the last year. Since the downloading process was slow, I stopped it when the results became somewhat stable. For each game, I saved the elo of both players, the game result, and the ECO code, which allowed me to label the openings of each game.
Finally, I removed all the games played at blitz or faster time controls to keep only rapid games in my sample.
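For readers who want to try something similar, here is a minimal sketch of the "keep one record per rapid game" step. It is not the code used for this post. The field names (`time_class`, `white`/`black` with `rating` and `result`, and a `pgn` string containing an `[ECO "..."]` header) are assumptions modeled on the game objects of chess.com's published-data API, so check them against a real response first. The function name `extract_record` and the sample game are purely illustrative.

```python
import re

# Matches a PGN header such as [ECO "B20"]. Assumed header format.
ECO_TAG = re.compile(r'\[ECO "([A-E]\d{2})"\]')

def extract_record(game):
    """Return (white_elo, black_elo, result, eco) for a rapid game, else None."""
    if game.get("time_class") != "rapid":
        return None  # drop blitz, bullet, and any other time class
    match = ECO_TAG.search(game.get("pgn", ""))
    if match is None:
        return None  # no ECO header, so the opening cannot be labeled
    white, black = game["white"], game["black"]
    # Assumption: the winner's result is "win"; anything else on both sides is a draw.
    if white["result"] == "win":
        result = "white"
    elif black["result"] == "win":
        result = "black"
    else:
        result = "draw"
    return white["rating"], black["rating"], result, match.group(1)

# Hand-made example game (not real data).
sample = {
    "time_class": "rapid",
    "white": {"rating": 1200, "result": "win"},
    "black": {"rating": 1250, "result": "resigned"},
    "pgn": '[Event "Live Chess"]\n[ECO "B20"]\n\n1. e4 c5 1-0',
}
record = extract_record(sample)
```

Keeping the filter and the parsing in one small function makes it easy to run over every downloaded game and simply skip the `None` results.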
On the methodology side, I computed summary statistics, drew graphs, and ran some rigorous statistical tests. However, I decided to keep the more complex stuff to myself to make this post readable for a general audience.
For all the graphs below, keep in mind that, by construction, I have more data for mid-range elos (750 to 1500) than for very low or very high elos. So the results are less statistically significant (trustworthy) at these extreme elo values. The same can be said of rare openings, and even more so of rare openings at rare elos.
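To give a feel for why small samples are less trustworthy, one rough yardstick is the standard error of a proportion, sqrt(p(1-p)/n). The sketch below uses that textbook normal approximation. It is an illustration only, not necessarily the method used for this post, and the function name is made up.

```python
from math import sqrt

def proportion_with_error(successes, n):
    """Return (p, standard_error) for a proportion observed on n games."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)  # normal-approximation standard error
    return p, se

# Same 50% win rate, very different sample sizes (made-up numbers).
p_small, se_small = proportion_with_error(50, 100)
p_large, se_large = proportion_with_error(5000, 10000)
```

With 100 times more games the standard error shrinks by a factor of 10, which is why the curves are much steadier in the crowded mid-range elos.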
Results 1 - first moves
The first figure below shows the frequency (as a fraction, so 0.8 means 80%) of each first move as a function of the average elo of the two players.
As you can see, d4 is far less popular than e4 at all levels.
Interestingly, unprincipled openings like 1. b3 are popular at low levels but less so among experienced players.
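The frequencies discussed above boil down to grouping games by average elo and counting first moves inside each group. Here is a minimal sketch, assuming each game has already been reduced to a `(white_elo, black_elo, first_move)` tuple. The 100-point bin width and the function name are assumptions for illustration; the post does not state the binning that was actually used.

```python
from collections import Counter, defaultdict

def first_move_frequencies(games, bin_width=100):
    """Map each elo bucket to {first_move: share of games in that bucket}."""
    counts = defaultdict(Counter)
    for white_elo, black_elo, first_move in games:
        avg_elo = (white_elo + black_elo) / 2
        bucket = int(avg_elo // bin_width) * bin_width  # e.g. 1040 -> 1000
        counts[bucket][first_move] += 1
    freqs = {}
    for bucket, counter in counts.items():
        total = sum(counter.values())
        freqs[bucket] = {move: n / total for move, n in counter.items()}
    return freqs

# Made-up games, all landing in the 1000-1099 bucket.
games = [
    (1000, 1040, "e4"),
    (1010, 1050, "e4"),
    (1020, 1060, "d4"),
    (1000, 1000, "e4"),
]
freqs = first_move_frequencies(games)
```

The shares inside each bucket sum to 1, which matches the 0-to-1 scale of the figure.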
The next natural question is: what about the performance of these openings? To that end, the figures below show the percentage of wins, draws, and losses for each opening.
(Note that I only show the "success rate" if I have at least 100 observations for a given opening and elo range.)
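The 100-observation rule can be sketched as follows: tally results per opening and elo bucket, then drop every cell that is too thin. The input format `(avg_elo, first_move, result)`, the 100-point bins, and the function name are assumptions for illustration; only the minimum of 100 observations comes from the post.

```python
from collections import Counter, defaultdict

def success_rates(games, bin_width=100, min_games=100):
    """Return {(first_move, elo_bucket): {"white": .., "draw": .., "black": ..}}."""
    tallies = defaultdict(Counter)
    for avg_elo, first_move, result in games:
        bucket = int(avg_elo // bin_width) * bin_width
        tallies[(first_move, bucket)][result] += 1
    rates = {}
    for key, tally in tallies.items():
        n = sum(tally.values())
        if n < min_games:
            continue  # too few games: skip this opening/elo cell entirely
        rates[key] = {r: tally[r] / n for r in ("white", "draw", "black")}
    return rates

# Made-up sample: 100 games with 1. e4, only 5 with 1. b3.
games = (
    [(1000, "e4", "white")] * 60
    + [(1000, "e4", "draw")] * 10
    + [(1000, "e4", "black")] * 30
    + [(1000, "b3", "white")] * 5
)
rates = success_rates(games)
```

In this toy sample 1. b3 scores 100% for white, but on 5 games only, so it is left out rather than plotted as a misleading spike.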
One amusing pattern is the U-shape of the draw curves. I think this can be explained by low-level players not knowing how to mate, thus often finishing in stalemate, and high-level players falling for tactics less often, thus reaching more drawn endgames.
Mid-range elo players tend to win with white more often with d4 than with e4. I think this can be explained by frequency. Since d4 is seen less often, d4 players have a small comparative advantage when playing white. Presumably, this "overrates" them a bit, so they tend to lose more often with the black pieces (I tested this hypothesis with some fancy method and got a confirmation of my intuition).
The final figure below shows the "upset probability." This is the probability that the lower-rated player wins with white after a given first move. For example, if you see 200 on the x-axis, black was rated 200 elo points above white. The corresponding value on the y-axis is the probability that white wins nonetheless.
As one would expect, the probability of an upset diminishes as the elo difference grows, for all openings. Interestingly, the best opening for upsets with white is the English Opening. Presumably, this very sound but somewhat rare first move gives white a "comfort" advantage that helps him defeat his higher-rated opponent. Note that this is not true for unprincipled openings like the Bird or 1. b3, which often rely on superior tactical skills.
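The upset probability described above can be computed by keeping only the games where white is the lower-rated player, grouping them by rating gap, and taking the share of white wins. The sketch below assumes games stored as `(white_elo, black_elo, first_move, result)` with integer elos and uses 50-point gap bins; both the bin width and the function name are illustrative choices, not taken from the post.

```python
from collections import defaultdict

def upset_probability(games, bin_width=50):
    """Return {(first_move, gap_bucket): share of games won by the lower-rated white}."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for white_elo, black_elo, first_move, result in games:
        gap = black_elo - white_elo
        if gap <= 0:
            continue  # white is not the underdog, so this cannot be an upset
        bucket = (gap // bin_width) * bin_width  # e.g. a gap of 230 -> 200
        key = (first_move, bucket)
        totals[key] += 1
        if result == "white":
            wins[key] += 1
    return {key: wins[key] / totals[key] for key in totals}

# Made-up games: four with white as the underdog, one with white as the favorite.
games = [
    (1000, 1200, "c4", "white"),
    (1000, 1210, "c4", "black"),
    (1000, 1220, "c4", "black"),
    (1000, 1230, "c4", "black"),
    (1300, 1200, "c4", "white"),
]
upsets = upset_probability(games)
```

The last game is ignored because white was the favorite there, so it says nothing about upsets.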
Results 2 - Number of systems
Now that we have seen which systems are popular for white, I thought it might be fun to look at the number of systems each player uses.
For each player with at least 50 games in my sample, I say that the player plays a "system" if he falls into one of the first-move categories described above in at least 20% of his games with white.
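The definition above translates almost word for word into code. Here is a minimal sketch, assuming a player's white games are given as a list of first-move labels; the function name is illustrative, and whether the 50-game minimum applies to all games or only to white games is an assumption here (this sketch counts white games).

```python
from collections import Counter

def count_systems(white_first_moves, min_games=50, threshold=0.20):
    """Number of first moves played in at least `threshold` of a player's white games.

    Returns None when the player has fewer than `min_games` games.
    """
    n = len(white_first_moves)
    if n < min_games:
        return None  # not enough games to say anything about this player
    counts = Counter(white_first_moves)
    return sum(1 for c in counts.values() if c / n >= threshold)

# Made-up player: 60% e4, 30% d4, 10% c4 over 50 white games.
moves = ["e4"] * 30 + ["d4"] * 15 + ["c4"] * 5
n_systems = count_systems(moves)
```

With a 20% threshold a player can have at most five systems, so the scale of one to three seen in the data is well inside what the definition allows.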
The figure below shows the percentage of players with one, two, or three systems (nobody seems to play 4 systems in my sample).
As you can see, almost 80% of the players stick to one system, while roughly 20% choose to play two, and only a brave (or foolish) few try three different first moves. But is it efficient? Do we see a statistical difference between those three groups? The answer is: kind of.
The figure below shows the distribution of elo for each group. For those unfamiliar with these graphs, they are called "boxplots" and are much less scary than they look. On the y-axis, you have the elo of the white player, while on the x-axis, you have the number of systems played as white. The horizontal bar in the middle of the box shows the mean elo of the group. The colored box spans the 25th to the 75th percentile, meaning that 50% of the players are within that box. The lower and upper vertical bars, called "mustaches" (more commonly "whiskers"), extend to the 1st and 99th percentiles, meaning 98% of the sample is within the mustaches. Finally, the little dots show the extreme points that are outside of this 98% bulk.
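The numbers behind one box can be reproduced without any plotting library. The sketch below computes the mean, the 25th/75th percentiles, and the 1st/99th percentiles with Python's `statistics` module. The function name and the sample data are made up, and the percentile interpolation used by `statistics.quantiles` may differ slightly from whatever plotting tool produced the figure.

```python
from statistics import mean, quantiles

def box_summary(elos):
    """Summary of one boxplot group: mean bar, box edges, and mustache ends."""
    # With n=100, quantiles() returns 99 cut points: pct[k - 1] is the k-th percentile.
    pct = quantiles(elos, n=100)
    return {
        "mean": mean(elos),
        "box": (pct[24], pct[74]),        # 25th and 75th percentiles
        "mustaches": (pct[0], pct[98]),   # 1st and 99th percentiles
    }

# Made-up group of 1000 evenly spread elos.
summary = box_summary(list(range(1, 1001)))
```

Anything below the lower mustache or above the upper one would be drawn as an individual dot.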
I think we see a pattern emerging. The more systems you play, the lower your elo. You can clearly see that the mean (middle bar) goes down with each additional system played. For those of you who are curious: I did test it statistically, and under reasonable distribution assumptions, the difference between groups is significant.
This confirms something many people (including me) have claimed about openings: if you are not a master, you should focus on a select few openings, especially with the white pieces. Get familiar with them and understand the concepts. Diversifying too early may slow your progress down.
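The post does not say which test was used. As one possible way to check a difference in means between two groups, here is a sketch of a permutation test, which is a different, assumption-light technique swapped in for illustration: shuffle the group labels many times and see how often a gap at least as large as the observed one appears by chance. The function name, the number of shuffles, and the data are all made up.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=0):
    """Two-sided permutation p-value for the difference in means of two groups."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign players to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)

# Made-up elos: "one system" players rated about 200 points above "two systems" players.
one_system = [1200 + i for i in range(30)]
two_systems = [1000 + i for i in range(30)]
p_value = permutation_p_value(one_system, two_systems)
```

A small p-value means a gap this large almost never shows up when the group labels are assigned at random.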
That being said, the distribution clearly indicates that some players reach very high elo by playing two or even three systems. So you shouldn't force yourself to play only one move if you really don't want to. But I would advise against it, and so would the means of my boxplots.
To be continued!
That's it for today! I did run additional analyses on the popularity and efficiency of black's responses to e4 and d4, but I'll keep them for another post if people ask for it. I wasn't sure this more nerdy and statistical post would find its audience, and, even if it did, I wanted to keep it short~ish.
I hope you enjoyed it, and until next time, happy learning!
Post-Scriptum
Even though I just said I would wait, I can't resist sharing this last result about the popular answers to 1. e4. It's simply the frequency of each answer per elo:
Quite a few results are fascinating to me here, but none as much as the upward trend of the Sicilian. The higher the elo, the more Sicilians you see. The same is true for the Caro-Kann but with a much lower maximum frequency. But as promised, I leave the rest of the 1. e4 analysis for another post!