Statistics and Chess Improvement

  • GM Shankland
  • Feb 21, 2012

Today I would like to share with you an exercise I did at the end of 2011 to try to prepare for the events I would play in the next year. I logged 70 FIDE rated games in 2011. This is a decent but not huge sample size, and I decided to do a thorough statistical analysis of my results to try to find spots where I was performing well and spots where I could be playing better. I’ll show some of the results and notable statistics here:

Results by Rating

My Score vs. Opposition rated under 2400 FIDE: 23 wins, 1 draw, 0 losses: 97.9%, Performance Rating 2960

My Score vs. Opposition rated 2400-2499 FIDE: 3 wins, 8 draws, 0 losses: 72.2%, Performance Rating 2540

My Score vs. Opposition rated 2500-2599 FIDE: 5 wins, 13 draws, 3 losses: 54.5%, Performance Rating 2595

My Score vs. Opposition Rated 2600-2699 FIDE: 1 win, 7 draws, 4 losses: 37.5%, Performance Rating 2546

My Score vs. Opposition Rated 2700+ FIDE: 1 win, 2 draws, 1 loss: 50%, Performance Rating 2728
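
For readers who want to reproduce this kind of table, here is a minimal Python sketch of how a score percentage follows from win/draw/loss counts (a win counts 1 point, a draw half a point). The function name `score_pct` and the `bands` dictionary are illustrative, not anything produced by ChessBase; the counts are copied from the lines above.

```python
def score_pct(wins, draws, losses):
    """Percentage score: a win is 1 point, a draw 0.5, a loss 0."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games

# (wins, draws, losses) per opponent rating band, as printed above.
bands = {
    "under 2400": (23, 1, 0),
    "2400-2499": (3, 8, 0),
    "2500-2599": (5, 13, 3),
    "2600-2699": (1, 7, 4),
    "2700+": (1, 2, 1),
}

for name, wdl in bands.items():
    print(f"{name:>10}: {sum(wdl):2d} games, {score_pct(*wdl):.1f}%")
```

Recomputed this way, the printed counts give 97.9%, 63.6%, 54.8%, 37.5% and 50.0%, and they add up to 72 games. The 2400-2499 and 2500-2599 rows therefore do not match the printed percentages exactly, so treat the counts in those rows as approximate.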


From this data I could tell that the players I was having the most trouble with were those in the 2400-2700 range, and when I look back at the year, it makes sense. I was not beating 2400-2600 players as often as I should have, and I lost a couple of tough games to 2600s while only striking back once. I was holding my own against the really big boys, especially considering that one of my draws against 2700+ opposition could be counted as a win, because I only agreed to the draw to win my match in an elimination-style format. However, this was only a sample of 4 games. I was quite happy that I managed such an effective score against players rated below 2400.

Another thing I noticed was that while my score against 2400-2600 players left a lot to be desired, I was losing very rarely, and the 3 losses that did occur were against players rated 2590, 2590, and 2592, who might have been in the next category had I played them a month earlier or later. I decided to move those losses into the 2600-2700 category, and then I found that my performance rating against 2500-2599 was 2656, while against 2600-2700 it was only 2512. This ultimately led me to conclude that the players I needed to score better against were the 2400-2500 players and the 2600-2700 players. I decided to fix this by trying to play a bit less theoretically against the lower-rated players to get them on their own earlier, and to be more aggressive against the stronger players, because my solid play got me a bunch of draws but I got nicked for a couple of losses and did not manage to counter them with an equal number of wins. It appears this analysis and my new approach have paid off so far: here are my results against these rating ranges from my first tournament of 2012.

My Score vs. Opposition rated 2400-2499 FIDE: 3 wins, 2 draws, 0 losses: 80%, Performance Rating 2698

My Score vs. Opposition rated 2600-2699 FIDE: 1 win, 1 draw, 0 losses: 75%, Performance Rating 2830
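
A performance rating combines the average rating of the opposition with the score achieved against it. The sketch below inverts the standard logistic Elo expectation formula. That choice is an assumption: it is not necessarily what ChessBase computes, and FIDE itself uses a published lookup table rather than a closed formula. The average opponent ratings in the usage lines are hypothetical, since the article does not give them.

```python
import math

def performance_rating(avg_opponent_rating, score_fraction):
    """Approximate performance rating from a score between 0 and 1.

    Assumption: logistic Elo model. Scores are clamped to 1%..99%
    because 0% and 100% would give an infinite rating difference.
    """
    p = min(max(score_fraction, 0.01), 0.99)
    return avg_opponent_rating + 400.0 * math.log10(p / (1.0 - p))

# Hypothetical average opponent ratings, for illustration only.
print(round(performance_rating(2450, 0.80)))   # 80% against a 2450 average
print(round(performance_rating(2300, 0.979)))  # near-perfect score vs. a 2300 average
```

Because the rating difference grows quickly as the score approaches 100%, a near-perfect score against sub-2400 opposition can produce a performance rating far above the rating of any individual opponent.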

Results by Opening

Another key part of the statistical analysis is to look at openings: this will give you a sense of what you need to study more, both in terms of opening theory and the ensuing middlegames. This section was very detailed in my work, but I’ll present a more general version here:

Score with White in d4 d5 systems: 68%, 2557 Performance Rating

Score with White in Nimzo/QID systems: 70%, 2648 Performance Rating

Score with White in KID/Grunfeld: 67%, 2591 Performance Rating

Score with White in other Systems: 88%, 2755 Performance Rating


Score with Black vs. 1. e4: 58%, 2549 Performance Rating

Score with Black vs. 1. d4: 65%, 2639 Performance Rating

Score with Black vs. Other first moves: 56%, 2542 Performance Rating


With this information I deduced that I mostly need to work on my black repertoire against non-d4 moves and my white repertoire in the Slav and QGD. There is a lot more to the statistical analysis I did, including the use of serious but unrated games (US Chess League, tiebreak games, rapid games, training games), filtering by how many moves the games lasted (which measures fatigue and level of endgame play), breaking the year in half (in January-May I performed at a 2580 level, while from June through December I was over 2630, suggesting I probably improved and the more recent games are more relevant), and much more.
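
The breakdowns described above all follow the same pattern: pick a key for each game (opening, rating band, half of the year) and total the points per key. Here is a minimal Python sketch of that grouping. The game records, the `rating_band` helper and the `breakdown` function are hypothetical illustrations, not data from my games and not a ChessBase feature.

```python
from collections import defaultdict

# Hypothetical records: (opening label, opponent rating, my result).
games = [
    ("Slav", 2510, 0.5),
    ("Slav", 2455, 1.0),
    ("KID", 2620, 0.0),
    ("KID", 2380, 1.0),
    ("Nimzo", 2570, 0.5),
]

def rating_band(rating):
    """Map a rating to the bands used in this article."""
    if rating < 2400:
        return "under 2400"
    if rating >= 2700:
        return "2700+"
    low = rating // 100 * 100
    return f"{low}-{low + 99}"

def breakdown(games, key):
    """Return {group: (number of games, score percentage)}."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [points, games]
    for game in games:
        bucket = totals[key(game)]
        bucket[0] += game[2]
        bucket[1] += 1
    return {k: (n, 100.0 * pts / n) for k, (pts, n) in totals.items()}

by_opening = breakdown(games, key=lambda g: g[0])
by_band = breakdown(games, key=lambda g: rating_band(g[1]))
print(by_opening)
print(by_band)
```

Passing the key as a function means the same `breakdown` can be reused for any other split, such as game length or date, without changing the counting code.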

I would suggest to any reader, even those few who are not professional players, that statistical analysis can be an excellent way to examine your own play, and I would suggest breaking the analysis down by rating range and opening. Then, once you have determined a weakness, look at all of your games against this rating range or opening, including the wins, to determine what you might be doing wrong and how to improve it. And always keep in mind that a small sample of games will not have the same accuracy as a large sample. Lastly, I should point out that this analysis would have been extremely difficult to do without the help of ChessBase, and I highly recommend that everyone buy this software for statistics as well as opening preparation and engine analysis. I know that I was much happier with my training regimen after doing a thorough statistical analysis of my own results, and so far it has paid off in my only 2012 event. I look forward to seeing if it can continue to pay dividends, and I hope you find it useful as well.
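
To put a rough number on the small-sample caveat, here is a minimal Python sketch that attaches an approximate 95% interval to a score. It assumes the games are independent and uses a normal approximation, which is crude for very small samples; the function name `score_interval` is illustrative.

```python
import math
import statistics

def score_interval(wins, draws, losses, z=1.96):
    """Mean score per game with an approximate 95% interval.

    Assumption: independent games and a normal approximation.
    Needs at least two games.
    """
    results = [1.0] * wins + [0.5] * draws + [0.0] * losses
    mean = statistics.mean(results)
    std_error = statistics.stdev(results) / math.sqrt(len(results))
    low = max(0.0, mean - z * std_error)
    high = min(1.0, mean + z * std_error)
    return mean, low, high

# The 4-game sample against 2700+ and the 21-game sample against 2500-2599.
for wdl in [(1, 2, 1), (5, 13, 3)]:
    mean, low, high = score_interval(*wdl)
    print(f"{sum(wdl):2d} games: {mean:.0%} (roughly {low:.0%} to {high:.0%})")
```

The 4-game sample spans most of the range from 10% to 90%, while the 21-game sample is far narrower, which is why conclusions drawn from the smaller bands deserve extra caution.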


  • 5 years ago


    Fascinating stuff, thanks for the insight.

  • 5 years ago


    The thing is, 70 data points makes for a paltry sample size, and it's very easy to be misguided by apparently meaningful correlations.  If you broke down your results by windy days vs. calm days, or days when you wore blue vs. days when you wore white, you might find a similar amount of variance in your results.  You could be tempted to read meaning into such results.  If your statistics can't lead to a plausible and testable hypothesis, which can be shown to accurately and consistently predict future results through further experiment, then the analysis can add up to worse than nothing.  It can fool you.  Can you isolate all the variables?  There is a sea of variables to consider.  Meanwhile, opponents cannot be relied upon for any great degree of consistency.  The issue is not so much that a 2500 player can be better on some days and worse on others.  If you had a large sample against that particular player, that kind of variation would make little difference.  But, more significantly, one 2500 player can be very different from another 2500 player, in ways that are very difficult to quantify, in ways that are not reflected in the raw number of their rating.

    As Mike Caro has pointed out in the case of poker, you may find relationships such as three players, A, B, and C, who are theoretically similar in ability, compared to the mass of all players, but amongst themselves they can have a rock-paper-scissors kind of relationship.  A beats B, B beats C, C beats A... routinely.  It would be nice to see how your particular skills match up in this way to particular types of players, and then try to determine the "why" of it all, but none of that will show in a small sample analyzed on one or two dimensions.

  • 5 years ago


    Think I need to work on getting the 70 games before beginning any statistical analysis.

  • 5 years ago


    Weaker players like us might be happy if we find the best move.

    But actually, most of the time, it's only one of the best moves. One move might be more aggressive, one move more positional, another one might be more forcing...

    It's understandable that a stronger player wants to make some fine tuning to choose a specific approach against different types of players.

    As far as I am concerned, statistics are no fun and I don't feel like I need them in order to improve because I am bad in all areas of the game. So I just look at what I like. Which could be described as reinforcing my "strong" points.

    [Edit, in reaction to Gizehks-Practitioner: I am no city champion but be careful, whoever you are, I could checkmate you]

  • 5 years ago


    Is this a brag post? 

    I strongly agree with Frankdawg. 

    The only useful bit of information in this post is your 'results by opening', but even that is very modestly useful at best.  What happened to playing the best move ??

  • 5 years ago


    Well done, Mr. Shankland. The insight that you provide into how a high-ranked player uses his time to prepare to play a game is much appreciated! Thank you very much.

  • 5 years ago

    GM SultanOfKings


    I started looking over some things, but lost interest fairly quickly. I can post my results when I get home - but I didn't actually manage to do it as well as Sam. I gathered my score against the different rating groups, but didn't actually check rating performance.

    What I do remember is:
    - I had a very poor year
    - I score (relatively) poorly against 2400-2499
    - I played many more 2600+ than Sam

    To Sam:
    Send me a message on how you structured the data to get both rp and individual scores!

  • 5 years ago


    This is good stuff.  You might want to consider the variance of these numbers more carefully though.  You have taken an important step to recognize that bigger sample size is better, but you want to try to be specific about how accurate these numbers are for your sample size before making big adjustments to your approach to the game.  I could offer suggestions if you're interested.

  • 5 years ago


    +2600 players are very tough. And +2500 can also be. Just ask Nepomniachtchi or Vallejo about their Aeroflot experience. I guess your first 2012 tournament wasn't Aeroflot. In the end, it's all just a simple matter of choosing the right event.

    And congratulations for your recent success at ITT Northern California.

  • 5 years ago


    actually this whole process is helpful for all levels. If you can determine whether you're losing games in a particular opening or stage of the game, you can know where to place your energy during study. Are you missing 'reasonable' tactical shots? (Missing an 8-move combination or tactic can be level/skill dependent too.)

    Are you losing drawn endgames or drawing winning endgames against opposition at your level? That can pinpoint your weakness. 'Just get better at everything' is a joke. Ignore the facts, your choice.

  • 5 years ago

    NM BMcC333

    How can a performance rating go over 2800 for playing people under 2400?

    Is there some new system?

  • 5 years ago


    while it is true that statistics can be misleading, i think at the very basic level statistics can't be disputed. for example, if you play 100 games, 50 with d4 and 50 with e4, and you lose 90% of your d4 games but win 90% of your e4 games, it cannot be disputed that you don't do well with d4!

  • 5 years ago


    i think for the lower rated players the analysis need not be as detailed.  perhaps ignore the ratings in the analysis but focus on understanding which opening variations result in more losses and focus on understanding our weaknesses in those specific openings. 

  • 5 years ago


    "80% of all statistics are made up."

    To this I add:

    -There are three kinds of lies... lies, damned lies, and statistics. (not sure of the origins of this, but I hear it every now and then)

  • 5 years ago


    For those of you who see these types of analyses as useless, worthless, pointless, etc., the common theme in your posts seems to be general ignorance (with the exception of the point regarding fluctuations in ratings, which is a rather interesting one).  It's extremely unfortunate.  For those of you who do not see such analyses as feasible, they are actually rather simple, and would take no more than 5-10 minutes to get similar details regarding your performance if you regularly update the results of your games in some sort of spreadsheet.  Also, what is included in this article is only a piece of what you could do with the data that the GM seems to have put together.

    And please, if you are going to post negative comments about the article, at least learn a bit about statistics.  Or at least explain your logic with more detail... making sure that each of your premises is true.  Any child can say that something is worthless, but you would be a fool to take him or her at his or her word.  As we grow, so does our ability to think in more complex ways and to articulate our thoughts more clearly.  Give it a try!

  • 5 years ago


    80% of all statistics are made up.

  • 5 years ago


    Maybe it's just me, but I don't take chess seriously enough to do this in depth analysis.  Come on people, it's just a game!  (This will probably get me booted out of

  • 5 years ago

    WGM Natalia_Pogonina

    I never perform such breakdowns myself, but my manager (based on data analysis for the last few years) has been telling me that I tend to do much better against same or higher-rated opposition than against weaker players.

  • 5 years ago


    I found this to be very helpful to a certain degree, but I think I will stick with my own analysis for right now. Good article Sam
