FIDE vs Chess.com ratings explained

GagarinGambit

Seeing that there's much talk over how Chess.com ratings correspond to FIDE ratings, I got curious and, as I happen to have a few skills in statistics, I decided to look into it.

Thus I spent a few hours getting a random sample of 121 cases, the vast majority found by googling
site:http://www.chess.com/members/ "fide rating"
which brings up numerous profiles of players who have posted their FIDE rating. Most of them tend to be titled players, so I was careful not to include too many 2000+ rated players, as the results would then be biased by the lack of average-rated players. Then I recorded the FIDE ratings alongside the chess.com bullet, blitz, standard, online and tactics ratings; I did not record a rating when there were not enough games behind it (at least 20), because it would be unreliable.

Obviously, not all players were rated in all five chess.com indicators; among the 121 profiles explored, 86 were bullet rated, 113 were blitz rated, 79 standard rated, 95 online rated, and 88 tactics rated; in total, only 42 of them were rated in all five aspects.

Granted, 121 players is not a sample big enough for a proper investigation, but it's enough to get the basic idea, and I didn't intend to spend days getting my sample. But that's enough preliminaries; let's get to the results, starting with the means.

 

ELO Means
                        Mean    Std. Deviation
FIDE rating             1769    287
Chess.com: Bullet       1628    313
Chess.com: Blitz        1685    273
Chess.com: Standard     1615    195
Chess.com: Online       1804    301
Chess.com: Tactics      2036    404

Although these are not directly comparable, we can make a few observations.

  1. First, all three chess.com live rating means are lower than the FIDE ratings; thus, on average, chess.com live players are underrated by 100-150 points.
  2. On the other hand, online ratings are higher than FIDE ratings; so, on average, chess.com online players are overrated by 50 points.
  3. When it comes to the tactics training ratings, overrating is extreme; on average, players have a tactics rating about 250 higher than their FIDE rating. In addition, note that the standard deviation of the tactics ratings is very high compared to the others, which means that tactics ratings are all over the place, and that's a hint that the tactics ratings are not reliable indicators of a player's strength.
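
The gaps in the list above can be read straight off the table of means. A minimal Python sketch, assuming nothing beyond the table itself (the dictionary just restates it, and the function name is made up for illustration). Keep in mind that each mean was computed over a different subset of players, so these differences are only indicative:

```python
# Means copied from the "ELO Means" table above.
FIDE_MEAN = 1769
CHESSCOM_MEANS = {
    "bullet": 1628,
    "blitz": 1685,
    "standard": 1615,
    "online": 1804,
    "tactics": 2036,
}

def mean_gaps(fide_mean, site_means):
    """Return chess.com mean minus FIDE mean for each rating type."""
    return {name: mean - fide_mean for name, mean in site_means.items()}

gaps = mean_gaps(FIDE_MEAN, CHESSCOM_MEANS)
for name, gap in gaps.items():
    # A negative gap means the chess.com mean sits below the FIDE mean.
    print(f"{name:<9} {gap:+d}")
```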


But it's more complex than this. For starters, just by looking at the data, it's obvious that different players perform differently across the various ratings. It's quite common for players to perform better under long time controls (online>standard>blitz>bullet), while other players are at their best at fast time controls (bullet>blitz>standard).

 

So let's have a look at the correlations.

 

ELO Correlations
               Bullet   Blitz   Standard   Online   Tactics
FIDE rating    0.563    0.738   0.573      0.675    0.640
Bullet           -      0.786   0.167      0.205    0.671
Blitz                     -     0.647      0.531    0.737
Standard                          -        0.541    0.521
Online                                       -      0.534

(Pearson correlations, cases excluded pairwise; almost all correlations are significant at the 0.001 level)

 

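If you're not familiar with correlation, it's sufficient to know that it's a number ranging from -1 to 1 showing the degree to which two variables are linearly related; 0 means no linear relation at all, and 1 means that one variable rises in perfect step with the other (negative values, which don't occur here, mean that one falls as the other rises).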
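
For anyone who wants to see what is actually being computed, here is a minimal pure-Python sketch of a Pearson correlation with cases excluded pairwise (a pair is skipped when either rating is missing). The function name and the sample numbers are made up for illustration; they are not the data from this study, which was analysed in SPSS:

```python
from math import sqrt

def pearson_pairwise(xs, ys):
    """Pearson r over the pairs where both values are present (not None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

# Made-up ratings for illustration only (NOT the thread's data).
fide = [1500, 1650, 1800, 1950, 2100]
blitz = [1420, None, 1790, 1880, 2150]  # None = too few games, so excluded
print(round(pearson_pairwise(fide, blitz), 3))
```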

 

FIDE rating is, as expected, highly correlated with all chess.com ratings; obviously, your chess.com rating is not independent of your OTB rating. The highest correlation is found between the FIDE and chess.com blitz ratings, meaning that the blitz rating is the one that tends to follow the FIDE rating most closely (although, as we shall see, it tends to be lower). In particular, the 0.738 correlation means that 54% of the chess.com blitz variance can be explained by the FIDE rating (and the remaining 46% needs to be explained by other factors, such as your performance at different time controls, the different nature of OTB and internet chess, or your level of activity).
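
The 54% figure is simply the correlation squared (0.738 * 0.738). A small sketch that does the same for every FIDE correlation in the table; the function name is only illustrative:

```python
# FIDE correlations copied from the "ELO Correlations" table.
FIDE_CORRELATIONS = {
    "bullet": 0.563,
    "blitz": 0.738,
    "standard": 0.573,
    "online": 0.675,
    "tactics": 0.640,
}

def variance_explained(r):
    """Share of variance explained in a simple linear relation: r squared."""
    return r * r

for name, r in FIDE_CORRELATIONS.items():
    print(f"{name:<9} r={r:.3f}  explained={variance_explained(r):.0%}")
```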

In my opinion, the most important factor in explaining the deviations between FIDE and chess.com ratings is the time control. This is made obvious by the chess.com bullet ratings, which are only weakly correlated with the longer time-control standard and online ratings (0.167 and 0.205), while all the other ratings are highly correlated with each other.

 

In plain words: some people need time to think, and thus they perform better when they have enough time (FIDE, standard, online); others are quite fast and don't gain much by having extra time and are at their best when playing bullet and blitz games.

 

Finally, I'll present the linear regressions, treating FIDE as the dependent variable (estimating your FIDE rating from your chess.com ratings). Although it's not quite "proper", I'll present them as equations, where you can enter a chess.com rating and get an estimated FIDE rating.

 

Regression

996 + 0.474 * Bullet = FIDE rating (+-219)
483 + 0.769 * Blitz = FIDE rating (+-193)
594 + 0.702 * Standard = FIDE rating (+-197)
737 + 0.571 * Online = FIDE rating (+-189)
902 + 0.438 * Tactics = FIDE rating (+-214)
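
As a rough cross-check, the line of a one-predictor least-squares regression can be recovered from summary statistics alone: slope = r * SD_y / SD_x, intercept = mean_y - slope * mean_x, and the spread around the line is about SD_y * sqrt(1 - r^2). The sketch below plugs in the blitz and FIDE figures from the two tables. It is only an approximation of the equation above, since the means were computed over everyone rated in each category while the regression can use only players who have both ratings; the function name is made up for illustration:

```python
from math import sqrt

def line_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Least-squares line y = intercept + slope * x from summary statistics."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    residual_sd = sd_y * sqrt(1 - r * r)  # ignores the small-sample correction
    return intercept, slope, residual_sd

# Blitz (x) and FIDE (y) figures copied from the tables above.
intercept, slope, residual_sd = line_from_summary(
    mean_x=1685, sd_x=273, mean_y=1769, sd_y=287, r=0.738
)
print(f"{intercept:.0f} + {slope:.3f} * Blitz = FIDE rating (+-{residual_sd:.0f})")
```

The result lands close to the blitz equation, with a slope near 0.78 and a spread near 194.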

 

A couple of warnings. First, don't bother inputting your ratings if your chess.com ratings are lower than 1200, since almost all players in the sample were rated higher (that's because low-rated chess.com players tend not to have FIDE ratings, of course). Second, the number in parentheses is the standard deviation of the estimate's error; thus, your actual FIDE rating will usually, but not always, fall within plus/minus that number of the estimate (roughly two times out of three, if the errors are about normally distributed). These deviations are quite high (especially for bullet and tactics; as I mentioned, fast time controls are the least reliable), but if you do this for all your ratings you can get an idea. More data would give more accurate results, but I simply don't want to spend days on this.

An example.

At the moment, my standard chess.com rating is 1452. According to the formula given by the regression, this translates to 594 + 0.702 * 1452 = 594 + 1019 = 1613 FIDE, plus/minus 197 (thus, at the range of 1416 to 1810). My tactics trainer rating is 1461, and thus 902 + 0.438 * 1461 = 902 + 640 = 1542 FIDE plus/minus 214 (1328-1756). Indeed, my real FIDE rating is 1504, which is quite close to these estimates.
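
The same arithmetic as a small Python sketch, for anyone who wants to plug in their own numbers. The coefficients are copied from the regression equations above; the function name `estimate_fide` and the dictionary layout are just illustrative choices:

```python
# (intercept, slope, plus/minus) copied from the regression equations above.
REGRESSIONS = {
    "bullet": (996, 0.474, 219),
    "blitz": (483, 0.769, 193),
    "standard": (594, 0.702, 197),
    "online": (737, 0.571, 189),
    "tactics": (902, 0.438, 214),
}

def estimate_fide(kind, rating):
    """Return (estimate, low, high) FIDE rating for a chess.com rating."""
    intercept, slope, spread = REGRESSIONS[kind]
    estimate = intercept + slope * rating
    return round(estimate), round(estimate - spread), round(estimate + spread)

print(estimate_fide("standard", 1452))  # (1613, 1416, 1810)
print(estimate_fide("tactics", 1461))   # (1542, 1328, 1756)
```

As noted in the warnings, inputs below about 1200 fall outside the range of the sample, so the output there should not be trusted.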

The regression outcome is not only useful for estimating FIDE ratings; it also shows how the ratings are related at different strength levels, because chess.com elos are overrated at some levels and underrated at others.

So, let's go over the regression estimates, supposing 1500, 1800, 2100, and 2400 ratings (I don't include 1200 rating estimates because they are not quite accurate due to the very small number of the sample players at this range).

Bullet:
1500 chess.com -> 1707 FIDE
1800 chess.com -> 1849 FIDE
2100 chess.com -> 1991 FIDE
2400 chess.com -> 2133 FIDE


Blitz:
1500 chess.com -> 1636 FIDE
1800 chess.com -> 1867 FIDE
2100 chess.com -> 2097 FIDE
2400 chess.com -> 2328 FIDE

 

Standard:
1500 chess.com -> 1647 FIDE
1800 chess.com -> 1858 FIDE
2100 chess.com -> 2068 FIDE
2400 chess.com -> 2279 FIDE

 

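Thus, all live ELOs tend to be underrated up to roughly the 1900-2100 point (about 1900 for bullet, 2000 for standard and 2100 for blitz, where each equation gives a FIDE rating equal to the chess.com rating); the deviations are particularly large when it comes to bullet ratings.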

 

Online:
1500 chess.com -> 1593 FIDE
1800 chess.com -> 1764 FIDE
2100 chess.com -> 1936 FIDE
2400 chess.com -> 2107 FIDE

 

On the other hand, online ELOs starting from 1700 or so are overrated.

 

Tactics trainer:
1500 chess.com -> 1559 FIDE
1800 chess.com -> 1690 FIDE
2100 chess.com -> 1822 FIDE
2400 chess.com -> 1953 FIDE

 

Tactics trainer ratings are overrated. Highly overrated. Period.
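
Since every slope is below 1, each regression line has exactly one rating at which the estimated FIDE rating equals the chess.com rating: below it the chess.com rating is lower than the estimated FIDE rating, above it higher. Setting a + b * x = x gives x = a / (1 - b). A small sketch using the rounded coefficients from the equations above (the function name is illustrative, and the results are only as precise as those rounded coefficients):

```python
# (intercept, slope) copied from the regression equations earlier in the post.
REGRESSION_LINES = {
    "bullet": (996, 0.474),
    "blitz": (483, 0.769),
    "standard": (594, 0.702),
    "online": (737, 0.571),
    "tactics": (902, 0.438),
}

def break_even(intercept, slope):
    """Rating x where intercept + slope * x == x (requires slope != 1)."""
    return intercept / (1 - slope)

for name, (a, b) in REGRESSION_LINES.items():
    print(f"{name:<9} break-even at about {break_even(a, b):.0f}")
```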

In summary, club-level players will tend to have lower live ratings but higher online ratings than their official FIDE ratings; your tactics rating will almost always be overrated, often by several hundred points. But in practice, this will depend on the player: his/her playing strength at different time controls, tactical skills, the way he/she approaches internet chess vs OTB chess, the amount of effort put into online chess, etc.

Moreover, these are all separate rating systems. What I presented are only rough estimates which give us an idea of the particularities of chess.com ratings.

And now I'd better get back to playing chess :)

GagarinGambit

BTW, if anyone who knows statistics is interested in looking into this further, the related SPSS file is available at https://dl.dropboxusercontent.com/u/31982038/ChesscomRatings.sav

Bob_stew

Brilliant! Thanks for the effort, very interesting stuff. (Not least, the statistical analysis itself!)

Tapani

FIDE ratings (of lower rated players) are often on the lower side compared to performance. Young (and other) players tend to improve faster than their rating, due to shortage of FIDE rated events.

movavg1

Thanks for your analysis

Arun_1986

Excellent work! Something I have been looking for for quite some time. If it helps, my data conforms to the findings as well :).

FIDE-1701; Online Chess.com - 1786; Blitz - 1685.

Darshan_Haragi_L

nice forum...

My FIDE rating is 1404. But my

Bullet rating-1925

Blitz-1692

standard-1701

online chess-2061

tactics-2700s and 2800s

cool isn't it?

rvchess777

Very good article.

But my rating of 1560 does not correspond to my rating on chess.com

1587-bullet

1717-blitz

1761-Standard

1904-Online

1873-Tactics

If I use your calculations, I get a rating of about 1800 for all. (Note that I've been here for only a month and a half or so.)

This doesn't add up.

Does the country in which you play affect your rating? I ask because I've seen that in India, many people are underrated in their FIDE ratings compared to their chess.com ratings.

You could do a country-wise comparison if you want. Just some food for thought.

tliu1222

Is there a difference in FIDE rating and USCF? I'm only in USCF.

Petrosianic

fide rated tournaments are only the top section in major events so in the US is less indicative of strength unless established at NM or higher [2200+]

Sunshiny

"Granted, 121 players is not a sample big enough for a proper investigation, but it's enough to get the basic idea"

No. If the sample size is small, it is not large enough to form a conclusion. 

tliu1222

http://www.chessmaniac.com/ELORating/ELO_Chess_Rating.shtml

Only a rating estimator....

AKJett

FIDE 1578 chess.com blitz 1577 online 1620

PLAVIN81

Thanx for the info

JMB2010

Very interesting stuff.

ItsEoin

Looks to me like the best thought out / most accurate statistical analysis we've ever had on the topic. Nice work.

GagarinGambit

Thank you all for your replies :)

rvchess777 I suspect that some players, especially in particular countries, are underrated when it comes to FIDE elos, due to the few opportunities of playing FIDE rated events, as also Tapani mentioned (thus, one can improve fast, but the rating will take much time to catch up).

 

_36darshan--, you seem to be a typical case of a player who thinks and plays really fast, as your bullet and tactics ratings are very high; also, online chess elos are overrated according to my data, and thus your online rating is actually comparable to your blitz and standard ratings. I bet that you usually don't think on a move for more than 30-60 seconds even when playing OTB games. But in any case, you're clearly underrated - maybe it's the reason above? You're both from India after all.. but I also see you're both students, and it's very typical for young chess players to be underrated.

 

tliu1222 I didn't mix up FIDE and USCF elos on purpose, because they're different rating systems. But, as far as I know, USCF ratings tend to be slightly higher than FIDE ratings.

 

Sunshiny Actually, my findings are statistically significant at the 0.01 or 0.001 level, and thus I'm confident at least about the key findings (e.g. live underrated, online overrated), although because of the large deviations I'm not completely sure about the extent of the under/overrating.

StMichealD

What about chess mentor?

btickler

So, these are self-reported FIDE ratings you are matching up?  

I think you left off the most important conclusion, then, and the one that might damn your other conclusions:  players (on average) will inflate their own ratings when asked what they are.  

Nobody has much incentive to report a lower rating than they actually have, so when somebody does lie about it, it skews things heavily and always on the inflation side.  One person reporting 1900 instead of 1400 is like 10 people inflating their ratings by 50 each...those lies add up to very inaccurate numbers very quickly.

Ok, on to your 3 conclusions:

  1. First, all three chess.com live rating means are lower than the FIDE ratings; thus, on average, chess.com live players are underrated by 100-150 points.
  2. On the other hand, online ratings are higher than FIDE ratings; so, on average, chess.com online players are overrated by 50 points.
  3. When it comes to the tactics training ratings, overrating is extreme; on average, players have a tactics rating about 250 higher than their FIDE rating. In addition, note that the standard deviation of the tactics ratings is very high compared to the others, which means that tactics ratings are all over the place, and that's a hint that the tactics ratings are not reliable indicators of a player's strength.

1. Your "thus" does not follow and is not the only conclusion you can draw from this data

2. You are comparing FIDE ratings in shorter time controls to correspondence chess ratings, which is no more valid than comparing blitz ratings to online ratings

3. Tactics trainer ratings are complete garbage anyway and mean nothing.  You can't compare a rating given based on arbitrary tactics problem rankings and a clock average for each problem that is heavily skewed by cheating to ratings achieved via an elo-style system...these numbers are not even remotely comparable.  It's like taking a bunch of economic data on "dollars" and combining US and Canadian dollars into some erroneous result because you consider them the same.  Tactics trainer ratings are just a laboratory rat pellet/lever mechanic to draw players to try to excel in something that looks like it has a direct correlation to something real, but doesn't. 

This seems like a textbook example of how not to apply very specific formulas to non-specific data.  It's like those customer service analysis reports companies pay money for that come back and tell them they have 95.71% customer satisfaction.  You can't take fuzzy data and then run it through math formulas, crank out a result to 2 decimal places, and pretend it means anything. 

papagar

Thanks for the article. I was just wondering about this in the last few days. By the way, contrary to what btickler said, I also think that there is a correlation among all these ratings.