Ratings Comparison: Chess.com Blitz v USCF OTB

Dec 17, 2010, 5:50 PM

(What follows is personal opinion and does not necessarily represent the views of Chess.com management or anyone else with authority or insight.)

Hardly a day goes by in Live Chess Main Hall chat without at least one member asking some form of the following question: "How does my (insert rating number here) Chess.com Blitz rating compare to an over-the-board or official USCF rating?"

Good question. The unsatisfying but truthful answer is that since the player pools, ratings formulas, and playing conditions are not the same, no universal and direct comparison is fair, even if possible. Nevertheless, we all know that strong players in the over-the-board world usually translate to strong players here at Chess.com as well. So, is there anything at all that can be said quantitatively?

Some years back, a not-to-be-named competitor conducted a survey amongst its players to try to put a numerical spin on this oft-asked-but-seldom-satisfyingly-answered question. Results of that survey indicated that - on average - online Blitz ratings were roughly 100 points higher than USCF ratings.

So, the $64K question is this: Is that also true here at Chess.com?

In search of an answer, I recently gathered some data from the very wonderful Chess.com Live Chess player statistics database here - one of the more esoteric benefits of my newly acquired Diamond premium membership. The chart below illustrates the results of the analysis of these data and makes it possible to infer - albeit loosely - some interesting findings:



These data represent an instantaneous snapshot, taken over Dec 16-17, 2010, of the Chess.com Blitz ratings for a sampling of Chess.com members in the 1500-1699 rating cohort who also have a claimed USCF OTB (over-the-board) rating - roughly 120+ members. The mean Chess.com Blitz rating of these players was 1592, whilst their mean USCF rating was 1665. This difference of 73 rating points was statistically significant. The three players whose Chess.com Blitz rating and USCF rating differed by more than 800 points were not included in determining those mean values. On the face of it - though this sampling is quite small compared to membership counts here or the survey counts 'there' - we seem to be seeing a similar phenomenon: online Blitz ratings generally, if only slightly, understate OTB strength.
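For anyone who wants to try the same arithmetic on their own sample, here is a minimal Python sketch of the comparison described above: drop any player whose two ratings differ by more than 800 points, then compare the means. The ratings in `sample_pairs` are made up purely for illustration and are not the study's data, and the paired t-statistic is only my assumption about one reasonable way to check significance, not necessarily the test used here.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(pairs, max_gap=800):
    """Compare mean Blitz and USCF ratings for (blitz, uscf) pairs.

    Pairs whose two ratings differ by more than max_gap are excluded,
    mirroring the 800-point cutoff mentioned in the text.
    """
    kept = [(b, u) for b, u in pairs if abs(b - u) <= max_gap]
    diffs = [u - b for b, u in kept]  # positive: USCF is higher than Blitz
    n = len(diffs)
    mean_diff = mean(diffs)
    # Paired t-statistic (an assumed choice of test, see the note above).
    t_stat = mean_diff / (stdev(diffs) / sqrt(n))
    return {
        "n": n,
        "mean_blitz": mean(b for b, _ in kept),
        "mean_uscf": mean(u for _, u in kept),
        "mean_diff": mean_diff,
        "t_stat": t_stat,
    }


# Hypothetical (blitz, uscf) pairs, NOT real member data.
sample_pairs = [
    (1550, 1600),
    (1620, 1700),
    (1580, 1650),
    (1510, 1490),
    (1690, 1800),
    (1600, 2500),  # 900-point gap, so this pair is excluded
]

result = summarize(sample_pairs)
print(result)
```

With a real sample, the t-statistic would be compared against a t-distribution with n - 1 degrees of freedom to judge significance.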

Another possible inference from these data: Although you can expect on average that your next Chess.com Blitz challenge against a 1500-to-1699 rated player will be with someone roughly comparable to that rating in OTB, the range of strengths of possible opponents is extremely wide - ranging from below 1000 USCF-strength to well above 2000 USCF-strength. See the figure below.



One can easily see that many of the claimed USCF ratings are indeed within the study's range for Chess.com ratings (roughly 1500 to 1700), but even more of them are in the next higher rated bin from 1700 to 1900!
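The binning behind a figure like this can be sketched in a few lines of Python. The 200-point bin width and the 1500 starting edge are taken from the ranges mentioned above; the function name and the ratings in `claimed_uscf` are hypothetical and are not the study's data.

```python
from collections import Counter


def bin_start(rating, width=200, origin=1500):
    """Return the lower edge of the bin containing rating.

    Bins are aligned so that one of them starts exactly at origin,
    e.g. 1500-1699, 1700-1899, and so on in both directions.
    """
    return origin + ((rating - origin) // width) * width


# Hypothetical claimed USCF ratings, NOT real member data.
claimed_uscf = [950, 1480, 1520, 1650, 1710, 1750, 1890, 2050]

counts = Counter(bin_start(r) for r in claimed_uscf)

# Print a crude text histogram, lowest bin first.
for start in sorted(counts):
    print(f"{start}-{start + 199}: {'#' * counts[start]}")
```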

Some disclaimers should be mentioned straightaway. The sample here is relatively small and only covers a 200-point rating range (though this range brackets the usual adult mean USCF rating). It assumes that the claimed USCF rating is in fact both real and representative of current OTB strength. No minimum number of Chess.com Blitz games was required for inclusion.

In closing, it's doubtful that this little study will answer the $64K question for all time or for all members, but it's interesting nonetheless, I believe. It leads me to conclude, on at least a semi-quantitative basis, that average Blitz ratings here and USCF OTB ratings are reasonably well correlated and not hugely different - at least in the heartcut range of 1500-1700 - though members' Chess.com ratings may slightly understate USCF-asserted playing strength. The closer one gets to 1200, the more likely the provisional starting rating will bias findings, I suspect. The closer one gets to the top end of the ratings, the less data is available and the more suspect any conclusions will be, purely because of insufficient sampling.

Hey, it was fun to do the study, whether it means anything or not. 

