Why online chess ratings are inflated/deflated

watcha

If anyone is interested, the formulas for the Glicko system, invented by Professor Mark E. Glickman, are defined by the author here:

http://www.glicko.net/glicko/glicko.pdf

I was really interested in whether applying these formulas can inflate or deflate the average rating at the pool level (a question equivalent to asking whether the Glicko system is a zero-sum game).
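It may help to spell out why the answer is not obvious. In Elo with a common K-factor every game is exactly zero-sum, but Glicko weights each player's update by that player's own uncertainty, so the two rating changes from a single game need not cancel. A sketch in LaTeX (my transcription of the update from the linked paper):

```latex
% Elo with a common K-factor is zero-sum per game:
% since s_A + s_B = 1 and E_A + E_B = 1,
\Delta r_A + \Delta r_B = K(s_A - E_A) + K(s_B - E_B) = 0
% Glicko's post-period update (from the linked paper):
r' = r + \frac{q}{1/RD^2 + 1/d^2} \sum_j g(RD_j)\,\bigl(s_j - E(s_j \mid r, r_j, RD_j)\bigr)
% The prefactor q/(1/RD^2 + 1/d^2) depends on the player's own RD, so the
% two players' rating changes from the same game generally do not cancel.
```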

I tried to set up a simulation using the constants suggested in Glickman's paper (1500 for the initial rating, 350 for the initial RD, 30 for the minimum RD, etc.). I had the computer generate some players (each starting at a rating of 1500 but with a randomized 'real' strength between 1200 and 1800) and play through a certain number of rating cycles (rating cycles are an important part of Glickman's system: the more time that elapses since the last cycle a player was active in, the more uncertain his/her rating becomes, which is reflected in a higher RD). In every cycle a given number of random games was played. The results of the games were also random, but the winning probability was weighted according to the players' 'real' strengths (this is necessary to produce rating diversity in the pool). A rough sketch of this kind of setup follows.
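This is a minimal sketch assuming the Glicko-1 update from the linked paper; it is not the author's actual program, and the random-pairing scheme, per-cycle RD growth constant, and the uniform 1200-1800 'real strength' model are assumptions taken from the description above:

```python
import math
import random

Q = math.log(10) / 400      # Glicko's q constant
INITIAL_RATING = 1500.0
INITIAL_RD = 350.0
MIN_RD = 30.0
C = 63.2                    # per-cycle RD growth constant (an example value)

def g(rd):
    # Discounts results against opponents whose own ratings are uncertain.
    return 1.0 / math.sqrt(1.0 + 3.0 * Q * Q * rd * rd / math.pi ** 2)

def expected_score(r, r_j, rd_j):
    return 1.0 / (1.0 + 10.0 ** (-g(rd_j) * (r - r_j) / 400.0))

def glicko_update(r, rd, results):
    # results: list of (opponent_rating, opponent_rd, score) for one cycle,
    # all taken from the opponents' values at the start of the cycle.
    if not results:
        return r, rd
    d2_inv = Q * Q * sum(
        g(rd_j) ** 2 * expected_score(r, r_j, rd_j)
        * (1 - expected_score(r, r_j, rd_j))
        for r_j, rd_j, _ in results
    )
    denom = 1.0 / rd ** 2 + d2_inv
    delta = sum(g(rd_j) * (s - expected_score(r, r_j, rd_j))
                for r_j, rd_j, s in results)
    return r + (Q / denom) * delta, max(MIN_RD, math.sqrt(1.0 / denom))

def simulate(n_players=1000, games_per_cycle=2000, cycles=50):
    players = [
        {"r": INITIAL_RATING, "rd": INITIAL_RD, "true": random.uniform(1200, 1800)}
        for _ in range(n_players)
    ]
    for _ in range(cycles):
        # RD grows between cycles (uncertainty from inactivity), capped at 350.
        for p in players:
            p["rd"] = min(INITIAL_RD, math.sqrt(p["rd"] ** 2 + C ** 2))
        snapshot = [(p["r"], p["rd"]) for p in players]  # start-of-cycle values
        results = [[] for _ in players]
        for _ in range(games_per_cycle):
            i, j = random.sample(range(n_players), 2)
            # Win probability driven by the hidden 'true' strengths.
            p_win = 1.0 / (1.0 + 10.0 ** ((players[j]["true"] - players[i]["true"]) / 400.0))
            s = 1.0 if random.random() < p_win else 0.0
            results[i].append((*snapshot[j], s))
            results[j].append((*snapshot[i], 1.0 - s))
        for p, res, (r0, rd0) in zip(players, results, snapshot):
            p["r"], p["rd"] = glicko_update(r0, rd0, res)
    return sum(p["r"] for p in players) / n_players

print(simulate())  # pool average rating after all cycles
```

Draws are ignored for simplicity, and all updates within a cycle use start-of-cycle ratings and RDs, as the paper prescribes.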

I put the program together in a few hours, so it is very much beta as of now, and therefore I'm very cautious about my results.

My initial finding is that the average rating at the pool level can deviate from 1500, but only to a small extent. I could get really big deviations (pool averages around 1470 or 1530 instead of 1500) only if the number of players was very low (like 50). Even with a couple of thousand players and a couple of thousand games played in each cycle, the typical pool average stayed between 1495 and 1505. With 10000 players and 20000 games played in each cycle it stayed between 1499 and 1501. So with a reasonable number of players, the Glicko system seems in practice to be very close to a zero-sum game.

These results strengthen my belief that, with the vast number of players chess.com has and the huge number of games that have been played, deviations on the order of 100 points are very unlikely to be the result of the Glicko system itself.

DiogenesDue

Ummm... your simulation should be constantly adding new players at 1500 who play some games, lose rating points, and then remove themselves from the system. Otherwise you're not testing anything close to reality. It's an open system, not a closed one. New players come in, lose a lot, and leave. New players who don't lose a lot tend to stick around longer. Inflation is inevitable.
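A minimal sketch of this churn effect (not watcha's simulator): it deliberately uses a plain Elo update, which is exactly zero-sum per game, so any drift in the pool average comes purely from players entering at 1500, losing points, and quitting. Every constant in it is an assumption for illustration.

```python
import random

K = 32
ENTRY = 1500.0
QUIT_BELOW = 1400.0   # players this far underwater are assumed to leave

def elo_game(ra, rb, score_a):
    # Standard Elo update with a shared K: the winner's gain equals the
    # loser's loss, so total rating points are conserved per game.
    ea = 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))
    return ra + K * (score_a - ea), rb + K * ((1.0 - score_a) - (1.0 - ea))

def churn_sim(cycles=200, start_pool=1000, games_per_cycle=2000, arrivals=50):
    ratings = [ENTRY] * start_pool
    for _ in range(cycles):
        ratings.extend([ENTRY] * arrivals)             # newcomers join at 1500
        for _ in range(games_per_cycle):
            i, j = random.sample(range(len(ratings)), 2)
            s = 1.0 if random.random() < 0.5 else 0.0  # equal-strength players
            ratings[i], ratings[j] = elo_game(ratings[i], ratings[j], s)
        ratings = [r for r in ratings if r >= QUIT_BELOW]  # losers drop out
    return sum(ratings) / len(ratings)

print(churn_sim())
```

Because the removed ratings are all below average while total points are conserved in every game, the survivors' average has to climb above 1500.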

watcha
btickler wrote:

Ummm... your simulation should be constantly adding new players at 1500 who play some games, lose rating points, and then remove themselves from the system. Otherwise you're not testing anything close to reality. It's an open system, not a closed one. New players come in, lose a lot, and leave. New players who don't lose a lot tend to stick around longer. Inflation is inevitable.

I appreciate any constructive critique of my approach, including yours. My knowledge of the workings of rating systems is very limited, and I only read Professor Glickman's paper this morning. I was just curious what realistic deviation can be expected to arise from the Glicko formulas alone.

I don't know how to get a handle on the realistic chances of players entering and leaving the pool. Could you suggest a realistic formula which, as a function of the pool size, the number of games a player has played, and the player's rating, would give the probability that a given player leaves the pool, or that a new player enters it, at the end of a rating cycle?
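Purely for illustration, one hypothetical shape such a rule could take might be the following; the functional form and every constant are invented for the sketch, not taken from Glickman's paper or from chess.com data:

```python
import math

def p_leave(rating, games_played, entry_rating=1500.0,
            base=0.02, loss_weight=0.004, stickiness_games=20.0):
    """Hypothetical per-cycle probability that a player quits the pool:
    grows with how far they have fallen below their entry rating, and
    shrinks with how many games they have already invested."""
    points_lost = max(0.0, entry_rating - rating)
    novelty = math.exp(-games_played / stickiness_games)  # newcomers leave more easily
    return min(1.0, base + loss_weight * points_lost * novelty)
```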

Idrinkyourhealth

I think time is an important factor. When more time to think is allowed, the difference between two equal players can be larger, since one of them takes it seriously and spends more time per move than the other, who just moves after 5 minutes of thinking (like me, in the past).

And in blitz/bullet the time is so limited that both opponents spend equally little time thinking about their moves. In online chess you can always use the time factor as an advantage, since some people can spend more time on this site.

DiogenesDue

Unfortunately, I have no idea about the ratio of chess.com accounts that stay vs. leave. That's the kind of info you would need to do a good simulation. Please note that because of the ease of creating multiple accounts, it's even messier than OTB ratings. Anyone can create an account, lose several hundred rating points, quit, and restart at 1200 again, as often as they are willing to set up new email accounts. Accounts are created specifically for sandbagging, playing in certain brackets, etc.

Ziggyblitz

Ziryab: Rating pools may have the same average, but they have different populations of players. Another example might be Gameknot, where they have a separate rating system for team games, and those ratings tend to be much lower for any individual player, much the same as Chess960 here.

Ziryab
Ziggyblitz wrote:

Ziryab: Rating pools may have the same average, but they have different populations of players. Another example might be Gameknot, where they have a separate rating system for team games, and those ratings tend to be much lower for any individual player, much the same as Chess960 here.

There was not a separate rating pool at GK when I played there, but there was more cheating in team games.

L_coolmint

Hello everyone,

Looking at the two charts provided on page one of this discussion thread, I think it is important for us to consider the data from a different perspective.

So far, the focus has been on the average ratings: 1099 in live chess vs. 1349 in online chess.

However, both graphs seem to show medians that are within about 50 points of each other, and I further believe that the disparity in the averages is caused by the live data graph being wider (more spread out) than the online data graph.

Therefore, if the two graphs were scaled and compared in proportion to each other, we would see that the average ratings are in fact very similar.

Ultimately, taking these graphs at face value may lead to the belief that the ratings are statistically significantly different. But with further analysis, I believe both live and online chess ratings will be shown to be accurate.