How the new FIDE rating system is flawed

Oct 6, 2014, 8:24 AM

This blogpost is about some of the unintended consequences of the new FIDE rating rules, effective from July 2014.

The main change is that players are awarded a rating after games against only five rated opponents. This is too few, since it makes your initial rating too random. Combined with several other details of the rating system, a rating that starts out too low can be hard to correct.

As an experiment, I analyzed the recent 2014 Hong Kong International Open and looked at the impact of the new rating rules on some of the unrated players.

So what's the problem with using five games against rated opponents? 

Take this poor guy, for instance. He will be assigned a ~1160 FIDE rating (for an actual ~1530 performance!). That's almost 400 points off! Errors like this come from the way performance ratings are calculated, and from FIDE assigning these initial ratings after too few games.

Likewise, there must be players who will be assigned too high an initial rating (this guy seems to be getting a 1660 rating for a ~1400 actual performance).

So what's the problem with getting a too low initial rating? 

The problem with getting a rating that is far too low is that you are not allowed to enter some tournaments. Many FIDE rated events have a rating floor; for instance, the only FIDE rated tournament in Taiwan (Asian Dragons) has a 1500 (or is it 1600?) rating floor. Some national events also require a certain FIDE rating to participate (here in Taiwan, the "King of Chess" has a floor of 1400 FIDE). Some other countries use FIDE ratings to select national teams, etc.

But can't he just play a few tournaments and get his rating up?

In theory, yes. In practice, not really. There are not that many tournaments to begin with. Here in Asia we do not have six weeks of vacation (here in Taiwan we have seven days per year), so it is not easy to travel to chess events.

Also, even if our 1160 hero wants to get his rating up, the next problem is that the FIDE rating system caps rating differences at 400 points. This means that if our 1160 player plays a 2600 GM, he is punished as if he had lost to a player rated around 1560, even though his chances against the 2600 are practically zero. Most open FIDE events have a large proportion of 2000+ players. For example, in the Hong Kong tournament the average rating was 1961. So even if you attend an event, there is no guarantee you get any rated games where you have a realistic chance of winning (I don't consider beating a player rated 500+ points higher a "realistic chance"). Instead you spend your vacation on one event, where you lose 5 of 9 games against 2000+ players, dropping some 15-20 points, and play unrated players in the rest.

An example of this is this guy.

He is a 1500 rated player who lost 20 points while having a 1700 actual performance. Our 1160 initial rating guy could have had the exact same tournament run as this 1500 player and still lost almost the same 20 points, down to the 1140s. Adding 400 points to his rating or strength would have made no difference.
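To make the effect of the 400-point cap concrete, here is a minimal sketch of the arithmetic. It assumes the standard logistic Elo formula (FIDE's official conversion table gives slightly different numbers), K=40 for both players, and a made-up list of 2000+ opponents; the function names are my own and not part of any FIDE tool.

```python
def expected_score(own, opp, cap=400):
    """Expected score under the logistic Elo approximation (an assumption;
    FIDE uses a lookup table), with the rating difference capped at `cap`."""
    diff = max(-cap, min(cap, opp - own))
    return 1.0 / (1.0 + 10 ** (diff / 400))


def rating_change(own, opp, score, k=40, cap=400):
    """Rating change for one game: K times (actual score - expected score)."""
    return k * (score - expected_score(own, opp, cap))


# With the cap, losing to a 2600 GM costs a 1160 player exactly as much
# as losing to a 1560 player.
loss_vs_gm = rating_change(1160, 2600, score=0)
loss_vs_1560 = rating_change(1160, 1560, score=0)

# Hypothetical opposition: five losses against 2000+ rated players.
opponents = [2050, 2100, 2150, 2200, 2300]
drop_1160 = sum(rating_change(1160, opp, score=0) for opp in opponents)
drop_1500 = sum(rating_change(1500, opp, score=0) for opp in opponents)

print(f"loss vs 2600: {loss_vs_gm:.2f}, loss vs 1560: {loss_vs_1560:.2f}")
print(f"five losses: 1160 player {drop_1160:.1f}, 1500 player {drop_1500:.1f}")
```

Under these assumptions both players drop by about 18 points, because every one of those opponents is more than 400 points above either of them.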

As if this were not enough, FIDE makes it even harder to correct your rating. For adult players (our 1160 guy is 28), FIDE also drops the K factor (a coefficient determining how much your rating changes in each game) after only 30 games. I think 30 games with high variability are too few to compensate for a misrating of, say, 400 points. Not to mention how many years it would take to play enough games to do so, even with K=40.

EDIT: It appears that you get 30 games with K=40, rated or unrated, before your K value drops to 20 (or lower). So in theory you could use up all your high-variability games against unrated opponents.

These are just a few examples I found in one recent event, but I am sure this phenomenon is commonplace and has to be addressed. As it stands now: get a good initial rating, or potentially suffer for years.

But is there a better way? 

Yes. But that is a topic for another post :-)

Basically, one can calculate performance ratings for a tournament from first principles, without assuming that any one player's rating is correct. Instead, ratings are assigned based on actual match results; even games against unrated players can be taken into account.
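As a rough illustration of the idea, and not necessarily the method used for the list in this post, here is one possible sketch: start everyone at the same rating and repeatedly nudge each player by the gap between actual and expected score until the ratings explain the results. The logistic expected-score formula, the `anchor` average, the step size, and the toy games are all assumptions made for this example.

```python
def fit_ratings(games, anchor=1500.0, rounds=2000, step=10.0):
    """Fit ratings from results alone.

    games: list of (player_a, player_b, score_a), score_a in {0, 0.5, 1}.
    anchor: assumed average rating of the field (sets the scale's origin).
    Note: a player with a 0% or 100% score has no finite best-fit rating,
    so this simple version only suits players with mixed results.
    """
    players = sorted({p for a, b, _ in games for p in (a, b)})
    ratings = {p: anchor for p in players}
    for _ in range(rounds):
        # Actual minus expected score, summed over each player's games.
        surplus = {p: 0.0 for p in players}
        for a, b, score_a in games:
            expected_a = 1.0 / (1.0 + 10 ** ((ratings[b] - ratings[a]) / 400))
            surplus[a] += score_a - expected_a
            surplus[b] -= score_a - expected_a
        # Updates are zero-sum, so the field average stays at `anchor`.
        for p in players:
            ratings[p] += step * surplus[p]
    return ratings


# Toy data with hypothetical players; no prior ratings are used at all.
toy_games = [
    ("A", "B", 1), ("A", "B", 0.5),
    ("B", "C", 1), ("B", "C", 0.5),
    ("A", "C", 1), ("A", "C", 0.5),
]
fitted = fit_ratings(toy_games)
for name, rating in sorted(fitted.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:.0f}")
```

Because no player's prior rating enters the calculation, rated and unrated players are treated alike; the only outside input is the assumed average of the field.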

Below you can see what I believe is a more accurate rating performance list for the Hong Kong Open International 2014 (if anyone knows how to import a chess.com blog table from a text file -- or even get a verbatim/code environment, tell me!):


