Fix the rating system

Avatar of Tacomeats

Rapid is the #1 most played mode. How many times do we need to tell people this? Danny Rensch has said numerous times that rapid is the most played.

Avatar of dpnorman

With the exception of a semi-beginner friend of mine, I don't know of anyone who plays in the rapid pool on this website more than once in a blue moon lol

But I imagine people who are just learning chess would most want to play a rapid time control, so it makes some sense if you consider 10min to be rapid

Avatar of CausalityD

Inconceivable.... did someone really complain about the addition of visible ranks in something other than an FPS title... I'm in favor of having ranks.

Avatar of shady_neighbour
BKPete wrote:

Replying to why I think the rating system is not working - a few reasons in priority order:

1. I play 15 min games in rapid and 5 min games in blitz. My rapid rating is 1200 and stable; my blitz rating is 800 and stable. I don't believe this difference can be explained by the different nature of the game.

2. If the rating system works but somehow settles to different absolute values, the relative rating should still be accurate, so I looked at my percentile: in blitz I am at around 50%, in rapid around 90%. Again, I don't believe this difference can be explained by the different style of game.

3. I think the basis of the Glicko system is a clever idea - ratings move more rapidly for an unestablished player with an uncertain rating than for stabilised players. I suspect the system only works if most established players have an accurate rating, but we never get to that point because the mass of established players are playing other established players with inaccurate (near random) ratings in the middle rating bands.

4. I wonder if starting unestablished players with a rating of 1200 (which is around the 90%-ile) might be too high and cause the ongoing inaccurate ratings for the mass of players. Above ratings of 1200 perhaps the rating system works, but below it appears not. Perhaps unestablished players should start at around the 50%-ile, i.e. 700 to 800 or lower.

5. I wonder if the system is not generating accurate ratings for the mass of players because, although the ratings of unestablished players vary rapidly at first (because of low confidence), the rating of an established player moves the same amount even when they play an unestablished player (when there is doubt that their opponent's rating is accurate). It would seem better that both ratings are affected by a lack of confidence in one player's rating; possibly the rating of established players shouldn't change at all after playing an unestablished player. (It would be interesting to test in the background how the stability of established players' ranks differs between the two methods.)

5. Anecdotally, I seem to have a different win percentage at different times of the day. Could this be true generally & is it an indicator of the relative number of established players vs unestablished players playing at those times?

6. Checking the histograms of player ratings in global stats - the number of players in blitz and rapid is much higher than the sum of the number of players in each 100 point rating band. What is going on here? Perhaps it is just some minor error, but is it an indicator of some wider problem in the rating system?

7. I play the chess computer on the site occasionally and find I can reliably beat the computer with a rating of 1500. Have the chess computers been calibrated against people with reliable ratings? If so, is my rating 800, 1200 or 1500? Not a big issue this - but it adds to the lack of confidence in the overall system.

8. Has the whole rating system been calibrated against people with reliable ratings (e.g. from club tournament play, using volunteers from chess.com across the rating bands)? It might be a good method to establish whether the rating system is working and lead to fixes. The ELO system and Glicko system are effectively the same barring this method for adjusting the ratings of unestablished players. (However, some bias will exist: the ELO system won't ever lose or gain points, since winners and losers add/deduct the same number of points; Glicko does not guarantee this, so some bias will exist, especially if unestablished players are given an initial rating different from the average of all players.)

9. It would be interesting to experiment with other dimensions of confidence in a player's rating beyond just the number of games they have played - e.g. subscribers to the site rather than non-paying users; more use of the AI cheat engine etc.

Overall, I have lost confidence in the Glicko system - just because it sounds like a good idea & is an 'industry standard' doesn't mean it works. Chess.com will have vast amounts of data to prove/disprove it works & experiment with variations - testing alternatives before implementing. Glicko has parameters T and C which need to be determined / not pre-defined - have alternative values of these been tested? This would show how sensitive the rank ratings are to different values of these parameters. Is the overall system inherently stable or inherently chaotic - I suspect the latter as things stand.

1. You play 15+10 in rapid and 5+5 or 3+0 in blitz. These time controls are vastly different, and you having a much higher winrate in rapid games shows that your playing strength depends on time control. It should also be noted that ratings measure strength relative to the player base, which is not the same for rapid and blitz.

2. There are no absolute values in the rating system. Everything is measured relative to other players. You assume that the difference in your rapid and blitz ratings is explained by the system not working properly, but it can be explained by your relative skill level simply being higher in rapid. As I mentioned earlier, your overall results in rapid compared to blitz support the latter. Not believing this explanation is not an argument against it.

3, 4 & 5. I don't understand your assumption about most ratings being inaccurate. Your first two points are not compelling proof of that.

Also on 5. Chesscom doesn't actually use a pure Glicko system, where the rating changes are always dependent on both players' rating deviations. Don't know why that is, but I agree that it is a strange choice.

6. I would assume that is because chesscom only includes "active" players in the diagram, but I neither have concrete proof of that nor do I know how "active" players are determined. Nevertheless, I don't see how this would indicate a problem with the rating system specifically instead of an issue in general (if an issue at all).

7. The bots with predetermined and static rating are not a reliable source of information.

8. While the different changes in rating can result in the overall amount of rating points in the pool changing, so does the simple introduction of a new player. I agree that this can result in rating inflation over time. However, given the relative nature of ratings, this will only present issues when trying to compare skill levels between different periods of time based on ratings, which is already problematic because the player base also regularly changes.

9. I fail to understand how having a subscription to the site increases confidence in one's rating. The rating deviation is a purely statistical measurement and it makes no sense for such arbitrary elements to affect it.
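
To make the point about rating deviations concrete, here is a minimal sketch of a single-game update using the published Glicko-1 formulas. It is not chess.com's implementation (which, as said above, is not pure Glicko), and the function names plus the example ratings and RDs are made up purely for illustration.

```python
import math

Q = math.log(10) / 400  # scaling constant from the Glicko-1 formulas


def g(rd):
    """Damping factor: the less certain the opponent's rating, the smaller g."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected score against an opponent with rating r_opp and deviation rd_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko1_update(r, rd, r_opp, rd_opp, score):
    """Return (new_rating, new_rd) after one game; score is 1, 0.5 or 0."""
    e = expected_score(r, r_opp, rd_opp)
    d2 = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd**2 + 1 / d2
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd


# Hypothetical example: an established player (RD 50) beats a newcomer (RD 350),
# both rated 1000 before the game.
est_change = glicko1_update(1000, 50, 1000, 350, 1.0)[0] - 1000
new_change = glicko1_update(1000, 350, 1000, 50, 0.0)[0] - 1000
# Same win, but against another established player (RD 50).
est_vs_est_change = glicko1_update(1000, 50, 1000, 50, 1.0)[0] - 1000

print(f"established vs newcomer:    {est_change:+.1f}")
print(f"newcomer vs established:    {new_change:+.1f}")
print(f"established vs established: {est_vs_est_change:+.1f}")
```

In this sketch the established player moves by only a few points, and by less against the uncertain newcomer than against another established player, while the newcomer moves by far more. The two changes therefore do not cancel out the way they would in ELO with a shared K-factor.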

Avatar of AJHopper
Lol he's just sour
Avatar of BKPete

Has anyone seen any studies which validate the Glicko rating system e.g. by comparing on-line ratings with tournament ratings for players who play both?

It appears that the Glicko rating system on chess.com might be working for the higher rated players e.g. above 1200 to 1500 (although would be good to prove this with evidence rather than subjective opinion). But the vast majority of players have a rating in the range 600 to 1100 - does the system work in these ranges?

Avatar of BKPete

Just to add: I don't mean comparing the absolute value of the ratings between Glicko and ELO (although there is an argument they should be similar) - it's the relative comparison (percentile) which matters.

Avatar of shady_neighbour
BKPete wrote:

Has anyone seen any studies which validate the Glicko rating system e.g. by comparing on-line ratings with tournament ratings for players who play both?

It appears that the Glicko rating system on chess.com might be working for the higher rated players e.g. above 1200 to 1500 (although would be good to prove this with evidence rather than subjective opinion). But the vast majority of players have a rating in the range 600 to 1100 - does the system work in these ranges?

Ratings in different systems are not supposed to be comparable. Being 1700 FIDE classical means nothing more than being 1700 FIDE.

Avatar of BKPete

We don't need to compare the absolute value of the rating system - just the relative value. We can expect someone who is in the 90, 80, 70... percentile on any chess rating system to be consistent with other rating systems. 

If not, I think we have proof at least one of the systems doesn't work.

My hunch is the Glicko system on chess.com isn't working - at least for the mid-ranked players, i.e. in the 40% to 80%-ile. It would be interesting to prove or disprove this.

Avatar of technical_knockout

i'm an underrated 1851 USCF & 1700 average rating here with plenty of room for improvement so i guess it seems ok to me.  🙂

Avatar of kpman25

Hi, I think FIDE should give official ratings to chess.com players in some way. Like a qualifier or something.

Avatar of catmaster0
BKPete wrote:

We don't need to compare the absolute value of the rating system - just the relative value. We can expect someone who is in the 90, 80, 70... percentile on any chess rating system to be consistent with other rating systems. 

If not, I think we have proof at least one of the systems doesn't work.

We can't and we wouldn't. 

Avatar of KingPawnSmasher
Blitz has a massive player pool, bringing your percentile down. Most GMs, IMs etc. are playing blitz, not rapid, therefore crushing our percentile egos.

I do agree with starting new players out lower rated. I think everyone should actually start low and have to work their way up.

I started a new account for this to essentially do a rating climb from 400 to figure out where I truly am.
Avatar of Martin_Stahl
BKPete wrote:

We don't need to compare the absolute value of the rating system - just the relative value. We can expect someone who is in the 90, 80, 70... percentile on any chess rating system to be consistent with other rating systems. 

If not, I think we have proof at least one of the systems doesn't work.

My hunch is the Glicko system on chess.com isn't working - at least for the mid-ranked players, i.e. in the 40% to 80%-ile. It would be interesting to prove or disprove this.

If anything, I would think online pools will likely be more accurate. People generally have many more games played online than they do OTB.

Avatar of shady_neighbour
BKPete wrote:

We don't need to compare the absolute value of the rating system - just the relative value. We can expect someone who is in the 90, 80, 70... percentile on any chess rating system to be consistent with other rating systems. 

If not, I think we have proof at least one of the systems doesn't work.

My hunch is the Glicko system on chess.com isn't working - at least for the mid-ranked players, i.e. in the 40% to 80%-ile. It would be interesting to prove or disprove this.

We can't compare rating systems at all. The player bases are different.

You keep saying that Glicko doesn't work, but you have yet to prove it. And make no mistake - the burden of proof is on you.

Avatar of BKPete

I think I've already provided good evidence why it's not working. I'm surprised there seem to be no studies that have validated the system.

I agree the player bases are different but wholly disagree that such a difference can move someone from a 90%-ile to a 50%-ile. That makes no sense at all.

I suspect your experience (with a rating of 2200) is very different from mine (with a rating of 1200 - max). But this is important: the vast majority of people play around my rating level - the system needs to cater for all, not just the best 0.5% of players.

Avatar of CptObvious

Can only speak for myself, but my ratings on Chess.com have always been a match to my USCF ratings, more or less.

And I see many people saying the opposite, but my bullet and blitz ratings have always been much lower than my rapid rating. The slower the time control, the higher my rating. 

Avatar of shady_neighbour
BKPete wrote:

I think I've already provided good evidence why it's not working. I'm surprised there seem to be no studies that have validated the system.

I agree the player bases are different but wholly disagree that such a difference can move someone from a 90%-ile to a 50%-ile. That makes no sense at all.

Well, I don't think you have provided enough evidence to justify your claims. You seem to base everything on the difference between your own two ratings (namely blitz and rapid), but you haven't yet proven that the reason behind it is not a difference in your skill depending on time control. Moreover, you seem to be too focused on percentiles, but it only stands to reason that for someone with an average rating (800 for cc) a reasonable change in rating will cause a great change in percentile, since most players are around that rating level.
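
As a rough illustration of the percentile point, here is a sketch that assumes ratings are normally distributed with a mean of 800 and a standard deviation of 300. Both numbers and the bell shape are assumptions for the example, not chess.com data.

```python
from statistics import NormalDist

# Hypothetical rating pool: the mean, spread and normal shape are assumptions.
pool = NormalDist(mu=800, sigma=300)


def percentile(rating):
    """Share of the assumed pool rated below `rating`, in percent."""
    return 100 * pool.cdf(rating)


def percentile_shift(start, gain):
    """Percentile points gained by moving from `start` to `start + gain`."""
    return percentile(start + gain) - percentile(start)


# The same 200-point gain, once from the average and once from far above it.
mid_shift = percentile_shift(800, 200)
top_shift = percentile_shift(1800, 200)

print(f"800 -> 1000:  {mid_shift:.2f} percentile points")
print(f"1800 -> 2000: {top_shift:.2f} percentile points")
```

Under these assumptions a 200-point gain from the average moves a player by roughly 25 percentile points, while the same gain from 1800 moves them by well under one.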

Avatar of 4go10_legend
BKPete wrote:

The rating system on Chess.com is a total waste of time. It bears no resemblance to reality. It's a nice idea, but really ruins the site for me. I hope they can fix it. (If the managers of this site read this post - I'd be very happy to help them out - if they care to contact me.)

It's like you're telling a fireman to kill a fire using his fart

Avatar of BKPete
4go10_legend wrote:

It's like you're telling a fireman to kill a fire using his fart

I see lots of possibilities to fix this - but it would be data driven & that makes it dependent on chess.com since they have the data. I have offered to help and would like to help. I would like to have a rating system that works for all - that seems a great prize for everyone.