Math People Only!: Changes to how much ratings change...

Avatar of dsji

http://math.bu.edu/people/mg/ratings/rating.system.pdf

This is the Glicko system as implemented by the USCF.

We may want to remove section 3.1 regarding match play, but I think the restrictions on playing within +/-400 points of current rating are doable. Since tables of expected change are available in the public domain, the changes that Glicko-2 made (a volatility index that can be used to measure confidence in the RD and the expected rate of change) can be implemented easily.

Set the rating period to an average of thirty games. After thirty games, convert the ratings to Glicko-2. This can be used as a way to spot potentially suspicious players for further scrutiny, as they become outliers before they become outrageously high rated. It also allows ratings to be adjusted over time in a more evenhanded fashion, as opposed to allowing earlier performances to have a greater effect.

http://math.bu.edu/people/mg/glicko/glicko2.doc/example.html

This is Glicko-2.

Avatar of Kacparov

I don't

Avatar of Baseballfan
phillliesarethebest wrote:

It makes sense in team sports. In baseball, for example, you have a team record and you have a winning percentage, and they are equal. Well, I think it should be the same for chess.


The assumption with sports teams is that teams in the same league have roughly the same talent. You don't have 25 guys straight out of HS playing 25 seasoned veterans. In chess, you can have players who are at completely different levels playing each other. If my rating is 1200 and I beat someone who is rated 2000, certainly that means more than if I beat someone rated 800, and a rating system like this one tries to reflect those differences.
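To put rough numbers on the 1200 vs. 2000 and 1200 vs. 800 example, here is a minimal sketch using the standard logistic expected-score curve that Elo-style systems share. It ignores RD entirely, which is a simplification; the function name `expected_score` is just an illustrative choice, not anything from the site's code.

```python
def expected_score(rating, opp_rating):
    """Expected score (between 0 and 1) for `rating` against `opp_rating`.

    Plain logistic curve on a 400-point scale; RD is ignored here.
    """
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


# A 1200 player is expected to score about 1% against a 2000 player...
upset = expected_score(1200, 2000)
# ...but about 91% against an 800 player.
routine = expected_score(1200, 800)

# The rating change is proportional to (actual score - expected score),
# so the upset win earns far more than the routine win.
print(round(1 - upset, 3), round(1 - routine, 3))
```

The surprise term (1 minus the expected score) is what makes the win over the 2000 player count for much more than the win over the 800 player.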

Avatar of ichabod801
jay wrote:

Alright, I have put the minimum K value back in, even though Mark Glickman had no idea why that was there. FICS does use it, though, and apparently he helped them implement their formulas. Without the min K value, the ratings of people with low RDs just don't move at all (like the computers in live chess). I've also created a calculator you can use to test various scenarios. Please, math people, get in there and run some tests and let me know if the output looks correct. It certainly doesn't feel correct at times.


Hey, Jay, sorry I took so long to check this out. I did some quick tests this morning, just 6 random tests, 3 of them constrained to give close rated players with low RDs. I ran the calculations through my Python version of your code and the web calculator, and the calculations check out. Subjectively, the changes look fine to me as well, but the real test is how they play out over time.

Avatar of zankfrappa

     I know you all are busy, but I wanted to mention something about my Tactics Trainer rating.  I went 14-10 today and my rating still went down!!!
     I have now lost about 600 points since October 1st, down from 2295.  I lose about 16 points for a miss and gain about 8 points for a correct answer.
     It seems we have gone from too drastic a change per problem to too little; perhaps there is a happy medium.
     Thank you.
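As a rough check of the arithmetic in that post, here is a small sketch that assumes the approximate figures quoted (about +8 per correct answer, about -16 per miss) apply to every problem. In practice the change varies with each problem's rating, so this is only an estimate; the function names are illustrative.

```python
def net_change(correct, missed, gain=8, loss=16):
    """Net rating change under a flat gain per solve and flat loss per miss."""
    return correct * gain - missed * loss


def break_even_rate(gain=8, loss=16):
    """Fraction of problems that must be solved for the rating to stay level."""
    return loss / (gain + loss)


print(net_change(14, 10))   # 14*8 - 10*16 = -48
print(break_even_rate())    # 16/24, i.e. two thirds
print(14 / 24)              # actual success rate, about 58%
```

Under those assumed figures a 14-10 session loses points because the break-even success rate is two thirds, and 14 out of 24 falls short of that.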

Avatar of Rookbuster

This might sound like a dumb question, but if your RD is lower, does that mean that you are actually closer to the rating that is shown? For example, I'm currently rated 1638 with an RD of 56; that would mean my rating strength is between 1526 and 1750, but a friend has nearly the same rating with an RD of 79.

Avatar of ExtraBold

Rookbuster, the RD is like a standard deviation, so your "true" rating would have a 95% probability of being within twice the RD of your rating, in your case between 1526 and 1750.
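The "twice the RD" rule is simple enough to sketch. The friend's rating is assumed here to be exactly 1638 for comparison (the post only says "nearly the same"), and `rating_interval` is an illustrative name.

```python
def rating_interval(rating, rd, z=2):
    """Return the (low, high) band of rating +/- z * RD.

    z=2 gives the approximate 95% band described above.
    """
    return rating - z * rd, rating + z * rd


print(rating_interval(1638, 56))  # (1526, 1750)
print(rating_interval(1638, 79))  # (1480, 1796)
```

So the lower RD does not move the rating itself; it just narrows the band around it, from a width of 316 points down to 224.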

 

Staff, going forward, is the data accessible (to you) to see whether, say, the average score against players 100 or 200 points higher, across a large sample of games, is similar to the expected score for such games predicted by the Glicko formulae? This could give some feedback on how good our sigma is, and be a cheap and cheerful alternative to the intensive data analysis that might be used to estimate sigma.
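A minimal sketch of the kind of check described above: bucket games by rating difference and compare the average actual score with the average predicted score in each bucket. The game records below are made-up placeholders, the prediction uses the plain logistic curve without any RD adjustment (a simplification of the Glicko formula), and all names are illustrative.

```python
from collections import defaultdict


def expected_score(rating, opp_rating):
    """Logistic expected score on a 400-point scale (RD ignored)."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


def calibration_table(games, bucket_width=100):
    """Compare actual and predicted scores per rating-difference bucket.

    games: iterable of (rating, opp_rating, score), score in {0, 0.5, 1}.
    Returns {bucket_start: (count, mean_actual, mean_expected)}.
    """
    buckets = defaultdict(lambda: [0, 0.0, 0.0])  # count, actual, expected
    for rating, opp_rating, score in games:
        # Floor division puts a difference d into [k*width, (k+1)*width).
        key = ((opp_rating - rating) // bucket_width) * bucket_width
        entry = buckets[key]
        entry[0] += 1
        entry[1] += score
        entry[2] += expected_score(rating, opp_rating)
    return {k: (n, a / n, e / n) for k, (n, a, e) in sorted(buckets.items())}


# Placeholder data only; a real check needs a large sample of games.
sample = [(1500, 1600, 0), (1500, 1650, 1), (1400, 1620, 0), (1500, 1500, 0.5)]
table = calibration_table(sample)
for bucket, (n, actual, predicted) in table.items():
    print(bucket, n, round(actual, 3), round(predicted, 3))
```

If the actual and predicted columns drift apart systematically in the 100 and 200 buckets over many games, that would be the cheap signal that the parameters need another look.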

Avatar of Atos
Rookbuster wrote:

This might sound like a dumb question, but if your RD is lower, does that mean that you are actually closer to the rating that is shown? For example, I'm currently rated 1638 with an RD of 56; that would mean my rating strength is between 1526 and 1750, but a friend has nearly the same rating with an RD of 79.


Your rating is what it is; it is exactly 1638. The RD is supposed to indicate the degree of uncertainty about your real "playing strength." The questions that could arise here are, e.g., whether uncertainty can be measured with mathematical precision, whether ratings can ever be an accurate measurement of playing strength (which is also not fixed but can vary greatly from day to day), and whether factoring in things other than the ratings themselves will help to measure playing strength more accurately.

Avatar of cookie3

Why is the starting rating for players set at 1200? In another article, it was stated that the average rating at chess.com was just over 1300; shouldn't this then be the starting number? Also, there are a great many people who will give up on the game when their own continued growth requires work, so why not set the beginning number of games to a higher amount? Instead of 10 games, maybe raise that number to, say, 100 games. This way, beginners don't affect experienced players as greatly, beginners' ratings would be more accurate, and the RD would possibly be lower. Anyway, thanks to Chess.com! All is greatly appreciated!

Avatar of Muhammad333

I think that http://www.chess.com/ should use Elo ratings instead of Glicko.

Avatar of catholicbatman
shadowc wrote:

I agree on NOT computing a rating for new members until several games are played, so if I play a new member who is bound to be rated 2100 in the future and he beats me, I'm not going to lose 300 points to him if he's still 1200 when I play.

As for other math, I'm not specialist.


Good point there. I know what you mean about losing to someone who is bound to be super good.

Avatar of ExtraBold

Glicko covers that. If you play a new member they have a high RD, so your rating changes only a little.
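A minimal sketch of that effect, using the one-game form of the published Glicko-1 update. This is not the site's actual code (it has no minimum K value, for instance), and the ratings and RDs in the example are made up for illustration.

```python
import math

Q = math.log(10) / 400.0


def g(rd):
    """Glicko-1 attenuation factor: shrinks as the opponent's RD grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko1_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; returns (new_rating, new_rd)."""
    g_j = g(opp_rd)
    e = 1.0 / (1.0 + 10 ** (-g_j * (r - opp_r) / 400.0))
    d2_inv = Q**2 * g_j**2 * e * (1.0 - e)  # this is 1 / d^2
    denom = 1.0 / rd**2 + d2_inv
    new_r = r + (Q / denom) * g_j * (score - e)
    new_rd = math.sqrt(1.0 / denom)
    return new_r, new_rd


# Hypothetical established player (1500, RD 60) loses to a 1200-rated
# opponent: once a brand-new one (RD 350), once an established one (RD 50).
after_new, _ = glicko1_update(1500, 60, 1200, 350, 0)
after_est, _ = glicko1_update(1500, 60, 1200, 50, 0)
print(round(1500 - after_new, 1), round(1500 - after_est, 1))
```

In this example the loss to the high-RD newcomer costs noticeably fewer points than the same loss to an established 1200, because `g` discounts results against uncertain ratings.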

Avatar of pdela

Maybe for online chess the rating should start at 1400.

Avatar of mathijs

Vance, it's an interesting idea, but it suspends the notion of a rating system as a system of assessing playing strength. I know most people don't think of ratings that way anymore, but that's what they're set up to do; what they're designed for.

Avatar of mathijs

The fact that unrated games can be construed as being at one extreme of your suggested continuum is not really relevant. The rating system you suggest is not an assessment of playing strength, whereas the Elo and Glicko systems (and the like) are. The unrated games are just not part of those systems.

Avatar of Puchiko

I remember that when playing Go on KGS, the site would have you set up your initial approximate rating. Of course, if you lied and set up a high one, you'd quickly be shot down by the high rated players who wanted to play with you, and overall, the system was quite accurate.

Starting everyone out at 1200 doesn't really help accuracy: a player of 1800 strength will take some time to get to that rating, especially in live chess, where it's hard for such people to face strong opponents.

Avatar of mathijs

Puchiko, that would probably lead to rating inflation, with people often overestimating their own rating and those extra points pushing up average ratings.

Avatar of Puchiko

I don't think people have a malicious reason to lie. A complete beginner is somewhere around 800-1000 rating strength, and he is likely to click on an "I'm a beginner" option; a lie benefits him in no way.

Of course, people new to the site won't be able to estimate their own rating so we could give verbal options regarding your chess experience.

Serious players could fill in official ratings, and overall, it might be a better system.

Avatar of Atos

Right, and people cheat in all kinds of ways to increase their rating, but of course if asked to fill in their rating everyone would be sure to tell the truth. Further, despite any other impression you may get from the forums and from chat, everyone here is able to estimate their playing strength correctly and objectively. And we are living in a perfect world, the best of all possible worlds, and everything always functions exactly as we wanted it to.

Avatar of Puchiko

My opinion is that most people would have no incentive to lie and would be truthful.

I'm not suggesting a low RD, so the few liars would still find themselves at their real playing strength soon enough.

But of course, I don't have much to back up my views with. Yet neither do you.

I consider it worth a shot, but I might be very wrong.