Math People Only!: Changes to how much ratings change... - Chess Forums - Page 8

erik · 2009-10-15T15:48:58-07:00

Ok. There has been talk that ratings move up/down too much. It doesn't bother me, but I know it bugs some people. As some of you know, we use the Glicko system (http://en.wikipedia.org/wiki/Glicko_rating_system and http://math.bu.edu/people/mg/research/gdescrip.pdf). Here is what we use to start:

mathijs

Apr 23, 2010

#141

Puchiko, that would probably lead to rating inflation, with people often overestimating their own rating and those extra points pushing up average ratings.

Puchiko

Apr 23, 2010

#142

I don't think people have a malicious reason to lie. A complete beginner is somewhere around 800-1000 rating strength, and he is likely to click on an "I'm a beginner" option; a lie benefits him in no way.

Of course, people new to the site won't be able to estimate their own rating, so we could give verbal options regarding their chess experience.

Serious players could fill in official ratings, and overall, it might be a better system.

Atos

Apr 23, 2010

#143

Right, and people cheat in all kinds of ways to increase their rating, but of course if asked to fill in their rating everyone would be sure to tell the truth. Further, despite any other impression you may get from the forums and from chat, everyone here is able to estimate their playing strength correctly and objectively. And we are living in a perfect world, the best of all possible worlds, and everything always functions exactly as we wanted it to.

Puchiko

Apr 23, 2010

#144

My opinion is that most people would have no incentive to lie and would be truthful.

I'm not suggesting a low RD, so the few liars would still find themselves at their real playing strength soon enough.

But of course, I don't have much to back up my views with. Yet neither do you.

I consider it worth a shot, but I might be very wrong.

Atos

Apr 23, 2010

#145

Even disregarding the fact that some people would lie about it, most players cannot estimate their playing strength objectively or accurately.

mathijs

Apr 23, 2010

#146

I don't think it matters much if people don't know their real rating: as long as they make an unbiased guess, Puchiko's system would be an improvement. However, I think there will be some bias toward overrating. I agree that I have no evidence to back that up, but Puchiko's claim that people would be unbiased estimators is much stronger than mine. Only a few people would have to boast for the bias to occur. However, there may be inflation anyway, so, acknowledging that, the question becomes more real. My gut feeling is that the system is unworkable because it is so easy to cheat, and to cheat by a lot.

Outside ratings might be used as an unbiased estimator (perhaps with some conversion method based on average ratings on chess.com of players with similar outside ratings), but I'm not sure it's worth the trouble, because it would require people to verifiably identify themselves.

Matthew11

Apr 27, 2010

#147

What I know is that not everyone who comes here has a rating of 1200 [mine was about 650], so there could be 25-30 test games to see what your starting rating is. Say you win against an 1100 player, lose to a 1300 player, draw with a 1150 player, beat a 1250 player, lose to a 1350 player, and draw your last game with a 1150 player.

So, you take the ratings of the players you beat, 1100 and 1250, and average them: 1175.

And the players you lost to, 1300 and 1350: 1325.

And your draws, 1150 and 1150: 1150.

Now you average all three, and your rating is 1217.
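The arithmetic in this proposal can be checked with a few lines of Python. This is only a sketch of the scheme as described in the post; the function name `provisional_rating` and its argument names are made up for illustration.

```python
from statistics import mean

def provisional_rating(wins, losses, draws):
    """Average opponents' ratings within each result group,
    then average the group means (empty groups are skipped)."""
    group_means = [mean(group) for group in (wins, losses, draws) if group]
    return mean(group_means)

# The six test games from the post above.
rating = provisional_rating(
    wins=[1100, 1250],
    losses=[1300, 1350],
    draws=[1150, 1150],
)
print(round(rating))  # 1217
```

Note that in this scheme the result only decides which group an opponent's rating lands in: the three group means are weighted equally, so a loss to a 1300 player pulls the estimate up exactly as a win over a 1300 player would.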

Elroch

Apr 27, 2010

#148

We could scrap Glicko and use any reasonable low-pass filter on a stepped graph whose points are the instantaneous ratings for each game (i.e. the opponent's rating, +400 for a win or -400 for a loss), with the time between games being entirely ignored. The "time" constant of the filter would indicate roughly how many games are being incorporated into the current rating. As a choice of filter, I like a semi-Gaussian.
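A rough sketch of what such a filter might look like. Several things here are assumptions rather than anything stated in the post: "semi-Gaussian" is read as a one-sided Gaussian weighting over "games ago", a draw is scored as the opponent's rating with no offset, and the names `instantaneous_ratings` and `half_gaussian_rating` are invented for the example.

```python
import math

def instantaneous_ratings(games):
    """games: list of (opponent_rating, score), oldest first.
    score is 1 for a win, 0 for a loss, 0.5 for a draw (draw handling is an assumption).
    Gives opponent +400 for a win, -400 for a loss, +0 for a draw."""
    return [opp + 800 * (score - 0.5) for opp, score in games]

def half_gaussian_rating(games, time_constant):
    """Weighted mean of the per-game ratings. The newest game has weight 1 and
    older games decay as exp(-0.5 * (games_ago / time_constant) ** 2)."""
    if not games:
        raise ValueError("need at least one game")
    points = instantaneous_ratings(games)
    newest = len(points) - 1
    weights = [
        math.exp(-0.5 * ((newest - i) / time_constant) ** 2)
        for i in range(len(points))
    ]
    return sum(w * p for w, p in zip(weights, points)) / sum(weights)

history = [(1500, 0), (1550, 1), (1600, 0.5), (1580, 1)]
print(round(half_gaussian_rating(history, time_constant=10)))
```

A larger `time_constant` averages over more games and responds more slowly; a small one tracks recent form but is noisier.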

gwhuebner

Oct 14, 2010

#149

Since the question arose whether it is possible to achieve a very high rating by playing only against average-rated opponents, I did some tests with the Glicko system:
Suppose a player with an initial rating of 1500 and an RD of 200 plays one game every day (no RD decay). Further suppose that he always chooses an opponent with a rating of 1500 and an RD of 200, too, and wins each game. His rating increases after each game, but after 202 games it saturates at 2183 (if evaluated according to the Glicko system) and will not rise with any further such games. His RD also settles at a final value of 23.25 after 1905 games. The saturation values depend on the opponent's RD: if in the above example the opponent's RD is always 50 (instead of 200), the rating saturates at 2097 after just 172 games and the RD stays constant at 21.43 after 1771 games played.
This holds if the rating is rounded to an integer and the RD is rounded to 2 decimals after each calculation.

The conclusion, conversely, is that if your rating is about 2000 to 2100, you will hardly be able to improve it by beating opponents rated 1500 or below.
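For anyone who wants to rerun this experiment, here is a minimal sketch. It assumes the single-game Glicko-1 update from Glickman's description linked in the original post, applied once per game with no RD growth between games, and it rounds the rating to an integer and the RD to 2 decimals after each game as described above. The function names are illustrative, and whether the exact figures quoted above come out depends on rounding details, so none are asserted here.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant

def g(rd):
    """Glicko-1 attenuation factor for an opponent with rating deviation rd."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def glicko_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; returns the unrounded (rating, RD)."""
    g_opp = g(opp_rd)
    expected = 1 / (1 + 10 ** (-g_opp * (r - opp_r) / 400))
    inv_d2 = Q**2 * g_opp**2 * expected * (1 - expected)
    new_var = 1 / (1 / rd**2 + inv_d2)
    new_r = r + Q * new_var * g_opp * (score - expected)
    return new_r, math.sqrt(new_var)

def win_streak(opp_rd, games=3000, r=1500, rd=200.0, opp_r=1500):
    """Win every game against an identical opponent.
    Returns (final rating, final RD, game number of the last rating change)."""
    last_rating_change = 0
    for n in range(1, games + 1):
        new_r, new_rd = glicko_update(r, rd, opp_r, opp_rd, score=1)
        new_r, new_rd = round(new_r), round(new_rd, 2)
        if new_r != r:
            last_rating_change = n
        r, rd = new_r, new_rd
    return r, rd, last_rating_change

for opp_rd in (200, 50):
    print(opp_rd, win_streak(opp_rd))
```

The rating stops moving once a win is worth less than half a point, which happens when the expected score is close to 1 and the player's own RD has shrunk.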

ExtraBold

Oct 14, 2010

#150

And you probably would have to be that good to win 1905 games in a row.

dave_9990

Nov 16, 2010

#151

[COMMENT DELETED]

doomsuckle

Nov 16, 2010

#152

Hi,

I might actually play with this some tonight after we freeze a paper for review. In principle, a new player's rating should not heavily affect the opponent's rating (in any system) because the rating could be anywhere on the "rating space."

The interesting part of the rating problem is that the upper bound is a nuisance parameter, because we don't know whether chess is solvable (for a win) or not. If it is solvable, it's sufficiently complex that the solution is not currently calculable, so the scale runs from "the two best players with perfect play will always draw" to "the worst player will lose -- really really fast, in a way that will basically always be the infamous 2-move checkmate."

I'd think that the way to really "tune" the starting parameters would be to train a system of N (let's say 2, 10, 20, 100, 1000) players where the "true rating" is known, start them all from scratch, and see if the observed rating converges to the "true rating." Of course, since this is Monte Carlo, we need to add a few more levels of complexity:

1) the case where there is no improvement

2) the case where some players improve and others get worse.

3) the case where some players improve more rapidly than others.

I would expect that if a player is improving relative to the nearest player, the observed rating will somewhat lag the true rating.

Anyway, the provisional rating is just a prior with infinite error that will get smaller (relative to the error on opponents' ratings) over time.
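A bare-bones version of that Monte Carlo experiment, covering only case 1 (no improvement). Everything specific here is an assumption made for the sketch: results are drawn from the usual logistic win probability on the hidden true ratings, there are no draws, pairings are uniformly random, every player starts at 1500 with RD 350, and ratings are updated game by game with the Glicko-1 formulas and no RD growth over time. The names `update` and `simulate` are illustrative.

```python
import math
import random

Q = math.log(10) / 400

def g(rd):
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; returns the new (rating, RD)."""
    g_opp = g(opp_rd)
    expected = 1 / (1 + 10 ** (-g_opp * (r - opp_r) / 400))
    new_var = 1 / (1 / rd**2 + Q**2 * g_opp**2 * expected * (1 - expected))
    return r + Q * new_var * g_opp * (score - expected), math.sqrt(new_var)

def simulate(true_ratings, games, seed=0, start=(1500.0, 350.0)):
    """Play random pairings; outcomes depend only on the fixed true ratings."""
    rng = random.Random(seed)
    est = [start for _ in true_ratings]
    for _ in range(games):
        a, b = rng.sample(range(len(true_ratings)), 2)
        p_a = 1 / (1 + 10 ** ((true_ratings[b] - true_ratings[a]) / 400))
        score_a = 1.0 if rng.random() < p_a else 0.0
        # Both updates use the pre-game estimates.
        new_a = update(*est[a], *est[b], score_a)
        new_b = update(*est[b], *est[a], 1.0 - score_a)
        est[a], est[b] = new_a, new_b
    return est

true_ratings = [1000, 1400, 1800, 2200]
for true, (r, rd) in zip(true_ratings, simulate(true_ratings, games=2000)):
    print(true, round(r), round(rd, 1))
```

Because only rating differences affect results, the observed ratings can at best match the true ones up to a common shift, so it is the gaps between players that should be compared. Cases 2 and 3 would need `true_ratings` that drift during the run, plus some mechanism to keep RD from collapsing.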

ronarprfct

Sep 27, 2023

#153

meniscus wrote:

For those of you who know something about statistics, those last intervals are not confidence intervals, but are called central posterior intervals because the derivation came from a Bayesian analysis of the problem. These numbers are found from the cumulative distribution function of the normal distribution with mean = current rating, and standard deviation = RD. For example, CDF[ N[1600,50], 1550 ] = .159 approximately (that's shorthand Mathematica notation.)

Perhaps you can answer the question I thought of: What is the mathematical justification for having the standard deviation of playing strength change with game density per time? It seems sort of bogus to me. What evidence/reasoning led to Dr. Glickman's treatment of the standard deviation in this way? I've seen dead links to the derivation of the system, but no live links.
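The figure in the quoted passage from meniscus can be reproduced with Python's standard library instead of Mathematica. The rating 1600 and RD 50 come from the quote; the 95% level used for the central interval is an assumption, since the quoted text does not say which level the intervals use.

```python
from statistics import NormalDist

rating, rd = 1600, 50
belief = NormalDist(mu=rating, sigma=rd)

# Probability that the true rating is below 1550 (one RD under the current rating).
p_below_1550 = belief.cdf(1550)
print(round(p_below_1550, 3))  # 0.159

# Central interval holding 95% of the probability mass (the level is an assumption).
low, high = belief.inv_cdf(0.025), belief.inv_cdf(0.975)
print(round(low), round(high))  # 1502 1698
```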

Elroch

Sep 27, 2023

#154

You just replied to a post from 2009, so a reply is not guaranteed!

In lieu of that, the purpose of the standard deviation changing is to quantify changing uncertainty in the true underlying rating. Every new result is data that results from the combination of two things - the true strength of the player and random variation in results. This data is being combined with an estimate of rating consisting of two things - the actual estimated rating and the uncertainty in that estimate.

When combining two estimates of a fixed statistic, the degree to which each affects the combined estimate depends on its uncertainty, so the more uncertainty (higher RD) in the existing estimate, the more the new data (with fixed randomness) affects the new estimate. Statistically, as well as a new estimated rating, the combination produces a new lower uncertainty (RD) in the new estimate (because it now incorporates more data).

Note that the reason RD is not allowed to tend to zero over time is that it is also assumed that ratings are not absolutely fixed - they can change over time - so the model needs to be able to respond to that.
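The two ideas in this answer can be illustrated numerically. This is a generic sketch, not chess.com's implementation: `combine` is the textbook precision-weighted combination of two independent Gaussian estimates, and `inflate_rd` follows the RD growth step in Glickman's description, where the constant `c`, the number of periods and the 350 cap are illustrative values chosen here.

```python
import math

def combine(mean_a, sd_a, mean_b, sd_b):
    """Precision-weighted combination of two independent Gaussian estimates."""
    w_a, w_b = 1 / sd_a**2, 1 / sd_b**2
    mean = (w_a * mean_a + w_b * mean_b) / (w_a + w_b)
    return mean, math.sqrt(1 / (w_a + w_b))

def inflate_rd(rd, periods, c, rd_max=350.0):
    """RD grows with elapsed rating periods, because true strength may have drifted."""
    return min(math.sqrt(rd**2 + c**2 * periods), rd_max)

# Same new evidence (1700 with sd 100) against an uncertain and a confident prior.
print(combine(1500, 200, 1700, 100))  # mean 1660: the uncertain prior moves a lot
print(combine(1500, 50, 1700, 100))   # mean 1540: the confident prior barely moves

# An idle player's RD creeps back up (c=30 is an arbitrary illustrative value).
print(round(inflate_rd(50, periods=10, c=30), 1))
```

In both combinations the resulting standard deviation is smaller than either input, which is the "new lower uncertainty" described above; the inflation step is what stops it from shrinking to zero.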

ronarprfct

Sep 27, 2023

#155

Thank you for your answer!!!

samplayside

Oct 9, 2023

#156

Can someone please explain to me, as if I'm a 10-year-old, what these values actually mean:
KValue =
CValue =
PValue =
QValue =

Elroch

Oct 10, 2023

#157

Provide a link to the source, @samplayside.

BlueHen86

Oct 10, 2023

#158

samplayside wrote:

Can someone please explain to me, as if I'm a 10-year-old, what these values actually mean:
KValue =
CValue =
PValue =
QValue =

No, you have to be at least 11.

samplayside

Oct 10, 2023

#159

Elroch wrote:

Provide a link to the source, @samplayside.

It's literally the OP.