Math People Only!: Changes to how much ratings change...

ichabod801

As to the idea of not rating people for their first X games: how do you rate the games? If I have a rating, and I play a newbie without one, how does my rating change? The only way I can see doing it is the game is unrated for me (and if two new people play each other it's just not rated at all). What's the benefit? You don't lose points to good players without enough games for a solid rating. Points that you will get back anyway and that are minimized by the high RD of the new player. If you're that worried about it, don't play new people.

As to the idea of allowing people to "weight" the ratings by setting odds: bad idea. It will allow people to game the system, and the ratings will lose accuracy with a speed proportional to the lack of understanding about the ratings.

As to erik: I'm confused. What are you using K and p-value for? The only constants in the glicko calculation are q and c.

I agree with ExtraBold that the c value seems fine. By my calculations if a person with an RD of 50 finishes a game a week, their RD will rise maybe 3 points between games. For a person with an RD of 100 the rise would be 0.2. It would seem the only way to mess with it is modify q, but it's not immediately clear what the effect of that would be.

I'm going to look at q closer and post again later.

blackfirestorm
AMcHarg wrote:

I think we should remove starting rating completely and have players 'ungraded' until they have played 10 games!?.  From those 10 games we should calculate their performance rating and that should become their actual rating from which future rating calculations should take place.

 


This rating system is actually in place on another site which I play on and it works great. That way you can assess a player's true potential over those (5-10) games.

blackfirestorm
ichabod801 wrote:

As to the idea of not rating people for their first X games: how do you rate the games? If I have a rating, and I play a newbie without one, how does my rating change? The only way I can see doing it is the game is unrated for me (and if two new people play each other it's just not rated at all). What's the benefit? You don't lose points to good players without enough games for a solid rating. Points that you will get back anyway and that are minimized by the high RD of the new player. If you're that worried about it, don't play new people.

 


Players don't get a choice when playing team matches though. 

ichabod801
blackfirestorm666 wrote:
ichabod801 wrote:

As to the idea of not rating people for their first X games: how do you rate the games? If I have a rating, and I play a newbie without one, how does my rating change? The only way I can see doing it is the game is unrated for me (and if two new people play each other it's just not rated at all). What's the benefit? You don't lose points to good players without enough games for a solid rating. Points that you will get back anyway and that are minimized by the high RD of the new player. If you're that worried about it, don't play new people.

 


Players don't get a choice when playing team matches though. 


 So? They'll still get their points back. It's a self correcting system.

blackfirestorm
ichabod801 wrote:
blackfirestorm666 wrote:
ichabod801 wrote:

As to the idea of not rating people for their first X games: how do you rate the games? If I have a rating, and I play a newbie without one, how does my rating change? The only way I can see doing it is the game is unrated for me (and if two new people play each other it's just not rated at all). What's the benefit? You don't lose points to good players without enough games for a solid rating. Points that you will get back anyway and that are minimized by the high RD of the new player. If you're that worried about it, don't play new people.

 


Players don't get a choice when playing team matches though. 


 So? They'll still get their points back. It's a self correcting system.


Yes but it takes twice as long. 

zankfrappa

     A start RD of 350 with a startRating of 1200 seems too high.  Remember, "the
change is smaller when the player's RD is low".

ghostofmaroczy

I have played slow time control games such as 90+30 at another site.  My RD never gets below 90 because it is so difficult to find opponents and opportunities for such games.  At that other site I cannot shed the provisional label.  I just don't want to be penalized in any way for insisting on slow games.

ichabod801

After some basic fiddling around, it appears that the effect of changing q depends on the relative RDs of the players. The lower your RD, the less of an effect it has. The lower your opponent's rating, the more of an effect it has. In general, if you raise the q value, the RDs will go down faster. But (and this is a big one) if you raise the q value, the ratings change faster. So trying to use q to lower the volatility of the ratings increases the volatility of the ratings.
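To make that concrete, here is a minimal sketch of a single-game Glicko-1 update with q exposed as a parameter. It assumes the standard published formulas, and it assumes that "raising q" means substituting the new q everywhere it appears, including the expected-score term (with the standard q = ln(10)/400 that term is the usual base-10 formula). The function names are made up for illustration and this is not the site's code.

```python
import math

Q_STANDARD = math.log(10) / 400  # Glicko's usual q, roughly 0.00576


def g(rd, q):
    """Attenuation factor: shrinks the impact of an opponent with a high RD."""
    return 1.0 / math.sqrt(1.0 + 3.0 * q ** 2 * rd ** 2 / math.pi ** 2)


def expected_score(r, r_opp, rd_opp, q):
    """Expected score; equals 1/(1+10^(-g*(r-r_opp)/400)) when q is standard."""
    return 1.0 / (1.0 + math.exp(-q * g(rd_opp, q) * (r - r_opp)))


def glicko_update(r, rd, r_opp, rd_opp, score, q=Q_STANDARD):
    """Return (new_rating, new_rd) after one game; score is 1, 0.5 or 0."""
    g_opp = g(rd_opp, q)
    e = expected_score(r, r_opp, rd_opp, q)
    inv_d2 = q ** 2 * g_opp ** 2 * e * (1.0 - e)  # this is 1/d^2
    denom = 1.0 / rd ** 2 + inv_d2
    new_r = r + (q / denom) * g_opp * (score - e)
    new_rd = math.sqrt(1.0 / denom)
    return new_r, new_rd


# Two equally rated players (1500, RD 100); the first one wins.
for label, q in (("standard q", Q_STANDARD), ("doubled q", 2 * Q_STANDARD)):
    new_r, new_rd = glicko_update(1500, 100, 1500, 100, 1.0, q)
    print(f"{label}: rating change {new_r - 1500:+.1f}, new RD {new_rd:.1f}")
```

In this one scenario the doubled q gives both a lower new RD and a larger rating change, which matches the trade-off described above. Other rating gaps and RD combinations would need to be checked separately.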

That's all I can tell just looking at the formulas. I need to dig the original paper up and reread it.

Coach_Valentin

I agree that there's currently a significant deterrent for higher rated players to play in tournaments with lower-rated players, because there's always folks who are lower rated but bound for high ground, and a high RD means that I'd rather avoid facing such people, from which no one really wins.

 

In addition to the question of how to mitigate against ratings that go up by a lot, for new players to the site who are very strong but not yet highly-rated, there's also the opposite question of people who are strong and highly-rated, but for some reason have recently lost many points.  Here's the situation and related suggestion I have:

 

Observation: I've had several experiences where I won against strong, high-rated people and would have loved to have been rewarded appropriately for it, except that they had recently lost many points due to losing a bunch of games on time, so based on their rating at the time our game finished I actually was assumed to be playing a weaker player, which is not true at all.

Suggestion: Why not compute the resulting ratings from a given game not based on the latest ratings of those people but rather based on samples from the ratings the opponents had at different times throughout the game?  After all, we didn't play the game in the last day or so, it's often in the span of several months.  The current system assumes stability of ratings, which is only true in the very extreme case, and not for many players I've encountered over the past 18 months.
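One way to picture the suggestion, as a rough sketch only: record the opponent's rating at several points while the game is in progress and rate the result against the average of those samples instead of the final value. The function name and the sample numbers below are made up for illustration; nothing here reflects how the site actually rates games.

```python
from statistics import mean


def sampled_opponent_rating(samples):
    """Average of the opponent's ratings recorded at intervals during the game."""
    if not samples:
        raise ValueError("need at least one rating sample")
    return mean(samples)


# Hypothetical opponent: 2100 for most of the game, then a drop to 1900
# after a run of losses on time shortly before the game finished.
samples = [2100, 2100, 2100, 1900]
print(sampled_opponent_rating(samples))  # rate against this, not the final 1900
```

A plain mean treats every sample equally; weighting by how long each rating was held would be another option.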

General-Lee

You need to lower the start rating a few hundred points. Has anyone else noticed how inflated the ratings are? I know this would be tough on some good players who just made a new account, but it's simply too high of a provisional rating as most of the time it slips to 900 to 1000 within a few games.

zankfrappa

     Also, c represents a constant.  I am curious how a value of .2 was selected.
In Glicko, the constant is squared, so .2 x .2 =.04, so the number actually gets
smaller (or think of it as 1/5 x 1/5=1/25).  If the c were 2, for example, 2^2 of
course would be 4, a higher number.  I am not sure of how much "range" you
have to work with concerning the constant and some of these other values, I will
research further.

Eternal_Patzer

My Glicko RD is 140, which is about right, considering the handful of games I play at any given time.  One of my current opponents, however, has nearly 200 games going, over 2000 finished, and his RD is still 49.  That seems a little high, doesn't it? 

kokino

Well, I agree that with the current system, it takes 130 days to go from RD=50 to RD=100.... (And consequently, with my current RD=86, more than 5 years to go back to RD=350...:))
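A quick sketch of the arithmetic behind those two figures, assuming the standard Glicko inactivity step RD_new = sqrt(RD^2 + c^2 * t), with t counted in minutes and c = 0.2 as discussed in this thread. The function names are made up for illustration.

```python
import math

MINUTES_PER_DAY = 24 * 60


def rd_after_idle(rd, minutes, c=0.2):
    """RD after `minutes` without a finished game: sqrt(RD^2 + c^2 * t)."""
    return math.sqrt(rd ** 2 + c ** 2 * minutes)


def idle_days_to_reach(rd_from, rd_to, c=0.2):
    """Days of inactivity for RD to grow from rd_from to rd_to."""
    minutes = (rd_to ** 2 - rd_from ** 2) / c ** 2
    return minutes / MINUTES_PER_DAY


print(f"RD 50 -> 100: {idle_days_to_reach(50, 100):.0f} days")
print(f"RD 86 -> 350: {idle_days_to_reach(86, 350) / 365:.1f} years")
```

Under those assumptions the first line comes out at about 130 days and the second at roughly five and a half years.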

I also agree that the values:

KValue: 16 seems to me to be for Elo calculations, not Glicko???

and PValue: 0.000011, is this a value of time? a second?? (I don't know what you use it for in the Glicko calculations)


 

 

Well, you are using 1 minute for the time increment.... it means that with the current value of c: 0.2 I have to play (better said, finish) a game at least every 3 days to keep at least my current RD, correct?

However, there is something I don't understand, and maybe I am doing the calculations wrong... I am checking one of the most active players here:

AWARDCHESS, he is finishing more than 10-15 games every day and his Glicko RD is 46????

Well, if everything were correct, he should be at the minimum RD that the system allows or is set to.

bondiggity
kokino wrote:

Well, I agree that with the current system, it takes 130 days to go from RD=50 to RD=100.... (And consequently, with my current RD=86, more than 5 years to go back to RD=350...:))

I also agree that the values:

KValue: 16 seems to me to be for Elo calculations, not Glicko???

and PValue: 0.000011, is this a value of time? a second?? (I don't know what you use it for in the Glicko calculations)


 

 

Well, you are using 1 minute for the time increment.... it means that with the current value of c: 0.2 I have to play (better said, finish) a game at least every 3 days to keep at least my current RD, correct?

However, there is something I don't understand, and maybe I am doing the calculations wrong... I am checking one of the most active players here:

AWARDCHESS, he is finishing more than 10-15 games every day and his Glicko RD is 46????

Well, if everything were correct, he should be at the minimum RD that the system allows or is set to.


Yes, I think that a more realistic t-value, and an adjusted c to correspond to this new t would be a way to keep RD's more constant. 

amitprabhale

Thanks, Erik, for the information.

kurtgodden

I like meniscus' suggestion of "half rated" games when playing thematic openings and such, when you are not beginning from the standard starting position.  (I dislike his other suggestion of a toggle between Elo and Glicko.)

But assuming the implementation of the Glicko system is correct, even though I have also found some of my ratings fluctuations to be mildly irritating, if you end up changing it, you will just trade one set of irritations for another.  That's just how people are.  Don't even try to satisfy everyone.

Returning to the implementation, I applaud Erik for posting the initial parameter settings here because transparency can only be a good thing.  I would only add that the only way to "know" if the implementation is correct would be to similarly make public the code that implements Glicko on your site.  That way all the math and coding geeks can have at it to their heart's content, and if there is an error, believe me, they *will* find it!

deepOzzzie
_valentin_ wrote:

I agree that there's currently a significant deterrent for higher rated players to play in tournaments with lower-rated players, because there's always folks who are lower rated but bound for high ground, and a high RD means that I'd rather avoid facing such people, from which no one really wins.

 

In addition to the question of how to mitigate against ratings that go up by a lot, for new players to the site who are very strong but not yet highly-rated, there's also the opposite question of people who are strong and highly-rated, but for some reason have recently lost many points.  Here's the situation and related suggestion I have:

 

Observation: I've had several experiences where I won against strong, high-rated people and would have loved to have been rewarded appropriately for it, except that they had recently lost many points due to losing a bunch of games on time, so based on their rating at the time our game finished I actually was assumed to be playing a weaker player, which is not true at all.

Suggestion: Why not compute the resulting ratings from a given game not based on the latest ratings of those people but rather based on samples from the ratings the opponents had at different times throughout the game?  After all, we didn't play the game in the last day or so, it's often in the span of several months.  The current system assumes stability of ratings, which is only true in the very extreme case, and not for many players I've encountered over the past 18 months.


Um, in your last statement you said you would prefer games to be rated on a game-by-game basis. This would be fine; however, you would still have the problem with the RDs of new players. So it is not possible.

As for the Glicko system: it is a very good system, and based on the figures you have provided I cannot see any real reason why there would be large fluctuations in rating, other than if the individual took a long break from the game.

JollyPlayer
erik wrote:

FWIW, i think that a target RD for someone who plays a LOT of chess on Chess.com should be 30-50. i play quite a bit and my RD=81.


This is a ridiculously high number, since Dr. Glickman suggests it be used like a standard deviation.  The high number comes from using minutes for the time interval.  Glickman in his original paper suggests 2-month intervals as a unit.  Such a high RD is a quirk in Glickman's system.  He allows the user to pick the c value and the t value.  He also suggests that the RD be floored at 30.  That still makes a confidence interval of 120 points.  It is hard to believe that the rating of anyone who plays a lot of chess would vary that much.
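For reference, a small sketch of where the 120 comes from, assuming the RD is read as a standard deviation and the interval is taken as rating plus or minus 2 RD (a rounded stand-in for the 1.96 of a 95% interval). The function name and the example rating of 1500 are made up for illustration.

```python
def rating_interval(rating, rd, z=2.0):
    """Approximate 95% interval for true strength: rating +/- z * RD."""
    return rating - z * rd, rating + z * rd


low, high = rating_interval(1500, 30)  # RD at the suggested floor of 30
print(low, high, high - low)  # the last value is the interval width
```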

Erik plays a lot.  Do you really believe his rating varies by 360 points?

And I disagree with ExtraBold.  Provisional ratings keep people from feeding points into their rating on newbies.   Newbies who lose to higher rated players can actually have their score go UP.  

I have a personal example.  I tried to get my son to play on Chess.com.   He lost his first game and his rating fell horribly.  He never wanted to play again.

The Elo system is simple and elegant.  Glickman puts on his page at Boston University that he wants to be the "John Travolta of Statistics".  Yeah, right.  What does THAT mean?  Dr. Elo was a good chess player, Minnesota state champ several years.  He understood.  Dr. Glickman, on the other hand, stated his system needed more work in 2002 and we have had silence.

He is a professor of quality control in health care.  He wrote a chess paper and started a controversy.  Well, he did succeed in getting his name out.

There is no way Chess.com can change from the Glicko system.  The best they can hope to do is tweak it to be more fair.  There are players ranked 2900+ (Kasparov/Fischer like numbers) who are not masters or close to it OTB.

Elo has tons of advantages over Glicko.  Glicko assumes that time makes you a worse player.  Want your score to go up quickly?  Play a lower-rated player every few days or weeks and in between play the computer or study.  Your score will zoom upwards.

grandmaster56

Why don't we do a USCF rating system?

We can then use the idea of playing x number of games before your rating is determined, then plug it into the formula:

\begin{displaymath} R_{\mathrm{post}} = \frac{N R_{\mathrm{pre}} + m R_{\mathrm{avg}} + 400(W - L)}{N + m} \end{displaymath}

then Rpre is the current rating that they have on chess.com, and N is number of games that made this rating possible. Ravg is the average rating of all the person's opponents, W is wins, L is losses, and m is number of games, which in this case would be what "x" would be from the predetermined number of games before the rating is determined. Answer is rounded to nearest integer. So then( taken from this website that I'm looking at):

 

Example: Suppose a player rated 1500 based on 6 games competes against players rated 1400, 1550 and 1650, winning the first, losing the second and drawing the third. In this case, $R_{\mathrm{avg}} = (1400+1550+1650)/3 = 1533.33$, $m = 3$, $N = 6$, $W = 1$, $L = 1$, and $R_{\mathrm{pre}} = 1500$. Then, from the special rating formula,

 

\begin{displaymath} R_{\mathrm{post}} = \frac{6(1500) + 3(1533.33) + 400(1 - 1)}{6 + 3} = 1511 \end{displaymath}

 

 

The final result has been rounded from 1511.111.
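As a rough sketch of how the special rating formula above could be put into code (the function and argument names are made up for illustration, and draws simply count as neither a win nor a loss):

```python
def special_rating(r_pre, n_prior, opponent_ratings, wins, losses):
    """R_post = (N*R_pre + m*R_avg + 400*(W - L)) / (N + m), rounded."""
    m = len(opponent_ratings)  # number of games in this event
    if n_prior + m == 0:
        raise ValueError("need at least one game")
    r_avg = sum(opponent_ratings) / m if m else 0.0
    r_post = (n_prior * r_pre + m * r_avg + 400 * (wins - losses)) / (n_prior + m)
    return round(r_post)


# The worked example: rated 1500 after 6 games, then a win, a loss and a draw.
print(special_rating(1500, 6, [1400, 1550, 1650], wins=1, losses=1))
```

For a brand-new player, passing n_prior=0 makes the r_pre term drop out, so any placeholder value can be used for it.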

well instead of typing the rest of it out here, here's the website:

http://math.bu.edu/people/mg/ratings/approx/approx.html

then x=predetermined number of games=m

I'm sure that there's a counter on chess.com that knows everyone's game count to find N.

and then for the new members of chess.com, N=0 and Rpre=0.

 

There is probably some way to not use some of these formulas that are on the website, but even so it's very easy to put in the formulas into computer code.

edit: didn't realize that the formulas would be blanked out like that.

woton

It doesn't matter what rating system you use, or how you tweak the present rating system, people will still complain.  Although a rating is a statistical value representing relative playing skill, chess players as a group are sensitive to changes in their rating, and they usually blame the rating system.