Chess.com Proposal: How to Handle Growing Bullet Ratings

erik

chess ratings are a funny thing. why? because people treat ratings like money - to be accumulated and never lost - instead of like indicators and tools to help you find the right opponents. ratings were designed for accuracy and fairness, not accumulation. but human nature takes over and it becomes about earning, not losing. and there is a well-documented human tendency to care more about avoiding loss than achieving gain. very interesting! anyway, after that segue...

chess ratings are also difficult because they are based on so many different factors:

- the number of players in the rating pool
- the average skill of the players in the pool
- the frequency of interaction between players
- the frequency of entrance and exit of new players to the pool

because of all of this, it is extremely difficult to perfectly correlate all chess ratings. uscf is different from fide. one chess site is different from another. and on our site, one time control is different from another!

compare correspondence chess to bullet. the players are different. the frequency of interaction is COMPLETELY different. the entrance and exit is different (more "timeouts" in slow correspondence). 

anyway, we're always tweaking the ratings formulas a little to be on par with each other, and to try and stay somewhat close to USCF/FIDE. 

most of the ratings, based on our statistical analysis, are actually in pretty good shape. for some people, their online ratings are higher, and for some a little lower. based on our numbers, correspondence chess, overall, is perhaps 50 points high (and maybe 100-150 higher at the very top - but that is expected because our user pool isn't as strong as the overall pool of FIDE players). 

our blitz rating is right on. and standard is a little low (just so few games played so infrequently). 

but bullet has remained tricky! 

all of this is a long way of saying: bullet is vastly overrated. by 200-300 points. 

and now we have 3 options:

1. leave it. this isn't a good option in my opinion. ratings will continue to bloat and be more and more silly. 

2. slowly bleed them back down and then stabilize with adjusted formulas. 

3. chop them by 200 points and implement new formulas (we would also chop the "highest rating", "average opponent", etc)

 

your thoughts?

trysts

Well, you don't sound like you like #1, because you see it getting silly. So, out of the two remaining choices (and keep in mind, chess makes my maths slower), I'm going to say #3: chop off 200 pts and introduce a new formula.

TheWayOfTheMate

I agree with number 3. Might as well make the solution quick and painless. At least it's less painful than watching your rating slowly come down, at least in my case.

deepOzzzie

I like the third option.

ivandh

I like the second option because honestly the bullet players who think they're hot stuff deserve to get taken down a notch.

grandmaster56

My question is why we have bullet as a rating at all. 

I see the pleasure in playing bullet games. It's more action, more intensity, and it requires you to test your thinking skills.

With that being said, the games don't prove anything about anyone. There are some people that can actually move quickly and think well about their moves at the same time, but most bullet games are just about how fast you can move pieces around. 

Why not just make the bullet ratings nonexistent? Or, as a half-way mark, have some sort of fun ladder system that is completely separate from the rest of the system, but still fun-incorporated. 

trysts
grandmaster56 wrote:

My question is why we have bullet as a rating at all. 

I see the pleasure in playing bullet games. It's more action, more intensity, and it requires you to test your thinking skills.

With that being said, the games don't prove anything about anyone. There are some people that can actually move quickly and think well about their moves at the same time, but most bullet games are just about how fast you can move pieces around. 

Why not just make the bullet ratings nonexistent? Or, as a half-way mark, have some sort of fun ladder system that is completely separate from the rest of the system, but still fun-incorporated. 


(continuing break-off thread)

Yeah! Bullet isn't real chess!

frank713

#3 is good. As you said, a lot depends on who you're playing and how often. Those who stop playing basically have a static rating, frozen at what it was on the last date they played. Three months later, or even longer, their real strength could well be lower or higher. Also, as with CXR ratings for OTB games and controlled online games, including additional stats to show experience and how often the same players have played each other is a good source of information, as is other data similar to sports stats. USCF has added some stats over recent years as well, as they too realize a single long-term rating is not enough. Seasonal, annual, and entire-career ratings can be more useful in the long run. But as each chess server and organization has its own rating system, each database has its own requirements. I like seasonal ratings as well, since they start again after 12 months, so that rating indicates who's been playing often and well in the current season, regardless of what the main rating is for the entire history of that player. So having overall and seasonal ratings also allows for periods (seasons) that can measure how players have done per year. This is like baseball stats, which show a player's performance per year. It would be good if that were added at chess.com. CXR does this but does not keep the final seasonal rating (well, at least it's not shown, although the data is there in the database somewhere). We all know we have good years and bad years. All sports programs have this, so why not chess and other board games like go and checkers?

Best of luck Erik :-)

Hugh_T_Patterson

I have a fairly good understanding of the mathematics behind rating systems, having written a few papers on the mathematics of chess ratings, and honestly, the simplest way to do this is to cut ratings points and implement new formulas.

Bleeding them down and stabilizing them with new or adjusted formulas will end up creating more problems than it solves. The problem has to do with the formulas themselves. Of course, I add this comment not knowing what formulas you have in place! If you're using Elo-based formulas, the problems abound. For those who don't understand what I'm talking about, here's the quick version:

Your performance as a chess player is based on wins, losses and draws against other players. That means your rating depends on the ratings of other players when considering ratings calculations. Ratings are determined using logistic curves and distribution models.
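
As a rough illustration of the logistic-curve idea, here is a minimal Python sketch of the textbook Elo expectation and update. It is only an assumption that anything like this applies here, since the formulas actually in place are not known; the function names and the K-factor of 32 are illustrative choices, not anything taken from the site.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0..1) for player A against player B on the textbook Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """New rating after one game; actual is 1 for a win, 0.5 for a draw, 0 for a loss.

    The K-factor of 32 is an assumed, illustrative value.
    """
    return rating + k * (actual - expected)


# The expectation depends only on the rating difference, so shifting
# both players down by the same 200 points does not change it.
before = expected_score(2300, 2100)
after = expected_score(2300 - 200, 2100 - 200)
```

Because the expectation depends only on the difference between two ratings, subtracting the same amount from everyone leaves every predicted result unchanged, which is one reason a flat cut is mathematically clean.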

When you try to alter the formulas with mathematical adjustments, you open up a nearly endless can of numeric worms. In short, it's easier to cut the points by 200, which was a primary number used in making Elo adjustments during the calculations phase, and go from there. Thanks in advance for all your hard work with this endeavor, Erik. Any choice other than #3 will bring you the gift of perhaps the world's biggest headache.

sapientdust

What is the cause of the discrepancy between bullet ratings and the other ratings?

oinquarki

No matter how much you announce and how clear you make it that it's going to happen, solution three will inevitably spawn a hundred threads saying "zomgwtfbbq gfhgfddghshsjaaAAaaa! you tooked mah rank i quit i hate you all!".

DavidMertz1

I'm in favor of the chopping.  If you do it slowly, the following problems arise:

  • People with older ratings are going to be higher rated than people with newer ones.
  • People are going to see their ratings slowly decline even if they are playing the same strength.  Most people aren't going to read this thread, either, so they'll have no idea why.
  • Stats such as "highest rated win" would not truly show the best person you beat, unless you managed to beat someone with a higher rating even after everyone's rating decreased by a few hundred points.

Of course, this assumes that the ratings inflation applies to all levels.  If it's inflated by 250 at the top ratings but only 50 at the bottom, then chopping off 200 doesn't make sense.
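
To make that last point concrete, here is a small Python sketch using the hypothetical figures above (250 points of inflation at the top, 50 at the bottom, a 200-point chop). The anchor ratings of 1000 and 2500 and the straight-line interpolation between them are assumptions made purely for illustration; the real shape of the inflation is not known.

```python
def inflation_at(rating: float,
                 low: tuple = (1000.0, 50.0),
                 high: tuple = (2500.0, 250.0)) -> float:
    """Hypothetical inflation at a given rating.

    Interpolates linearly between two assumed anchor points,
    (rating, inflation), and clamps outside that range.
    """
    (r0, i0), (r1, i1) = low, high
    t = (rating - r0) / (r1 - r0)
    t = min(max(t, 0.0), 1.0)
    return i0 + t * (i1 - i0)


def error_after_flat_chop(rating: float, chop: float = 200.0) -> float:
    """Points a rating is still off by after a flat chop.

    Positive means still inflated; negative means now too low.
    """
    return inflation_at(rating) - chop


top_error = error_after_flat_chop(2500)     # 250 - 200
bottom_error = error_after_flat_chop(1000)  # 50 - 200
```

Under these made-up numbers the top player is left 50 points too high, while the bottom player ends up 150 points too low.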

waffllemaster
Fezzik wrote:

From a purely mathematical sense, the chopping method is certainly the cleanest method. 

Still, from a human perspective, a gradual reduction will probably be worth the headaches.


My thoughts too.

oinquarki

Seriously though, method 3 is the best from a mathematical point of view, and the RSS can handle the stuff mentioned in #14 for you.

ivandh
oinquarki wrote:

No matter how much you announce and how clear you make it that it's going to happen, solution three will inevitably spawn a hundred threads saying "zomgwtfbbq gfhgfddghshsjaaAAaaa! you tooked mah rank i quit i hate you all!".


Hence the beautiful deviosity of door number two. The sheer pleasure of it would be worth the effort. Who doesn't hate these jags who compare their 2500 blitz rating to an 1800 standard?

yusuf_prasojo
erik wrote:
1. leave it. this isn't a good option in my opinion. ratings will continue to bloat and be more and more silly. 

2. slowly bleed them back down and then stabilize with adjusted formulas. 

3. chop them by 200 points and implement new formulas (we would also chop the "highest rating", "average opponent", etc).


What will happen if you just leave it but apply the new formulas?

ivandh

No, we must make them think it is their fault! This is the best idea in the history of forever, I am not about to let it go to waste! No matter how much diffucut it is to stay vertically!

oinquarki
LordNazgul wrote:

No, the inflation isn't the same on all rating ranges. 


See post title, word seven.

oinquarki
ivandh wrote:

No, we must make them think it is their fault! This is the best idea in the history of forever, I am not about to let it go to waste!


oinquarki
LordNazgul wrote:

How about this: the seek must include one's own rating (as in turn-based). This would prevent people from collecting point by point from much weaker opponents on an industrial basis, as well as those who lurk waiting to get a shot at a high-rated opponent and run with the points.


This post epitomizes befuddlement.