chess ratings are a funny thing. why? because people treat ratings like money - to be accumulated and never lost - instead of like indicators and tools to help you find the right opponents. ratings were designed for accuracy and fairness, not accumulation. but human nature takes over and it becomes about earning, not losing. and there is a well-documented human tendency to care more about avoiding loss than about achieving gain. very interesting! anyway, after that segue...
chess ratings are also difficult because they are based on so many different factors:
- the number of players in the rating pool
- the average skill of the players in the pool
- the frequency of interaction between players
- the frequency of entrance and exit of new players to the pool
because of all of this, it is extremely difficult to perfectly correlate all chess ratings. uscf is different from fide. one chess site is different from another. and on our site, one time control is different from another!
compare correspondence chess to bullet. the players are different. the frequency of interaction is COMPLETELY different. the entrance and exit is different (more "timeouts" in slow correspondence).
anyway, we're always tweaking the ratings formulas a little to be on par with each other, and to try and stay somewhat close to USCF/FIDE.
most of the ratings, based on our statistical analysis, are actually in pretty good shape. for some people, their online ratings are higher, and for some a little lower. based on our numbers, correspondence chess, overall, is perhaps 50 points high (and maybe 100-150 higher at the very top - but that is expected because our user pool isn't as strong as the overall pool of FIDE players).
our blitz rating is right on. and standard is a little low (just so few games played so infrequently).
but bullet has remained tricky!
all of this is a long way of saying: bullet is vastly overrated. by 200-300 points.
and now we have 3 options:
1. leave it. this isn't a good option in my opinion. ratings will continue to bloat and be more and more silly.
2. slowly bleed them back down and then stabilize with adjusted formulas.
3. chop them by 200 points and implement new formulas (we would also chop the "highest rating", "average opponent", etc)
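As a rough illustration of what option 3 could look like, here is a minimal sketch that assumes a flat 200-point offset applied to the current rating and to the stored stats in one step. The `BulletStats` fields and the `chop` helper are hypothetical names made up for this example, not the site's actual data model or code.

```python
from dataclasses import dataclass, replace

CHOP = 200  # flat offset from option 3; the exact value is an assumption


@dataclass(frozen=True)
class BulletStats:
    """Hypothetical per-player record; field names are illustrative only."""
    rating: int
    highest_rating: int
    average_opponent: int


def chop(stats: BulletStats, points: int = CHOP) -> BulletStats:
    """Shift the rating and the rating-derived stats by the same amount."""
    return replace(
        stats,
        rating=stats.rating - points,
        highest_rating=stats.highest_rating - points,
        average_opponent=stats.average_opponent - points,
    )


before = BulletStats(rating=2150, highest_rating=2230, average_opponent=2080)
after = chop(before)
print(after)
```

Because every value moves by the same amount, the gaps between a player's rating, peak, and average opponent stay exactly as they were, so the stats remain comparable with each other after the change.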
Well, you don't sound like you like #1, because you see it getting silly. So, out of the two remaining choices (and keep in mind, chess makes my maths slower), I'm going to say #3: chop off 200 pts and introduce a new formula.
I agree with number 3. Might as well make the solution quick and painless. At least it's less painful than watching your rating slowly come down, at least in my case.
I like the third option.
I like the second option because honestly the bullet players who think they're hot stuff deserve to get taken down a notch.
My question is why we have bullet as a rating at all.
I see the pleasure in playing bullet games. It's more action, more intensity, and it requires you to test your thinking skills.
With that being said, the games don't prove anything about anyone. There are some people that can actually move quickly and think well about their moves at the same time, but most bullet games are just about how fast you can move pieces around.
Why not just make the bullet ratings nonexistent? Or, as a halfway point, have some sort of fun ladder system that is completely separate from the rest of the system, but still incorporated for fun.
(continuing break-off thread)
Yeah! Bullet isn't real chess!
#3 is good. As you said, a lot depends on who you're playing and how often. Those who stop playing basically have a static rating, frozen at what it was on the last date played. Three months or more later, their real strength could well be lower or higher. Also, as in CXR ratings for OTB games as well as controlled online games, including additional stats to show experience and how often the same players have played each other is a good source of information, as is other data similar to sports stats. USCF has added some stats in recent years as well, as they too realize a single long-term rating is not enough. Seasonal, annual, and entire-career ratings can be more useful in the long run. But as each chess server and organization has its own rating system, each database has its own requirements. I like seasonal ratings as well, since the rating starts again after 12 months, so it indicates who's been playing often and well in the current season, regardless of what the main rating is for the entire history of that player. So having overall and seasonal ratings also allows for periods (seasonal ratings) that measure how players have done per year. This is like baseball stats, which show a player's performance per year. What if that was added at chess.com? CXR does this but doesn't keep the final seasonal rating (well, at least it's not shown, although the data is there in the database somewhere). We all know we have good years and bad years. All sports programs have them, so why not chess and other board games such as Go and checkers?
Best of luck Erik :-)
You say that you guys have the blitz ratings spot on. I have to disagree with this considering my rating on other respected sites such as FICS and ICC. My rating here of 1950 is about 300 points higher than elsewhere. Furthermore, I don't consider myself to be a class A player. On the other hand, I've noticed that the ratings of most players who post their USCF or FIDE on the site correspond quite well to their blitz ratings.
I have a fairly good understanding of the mathematics behind rating systems, having written a few papers on the mathematics of chess ratings, and honestly, the simplest way to do this is to cut rating points and implement new formulas.
Bleeding them down and stabilizing them with new or adjusted formulas will end up creating more problems than it solves. The problem has to do with the formulas themselves. Of course, I add this comment not knowing what formulas you have in place! If you're using Elo-based formulas, the problems abound. For those who don't understand what I'm talking about, here's the quick version:
Your performance as a chess player is based on wins, losses and draws against other players. That means your rating depends on the ratings of other players when considering ratings calculations. Ratings are determined using logistic curves and distribution models.
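To make the "logistic curve" point concrete, here is a minimal sketch of the textbook Elo expected-score and update formulas. The site's actual formulas are not known here, and the K-factor of 32 is an assumed value chosen only for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B on the Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """New rating for A after one game.

    score_a is 1 for a win, 0.5 for a draw, 0 for a loss.
    k is the K-factor; 32 is an assumption, not the site's setting.
    """
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Only the rating difference matters, so shifting both players by the
# same amount leaves the expected score unchanged.
same_gap_high = expected_score(1700, 1500)
same_gap_low = expected_score(1500, 1300)
print(same_gap_high, same_gap_low)
print(update(1500, 1500, 1.0))
```

Since the expected score depends only on the difference between the two ratings, subtracting the same number of points from everyone does not change any predicted result, which is one reason a flat cut is mathematically clean.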
When you try to alter the formulas with mathematical adjustments, you open up a nearly endless can of numeric worms. In short, it's easier to cut the points by 200, which was a primary number used in making Elo adjustments during the calculations phase, and go from there. Thanks in advance for all your hard work with this endeavor, Erik. Any choice other than #3 will bring you the gift of perhaps the world's biggest headache.
What is the cause of the discrepancy between bullet ratings and the other ratings?
No matter how much you announce and how clear you make it that it's going to happen, solution three will inevitably spawn a hundred threads saying "zomgwtfbbq gfhgfddghshsjaaAAaaa! you tooked mah rank i quit i hate you all!".
I'm in favor of the chopping. If you do it slowly, the following problems arise:
Stats such as "highest rated win" would not truly show the best person you beat, unless you managed to beat someone with a higher rating even after everyone's rating decreased by a few hundred points.
Of course, this assumes that the ratings inflation applies to all levels. If it's inflated by 250 at the top ratings but only 50 at the bottom, then chopping off 200 doesn't make sense.
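If the inflation really does vary with rating, one alternative to a flat cut is a rating-dependent correction. The sketch below assumes the inflation grows linearly from 50 points at a rating of 1000 to 250 points at a rating of 2500. Those anchor ratings and the function names are hypothetical, picked only to match the numbers in the example above, not measured values.

```python
def inflation_estimate(
    rating: float,
    low: tuple[float, float] = (1000.0, 50.0),    # (rating, inflation), assumed
    high: tuple[float, float] = (2500.0, 250.0),  # (rating, inflation), assumed
) -> float:
    """Linearly interpolate the assumed inflation, clamped outside the anchors."""
    (r0, d0), (r1, d1) = low, high
    if rating <= r0:
        return d0
    if rating >= r1:
        return d1
    t = (rating - r0) / (r1 - r0)
    return d0 + t * (d1 - d0)


def deflate(rating: float) -> float:
    """Remove the estimated inflation from a rating."""
    return rating - inflation_estimate(rating)


for r in (1000, 1750, 2500):
    flat = r - 200
    print(r, deflate(r), flat)
```

Under these assumptions a flat 200-point cut would take 150 points too many from a 1000-rated player and 50 too few from a 2500-rated player. The sliding correction avoids that, and because the assumed inflation grows more slowly than the rating itself, the order of players is preserved.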
From a purely mathematical sense, the chopping method is certainly the cleanest method.
Still, from a human perspective, a gradual reduction will probably be worth the headaches.
My thoughts too.
Seriously though, method 3 is the best from a mathematical point of view, and the RSS can handle the stuff mentioned in #14 for you.
Hence the beautiful deviosity of door number two. The sheer pleasure of it would be worth the effort. Who doesn't hate these jags who compare their 2500 blitz rating to an 1800 standard?
3. chop them by 200 points and implement new formulas (we would also chop the "highest rating", "average opponent", etc).
What will happen if we just leave it but apply a new formula?
No, we must make them think it is their fault! This is the best idea in the history of forever, I am not about to let it go to waste! No matter how difficult it is to stay vertical!
No, the inflation isn't the same on all rating ranges.
See post title, word seven.
No, we must make them think it is their fault! This is the best idea in the history of forever, I am not about to let it go to waste!