The inaccuracies of the ELO rating system?

JackieTheCoolMan

I have heard people say that ELO rating starts to lose accuracy under two conditions that I have no idea about.

1) Extremes of ELO. I often see FM's and GM's still competing against each other. I thought, theoretically, if you're 200 ELO points or more above someone, then it really shouldn't even be much of a contest, but 2400-2500s are COMPETING against 2700-2800s. I also saw Nepo and Magnus very close, almost tied, in performance at some very famous thing, round 6 being the most prominent and longest game that is discussed in depth. Magnus was many points more in ELO than Nepo. And also, they say the very worst players should beat each other despite being a few hundred points above the other. This sort of thing does not fly with tennis pros, and I know many tennis players who would easily beat and not even play with people a level below them.

 

2) Youth players. Somehow, their brains are different, and I heard someone say they can win tactically and lose positionally, or something like that, and beat people many hundreds of ELO points above them in poor positions. I don't even know the difference between tactical or positional. Could someone clarify why youth players' ELO ratings are not accurate?

What is the rate of development that makes some GM in their mid-20s lose a hundred ELO points from when they were in their early 20s? Apparently, tennis pros in the earlier times used to retire at age 27. Why are the current best chess players in this world in their late 20s or early 30s? What age do people actually stop becoming the best and lose brain/motor/physical functioning? I read age 24 or so is when reflexes slow down, so that is why I don't understand why we have an age 31 Magnus Carlsen dominating this world. 

3) Tighter time controls. I just don't understand how IMs have a much more even playing field against GMs at faster time controls. Even much lower-rated players have more chances to win. I also saw some YouTube video of Hikaru climbing to 3500 ELO, so that theoretically means he should beat AlphaZero, or at least dominate 2800s Magnus. Yet, Magnus beats Hikaru, even in blitz, and no one gives serious conversation about bullet or rapid in discussing who the best chess player is. What is going on with this weird rating system? I'm not seeing too much accurate information about blitz ratings when I look it up. I see 3379 as a bullet record from Magnus, but no one ever discusses Hikaru's 3500??? I would NEVER have known or heard about Hikaru's 3500 ELO if I didn't follow him on YouTube, but when I look that impressive ELO up elsewhere, I don't see that info. What am I not understanding?

https://lichess.org/@/Kingscrusher-YouTube/blog/magnus-carlsen-creates-new-all-time-bullet-rating-record-11th-november-2021/kTiFcikR#:~:text=Magnus%20Carlsen%20creates%20new%20all%20time%20Bullet%20rating%20record%20%2D%2011th%20November%202021,-CM%20Kingscrusher%2DYouTube&text=Recently%20Magnus%20Carlsen%20had%20a,all%2Dtime%20Lichess%20bullet%20record!

 

Apparently, GLICKO or some version of it is used by Lichess and is supposed to be more accurate, but I'm sorta understanding that ratings are more inflated? I'm confused.

Jalex13
Why would the age of a person determine whether their elo is correct or incorrect? The elo is used to determine the playing strength of a player.

For example:

A 1300 and a 1200 rated player have a game. The game is roughly equal throughout the opening and middlegame, but towards the end, the 1300 rated player obtains an advantage of one pawn through a tactical shot, and wins the endgame. Though the difference is 100 points, the game was equal for the most part. It was a minor detail that made the difference between a 1300 and a 1200.

Another example:

A 1700 and an 1800 rated player have a game. The 1700 has studied deep opening theory and comes out of the opening and middlegame with a slight advantage. The 1800, however, has studied endgames and positional play, recovers from the worse position, and eventually wins the game.


Positional play is basically understanding what each piece should be doing, and where it belongs. A bishop blocked by its own pawns is considered “bad”. A knight on the 6th rank is considered to be a strong knight. A hole created by a lack of pawns is a weakness. Knowing how to target these weaknesses, and prevent them from happening to you, is positional play.

“I heard someone say they can win tactically and lose positionally, or something like that, and beat people many hundreds of ELO points above them in poor positions.”

That’s pretty much a myth. Younger individuals only tend to have an advantage in learning chess because their minds are still undergoing development. Children also typically have more time on their hands than adults.

The elo system isn’t inaccurate, but online ratings are inflated compared to FIDE ratings. It’s commonly estimated that a 2000 online player is between 1600 and 1800 OTB.

Hope I answered your question.
Mathieu9229

FIDE ratings use the ELO system; chess.com does not... So I don't see the point of comparing online vs. OTB.

Another thing is that the ELO system does not measure your strength but reflects how you perform against other players. Your rating takes time to adjust, so when you are young and improving, your rating may be far from your actual strength. And when you are a retired player, playing a game here and there, your rating takes a long time to go down.
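To put rough numbers on how slowly a rating catches up, here is a minimal sketch of the standard Elo update (new rating = old rating + K x (actual score - expected score)). Everything specific in it is an assumption for illustration: the function name `games_to_catch_up`, the K values, the 1500/1800 ratings, and the simplification that the player scores exactly their "true" average result in every game instead of random wins and losses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo expected score of A against B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def games_to_catch_up(rating: float, true_strength: float, opponent: float,
                      k: float = 20.0, tolerance: float = 50.0,
                      max_games: int = 10_000) -> int:
    """Count games until the rating is within `tolerance` of the true strength.

    Assumption: each game the player scores exactly what their true
    strength predicts on average, so the simulation is deterministic.
    """
    games = 0
    while true_strength - rating > tolerance and games < max_games:
        actual = expected_score(true_strength, opponent)  # what they really score
        predicted = expected_score(rating, opponent)      # what the rating predicts
        rating += k * (actual - predicted)                # standard Elo update
        games += 1
    return games


# Hypothetical improving player: rated 1500, really playing at 1800 strength,
# facing 1800-rated opposition, with two illustrative K-factors.
for k in (20, 40):
    print(f"K={k}: {games_to_catch_up(1500, 1800, 1800, k=k)} games")
```

With a larger K the rating catches up in fewer games, which is the usual argument for giving new or young players a higher K-factor. Someone who plays only a handful of rated games a year can stay under-rated for a long time.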

When it comes to online rating, you can inflate your rating by winning hundreds of games against weaker players. Hikaru is great at that but I bet he would not be as high if he were playing only players from the top 20 (FIDE). OTB you can't really do that because you can't play as many games.

Finally, a player like Magnus is the ultimate outlier: he can't play opponents of his own strength and is "stuck" with his current ELO, which as a consequence can't really represent how strong he is.

capareloaded

I guess, IMHO, that ELO is just a theory of chess level and is not something accurate at all. It has its uses, like ranking players and telling other players what you are more or less capable of in a game or match. But like you said, it is very inaccurate, so you can't take it very seriously. From my experience, players lower than 800 are beginners and there is no real way to make a ranking from that. From 800 to 1400 is more serious stuff but still very inaccurate. Since the pro players live in a very different world from beginners and club players, they are definitely ranked with different criteria. I think it's common sense not to use the same rule to rank a master and a noob.

landloch

“I thought, theoretically, if you're 200 ELO points or more above someone, then it really shouldn't even be much of a contest.”
 
A player rated 200 points below another has about a 24% chance of winning a game. Or more accurately, at a 200 point difference, the lower rated player is expected to score about 24% of the time; so over a 10 game match, the lower rated player will on average win 2 and draw 1 OR win 1 and draw 3 OR draw 5.
 
Over many games, yes, a 200 point difference isn’t much of a contest, but for any given game, it’s not shocking for the lower rated player to win.
 
Even at a 400 point difference the lower rated player is expected to score about 9%.
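The 24% and 9% figures come from the standard Elo expected-score formula, E = 1 / (1 + 10^(D/400)), where D is how many points the opponent is rated above you. A minimal Python sketch for anyone who wants to check other rating gaps (the function name and the 1500 baseline are only illustrative choices):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo expected score of A against B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Expected score of the lower-rated player at a few rating gaps.
for gap in (100, 200, 400):
    underdog = expected_score(1500 - gap, 1500)
    print(f"{gap}-point gap: underdog expected score = {underdog:.1%}")
```

Note that this is an expected score, not a win probability: a draw counts as half a point, so the same 24% can be made up of different mixes of wins and draws.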
 
“Youth players.”
 
One source of Elo inaccuracy arises when a player is much more skilled than their history of performance indicates. If a person is learning a great deal, but playing infrequently relative to how quickly they learn, their rating will take some time to catch up to their true ability. Enthusiastic youth players often fit these conditions. But the mere fact that a person is young does not mean that they will by definition be under-rated.
 
“Why are the current best chess players in this world in their late 20s or early 30s? What age do people actually stop becoming the best and lose brain/motor/physical functioning? I read age 24 or so is when reflexes slow down, so that is why I don't understand why we have an age 31 Magnus Carlsen dominating this world.”
 
Assorted studies have shown that for high-level chess players, skill rises rapidly in the early 20s, plateaus around the mid-30s, and begins to decline in the mid-40s. Obviously, cognitive abilities do not change at the same pace as physical abilities. Also, these studies describe what happens to a population on average; the actual rate and timing of increase and decline can be significantly different for any given individual.
 
It’s also worth remembering that if you’re the best in the world at 30, even when you start declining, you’ll still be really freaking good!
 
“Tighter time controls” … bullet, blitz, rapid. The terms bullet, blitz, and rapid each describe a fairly wide range of time controls, even within a single term. And there are many settings to play in. Ratings are not comparable across these. For example, getting a 3300 blitz rating in FIDE tournaments against the world’s best players is very, very, very different than getting a 3500 bullet rating by playing against random folks on chess.com at night. Both are insanely impressive, don’t get me wrong, but trying to figure out chess skill by comparing the two doesn't work.
 
“No one gives serious conversation about bullet or rapid in discussing who the best chess player is.”
 
By tradition, people understand “best chess player” to mean best chess player at classical (long) time controls. The idea being that players have the time to deploy their full weight of chess knowledge and ability. Of course, you could then argue that correspondence chess should be an even better measure of chess skill … but I’m not going down that rabbit hole! ;-P

Mermaum

Too many questions, and none of them make much sense. You are too worried about elo when it is nothing more than a way to compare the strength of players in a given pool of players. Different websites have different ratings. But just because someone's rating is higher doesn't mean they are always going to beat someone with a lower elo. I have beaten people 400 points higher than me and lost to people 400 points lower. If I'm distracted and I blunder my queen by accident and my opponent knows what he is doing and knows how to stop all my counterplay, there is nothing I can do. But a lot of factors have to be considered, such as opening preparation, how the players are feeling that day, a playing style that matches well against another playing style, etc. All of that should account for some 200-300 elo points. But @landloch's explanation is the best one and very accurate. Also keep in mind that blitz is completely different than classical. In classical you find a good move and look for a better one to try and make sure you find the best move. In blitz you don't have time to calculate everything, so when you find a good move you just play it, but your opponent may or may not find a better move.

And the thing with younger players is that they usually (although this is a mild generalization) tend to go for more aggressive, sharper positions which allow for the possibility of more tactical shots. But in order to do so, they have to weaken certain squares, or give up the bishop pair earlier, or gambit a pawn, which positionally speaking can be bad. So if their opponents find all the right defensive moves (which is not easy, since attacking is usually easier than defending), once the attack dies down the opponents will have the better position.

tygxc

#1
"Apparently, GLICKO or some version of it is used by Lichess and is supposed to be more accurate, but I'm sorta understanding that ratings are more inflated?"
Chess.com uses Glicko.
Different sites seed new players differently and the pools differ too, so you can only compare ratings within the same pool, not between different pools.
Glicko is mathematically more precise than Elo as it also estimates the deviation.
Elo was devised at a time when computer power was expensive and is a simplification to make calculations easier.
https://en.wikipedia.org/wiki/Glicko_rating_system 
https://en.wikipedia.org/wiki/Elo_rating_system 
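
To make "also estimates the deviation" concrete, here is a rough sketch of a single rating-period update in the original Glicko system (Glicko-1), written from the published formulas. It is not the code of chess.com or Lichess (Lichess is generally described as using the later Glicko-2 variant), and it leaves out the step that increases the deviation when a player is inactive. The function names and the sample numbers are illustrative assumptions.

```python
import math

Q = math.log(10) / 400.0  # scaling constant from the Glicko-1 formulas


def g(rd: float) -> float:
    """Weight that shrinks the impact of games against uncertain opponents."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko1_update(rating: float, rd: float, results):
    """Return (new_rating, new_rd) after one rating period.

    results: iterable of (opponent_rating, opponent_rd, score),
    where score is 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    inv_d2 = 0.0    # accumulates 1 / d^2, the information gained from the games
    surprise = 0.0  # weighted sum of (actual score - expected score)
    for opp_rating, opp_rd, score in results:
        weight = g(opp_rd)
        expected = 1.0 / (1.0 + 10 ** (-weight * (rating - opp_rating) / 400.0))
        inv_d2 += Q**2 * weight**2 * expected * (1.0 - expected)
        surprise += weight * (score - expected)
    precision = 1.0 / rd**2 + inv_d2
    return rating + (Q / precision) * surprise, math.sqrt(1.0 / precision)


# Illustrative period: a 1500 player with a large deviation (200)
# wins one game and loses two against opponents of varying certainty.
games = [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
print(glicko1_update(1500, 200, games))
```

The point of the extra number is that a player with a large deviation (new or inactive) moves a lot per game, while a well-established rating moves only a little. Plain Elo approximates this with a fixed K-factor.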

GMegasDoux

Elo is a self-adjusting system that uses past outcomes to predict future outcomes. It is not all-knowing. People are inconsistent. Chess is a brain game, and sometimes the connections your brain needs to identify the correct move don't fire properly and people miss things. Everyone has a chance to play the best move available over the board; identifying that move is a variable process. Younger brains can have a smaller experience pool to draw on and make connections faster; older brains have more bloated experience, so it takes longer to get to what you need. They also have more degradation of health over time, as they are in maintenance rather than growth phases.

magipi

First of all, it is not ELO, it is Elo. It's not an acronym, it is named after the guy who invented it. This always annoys me that the inventor gets no respect.

Second, about Elo differences: 200 points of difference means the favorite will score roughly 75%. So yes, it is completely possible that even a longer match will be competitive if the underdog has a good day. Also, you hugely overestimate the rating difference between Magnus and Nepo; it is much less than 200.

Third, about youth players: they improve very fast, therefore their rating is inaccurate (too low). There is no mystery here.

llama36
JackieTheCoolMan wrote:

What age do people actually stop becoming the best and lose brain/motor/physical functioning? I read age 24 or so is when reflexes slow down, so that is why I don't understand why we have an age 31 Magnus Carlsen dominating this world. 

It used to be said that chess pros peak in their mid 30s. Now it's earlier. The average age of the top 10 has been below 30 for a while.

Especially with technology making information available to everyone, knowledge becomes less important and it's more about energy. It takes a lot of stamina to play at your peak in a tense situation for many hours.

llama36
Jalex13 wrote:

“I heard someone say they can win tactically and lose positionally, or something like that, and beat people many hundreds of ELO points above them in poor positions.”

That’s pretty much a myth. Younger individuals only tend to have an advantage in learning chess because their minds are still undergoing development. Children also typically have more time on their hands than adults.

It's not a myth at all. For example, one reason Soviet GMs were so impressed with young Fischer is that he played the endgame well. Prodigy kids that are strong tactically are "normal" but a kid who can beat grandmasters in the endgame was astounding (of course Fischer was special, he went on to become world champion, and was one of the best endgame players of all time).