The rating of a perfect player

Ziryab

Beginners draw often because they do not know how to checkmate.

Elroch

The fraction of draws is not actually the key thing that changes with rating.

The reason is that the Elo rating system is defined by a certain net score corresponding to any rating difference. This net score has to be the same regardless of the rating of the players, so the most important thing determining the number of decisive games is the DIFFERENCE in rating. You MUST have some wins in order to have a rating difference of any size.

The key thing that changes with rating has to be the way the wins are distributed between the stronger and the weaker player. For two club players with a 72-point rating difference, the stronger player scores about 60%. For two low-rated players this might mean the stronger player winning 55% of the time, the weaker player winning 35% of the time and 10% draws. For two superhuman players (i.e. silicon) it might mean the stronger player winning 25% of the time, the weaker player 5% of the time and 70% draws.
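A quick way to check the arithmetic in the post above. This is only a sketch: it assumes the standard logistic Elo curve (expected score = 1 / (1 + 10^(-diff/400))), and the function names are made up for illustration.

```python
def elo_expected_score(rating_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the higher-rated player,
    assuming the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def net_score(win: float, draw: float) -> float:
    """Score per game from a win rate and a draw rate."""
    return win + 0.5 * draw


# A 72-point gap gives an expected score of about 0.60.
print(round(elo_expected_score(72), 3))

# The two win/draw/loss splits from the post both add up to the same 60%.
low_rated = net_score(win=0.55, draw=0.10)   # 55% wins, 35% losses, 10% draws
engines = net_score(win=0.25, draw=0.70)     # 25% wins, 5% losses, 70% draws
print(round(low_rated, 2), round(engines, 2))
```

Both splits give the same net score, so they correspond to the same rating difference; only the share of draws differs.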

It is a common mistake to see this difference as an indication of approaching "draw death", when it is a different phenomenon.  This should be clear when we observe that this phenomenon was already apparent with players rated around 2700, while perfect play is surely at least 1000 points stronger.   Given that the top engines now get very close to 100% against 2700 rated players, it is clear that those 2700-rated players drawing a lot with each other cannot be indicative of near perfection.

Stil1

If we can assign ratings based on general ability (rather than based on relative Elo, or Elo inflation), then we can make categories and work our way up:

 

2000: Expert

2200: Master

2400: International Master

2500: Grandmaster

2700: Super Grandmaster

2800: World Champion

3000: Modest Engine

3200: Strong Engine

3400: World Champion Engine

3600: "Perfect" Engine

 

(This is all just my own speculation, of course. Others here can surely come up with more scientific numbers and categories.)

llama47
Elroch wrote:

Nothing missing, hicetnunc. This theoretical best player always assumes his opponent will play perfectly.

But 10 years ago he was right. Choosing a move at random, so long as it doesn't change the EGTB eval, would achieve lesser results than knowing which moves will cause an opponent to turn a draw into a loss. Since rating is based on results and not the objective strength of moves, this is a very important consideration in the topic of highest rating possible.

llama47
Elroch wrote:

The fraction of draws is not actually the key thing that falls with rating.

The reason is that the Elo rating system is defined by a certain net score corresponding to any rating difference. This net score has to be the same regardless of the rating of the players, so the most important thing determining the number of decisive games is the DIFFERENCE in rating. You MUST have some wins in order to have a rating difference of any size.

The key thing that changes with rating has to be the way the wins are distributed between the stronger and the weaker player. For two club players with a 72 point rating difference, the stronger player scores 60%. For two low rated players this might mean stronger player winning 55% of the time, weaker player winning 35% of the time and 10% draws.  For two superhuman players (i.e. silicon) it might mean stronger player winning 25% of the time, weaker player 5% of the time and 70% draws.

It is a common mistake to see this difference as an indication of approaching "draw death", when it is a different phenomenon.  This should be clear when we observe that this phenomenon was already apparent with players rated around 2700, while perfect play is surely at least 1000 points stronger.   Given that the top engines now get very close to 100% against 2700 rated players, it is clear that those 2700-rated players drawing a lot with each other cannot be indicative of near perfection.

If in games between peers, proficiency is strongly correlated with draw rate (as I assume it is), it could lead to a "draw death" regardless of how far away humans are from perfect play.

E.g., needing to choose between longer matches which may be impractical (Carlsen - Caruana 2018 did not determine who was better) or faster time controls (which is annoying).

Evanator1109
This was fun to read. Thanks!
Elroch

In the old days, 24 games was the minimum considered reasonable for a world championship, with 48 game matches being used at one time (5 months - gulp. Karpov was said to have lost 10 kg from the ordeal).

The rate of draws was extremely high in the finely contested Karpov-Kasparov matches. In the first, after Karpov had accumulated 4 wins, the next 32 games had a sole win. After a dubious end to the match, the 24-game follow-up had 16 draws. Their later 3 matches were quite similar, with a draw rate of 72% across all five matches.

The modern world has far less patience and the world championship has been degraded (as a competition finding the truly best standard time control player).

Elroch
llama47 wrote:

If in games between peers, proficiency is strongly correlated with draw rate (as I assume it is), it could lead to a "draw death" regardless of how far away humans are from perfect play.

E.g., needing to choose between longer matches which may be impractical (Carlsen - Caruana 2018 did not determine who was better) or faster time controls (which is annoying).

Matches are decided by the difference in wins between the sides, not the rate of wins.

This is exactly what a difference in ratings predicts.

The only decisive results a lower draw rate adds are random ones. For example, with two weak players with a small rating difference, each game is almost a coin flip. By contrast, with two extremely strong players with a small rating difference, large numbers of draws plus an occasional win for the stronger player make the result less random.

To state that in a more formal statistical way, the mean score between two players is a monotone increasing function of their rating difference. For a given mean score, the standard deviation of that score is a monotone decreasing function of the draw rate (or equivalently, an increasing function of the rate of decisive games).
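To make the statistical statement concrete: with per-game score 1, 0.5 or 0, mean score m and draw rate d, the variance works out to m(1 - m) - d/4, so at a fixed mean it falls as the draw rate rises. A minimal sketch (the function name and the sample draw rates are just for illustration):

```python
import math


def score_std(mean_score: float, draw_rate: float) -> float:
    """Per-game standard deviation of the score (win = 1, draw = 0.5, loss = 0)
    for a given mean score and draw rate."""
    win = mean_score - 0.5 * draw_rate
    loss = 1.0 - win - draw_rate
    if win < 0 or loss < 0:
        raise ValueError("draw rate too high for this mean score")
    second_moment = win * 1.0 + draw_rate * 0.25   # E[X^2]
    return math.sqrt(second_moment - mean_score ** 2)


# Same 60% mean score, increasing draw rate: the spread shrinks.
for d in (0.10, 0.40, 0.70):
    print(d, round(score_std(0.60, d), 3))
```

The mean stays at 60% in every row, so the rating difference is unchanged; only the randomness of the individual results goes down.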

llama47
Elroch wrote:
llama47 wrote:

If in games between peers, proficiency is strongly correlated with draw rate (as I assume it is), it could lead to a "draw death" regardless of how far away humans are from perfect play.

E.g., needing to choose between longer matches which may be impractical (Carlsen - Caruana 2018 did not determine who was better) or faster time controls (which is annoying).

Matches are decided by the difference in wins between the sides, not the rate of wins.

This is exactly what a difference in ratings predicts.

The only sort of decisive results a lower draw rate provides is random ones. For example, with two weak players with a small rating difference, each game is almost a coin flip. By contrast with two extremely strong players with a small rating difference, large numbers of draws plus an occasional win for the stronger player makes the result less random.

To state that in a more formal statistical way, the mean score between two players is a monotone function of their rating difference. The standard deviation of that score is a monotone function of the draw rate (or equivalently, the rate of decisive games).

I'm saying there's a different type of draw death. One type is players becoming perfect (which, as you pointed out, humans aren't close to); the other is needing an impractically large number of games to determine who is superior.

As for the 24 game match, I recall some top 10 players opining after the 2018 match that 12 games is enough... I don't know whether they know what they're talking about.
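For a rough feel of how many games "impractically large" might be, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: a normal approximation, a hypothetical 52% expected score for the stronger player, and a one-sided 95% threshold (z = 1.645). It asks how many games are needed before the expected lead exceeds z standard errors.

```python
import math


def games_needed(mean_score: float, draw_rate: float, z: float = 1.645) -> int:
    """Rough number of games before the stronger player's expected lead
    exceeds z standard errors (normal approximation, illustrative only)."""
    win = mean_score - 0.5 * draw_rate
    loss = 1.0 - mean_score - 0.5 * draw_rate
    edge = mean_score - 0.5                 # expected lead per game
    if edge <= 0 or win < 0 or loss < 0:
        raise ValueError("need mean_score > 0.5 and a feasible draw rate")
    variance = mean_score * (1.0 - mean_score) - draw_rate / 4.0
    # n * edge >= z * sqrt(n * variance)  =>  n >= (z * sd / edge) ** 2
    return math.ceil((z * math.sqrt(variance) / edge) ** 2)


# Hypothetical 52% scorer at a low and at a high draw rate.
for draw_rate in (0.10, 0.70):
    print(draw_rate, games_needed(0.52, draw_rate))
```

Under these assumptions the answer is roughly 500 games at a 70% draw rate and roughly 1,500 at a 10% draw rate. Either way it is far beyond a 12 or 24 game match, and the number is driven by how close the players are rather than by the draws themselves.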

JuergenWerner
Stil1 wrote:

If we can assign ratings based on general ability (rather than based on relative Elo, or Elo inflation), then we can make categories and work our way up:

 

2000: Expert

2200: Master

2400: International Master

2500: Grandmaster

2700: Super Grandmaster

2800: World Champion

3000: Modest Engine

3200: Strong Engine

3400: World Champion Engine

3600: "Perfect" Engine

 

(This is all just my own speculation, of course. Others here can surely come up with more scientific numbers and categories.)

 

Can a human get to:

 

3000: Modest Engine

Elroch

Quite a long way off thus far. No-one over 2900. Very few ever over 2800.

Elroch
llama47 wrote:
Elroch wrote:
llama47 wrote:

If in games between peers, proficiency is strongly correlated with draw rate (as I assume it is), it could lead to a "draw death" regardless of how far away humans are from perfect play.

E.g., needing to choose between longer matches which may be impractical (Carlsen - Caruana 2018 did not determine who was better) or faster time controls (which is annoying).

Matches are decided by the difference in wins between the sides, not the rate of wins.

This is exactly what a difference in ratings predicts.

The only sort of decisive results a lower draw rate provides is random ones. For example, with two weak players with a small rating difference, each game is almost a coin flip. By contrast with two extremely strong players with a small rating difference, large numbers of draws plus an occasional win for the stronger player makes the result less random.

To state that in a more formal statistical way, the mean score between two players is a monotone function of their rating difference. The standard deviation of that score is a monotone function of the draw rate (or equivalently, the rate of decisive games).

I'm saying there's a different type of draw death. One is players become perfect (which as you pointed out humans aren't close to) the other is needing an impractically large number of games to determine who is superior.

As for the 24 game match, I recall some top 10 players opining after the 2018 match that 12 games is enough... I don't know whether they know what they're talking about.

You appear to be missing that all you are saying is that the two players are very close in rating. 

The possibility that some people seem to miss is that it is perfectly possible to have player B beating player A 10%, drawing almost all the rest, then later player C beating player B 10%, drawing almost all the rest and so on (or less than 10% if you like). Indeed once the players get strong enough so that the weaker player rarely wins, this is the norm.
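To put numbers on that chain: a sketch assuming the standard logistic Elo formula, that "almost all the rest" means every other game is drawn, and that rating gaps simply add up along the chain. The function name is hypothetical.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Rating difference implied by an expected score, inverting the
    standard logistic Elo formula."""
    return 400.0 * math.log10(score / (1.0 - score))


win_rate = 0.10      # stronger player wins 10% of games
draw_rate = 0.90     # assumption: every other game is a draw
score = win_rate + 0.5 * draw_rate           # 0.55

step = elo_diff_from_score(score)            # about 35 points per link
print(round(step, 1))

# Each new player in the chain sits one step above the previous one.
for links in (1, 5, 10):
    print(links, round(links * step))
```

So each link in such a chain is worth about 35 rating points even though nine games in ten are drawn, and ten links would span roughly 350 points.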

llama47

I don't even know what you're disagreeing with me about. I assume we both understand and it's not worth bickering about.

Elroch

Recall you said that there was a type of "draw death" "needing an impractically large number of games to determine who is superior". My point was that this is just two very strong players of similar strength, so a close match of high quality (hence low variance), rather than an indication that there is no room for improvement (which is what I take "draw death" to entail, i.e. there are players who not only draw against each other, but there is no possibility of another coming along and beating them).

It tells us they are (1) strong [because of the low variance of the results] and (2) close in strength [because of the low net number of wins for the stronger player].

Are we in agreement? You have the final word on that!