The rating of a perfect player

Sort:
Elroch
waffllemaster wrote:

Maybe because I've been up 24 hours but I just don't see the math part.  Why are you using integration and how did you get k = 733?

Add more little steps please


Comes directly from the formula R = k ln(D), substituting R=3000 and D = 20 (both rather rough numbers).

waffllemaster

It seems like you're saying with E x M/N that when a computer calculates less its edge is increased?

Elroch

Well, it takes a smaller multiple in computing power to increase the rating of a weak computer than a strong one. The basic model is that you get a fixed increase in rating by increasing search depth by X%, rather than increasing search depth by X ply.

Unfortunately it suddenly occurred to me while I was heading to the shops that I carelessly assumed the integration constant was zero. It probably isn't. Hence the answer should be ignored until the calculation is redone properly. Wink

maartenderie

what if two perfect players would play each other? would white win? what about the rating changes on that???

---ALL GLORY THE HYPNOPAWN!!!1

Elroch

Most people would assert that if two perfect players played each other, the result would be a draw. I would not be categorical: it is not impossible that the perfect chess game is a win for white, and theoretically possible that it is a win for black (i.e. initial position is zugzwang Smile). The last is extremely unlikely. But no-one knows for sure yet.

Anyhow, my (corrected) model from post #65 gives this equation relating rating to search depth (including the integration constant I left out last time).

R = k*ln(D) + c (where R is rating, D is depth and k and c are constants)

We need two data points to find k and c.

Using examples of Deep Thought (Elo ~2550, exhaustive search depth ~14) and using its estimated 60 point rating increase for a doubling of processor power, together with the estimated 6 times increase in processing power needed for a one ply increase in depth, we find that an increase in search depth from 14 to 15 would cause in increase in rating from 2550 to 2705 (increase of 60 * log(6)/log(2)). So we have two data points (2550, 14) and (2705, 15). Substituting these in the equation:

2550 = k*ln(14) + c

2705 = k*ln(15) + c

Subtracting:

155 = k*(ln(15)-ln(14))

So k = 2247

So c = 2550 - 2247*ln(14) = -3379.

Complete equation is:

R =  2247*ln(D) -3379

For example exhaustive search depth of 20 gives:

R(20) = 3352

This seems quite reasonable.

If this model (which, remember, incorporates diminishing returns for increases in processor speed as the depth gets greater) is correct, a search depth of 200 would give:

 

R(200) = Elo 8526.

 

So this model predicts that the perfect computer has a surprisingly high rating. Can it be anywhere near the truth? Bear in mind that assuming that the increase in rating with processor speed is constant predicts an even higher rating for the perfect computer.

 

Another possibility is that the increase in rating with processing power decreases even more rapidly than this.  This would be so if the short term edge due to an extra ply of search decreases with the original depth. Eg if one computer has a search depth of 5 and another has a search depth of 6,  the second computer might on average gain an advantage of 0.2 pawns in each 5 moves of play, but if one had a search depth of 15 and another had a search depth of 16, the second might on average gain only 0.1 pawns in 15 moves. I am not sure whether this is true or not.

 

These questions can be answered by experiment. If you played a single computer program restricted to different search depths against itself, you could derive an empirical model for the relationship between search depth and rating.

Chessgod123
verticle5 wrote:
@verticle5, with all due respect, your understanding of how the rating system works is flawed in any close to realistic scenario. When a single player enters the pool of players, he/she/it finds its level without significantly altering the ratings of other players.

 That is true, but we aren't looking at realistic scenarios...  except .  If a 'perfect' player wins 100% of his games, and assuming he can play an infinite number of times ( over the course of time this would happen) there would be a gradual shift downwards of EVERYONE else's rating.  The only reason there is currently so much variance is precisely because of imperfection. 

Let's just assume that there are 50 perfect chessplayers in the world, and everyone else is equally horrible, and there is an endless supply of them.  The 50 perfect players would always draw each other, and would always beat everyone else.  The ratings of the endless supply of horrible people would always be gravitating towards 1200, and thus the perfect players would never be able go much further than 2000. 

Basically, my whole hypothetical scenario here relies on the idea that perfection is sooooo much better than 'excellent'.. that anything short of perfect is actually equally bad. Take the example of tic-tac-toe.  Is someone who always loses on the 8th move better than someone who always loses on the 6th move????  Not really, because they are both very bad. In the same way a perfect tic-tac-toe player would hardly distinguish between two people he can easily beat... a 'perfect' chessplayer would be so good that a 2500 and an 1000 rated player are not even distinguishable.  They are both equally bad in that they *should* lose 100% of the time to a "perfect" player.

But...... Bringing tic-tac-toe into the example, (assuming a rating system), we really don't really know how a perfect player would rate.  Am I perfect at tic-tac-toe because I can always make the best move? I guess... but that means I should be drawing every time. 

The question here is flawed because we are not really seeking 'perfection' of the winner as much as trying to understand the likelihood of mistakes of their opponents.  I don't care how great a computer is.. it can not beat me at tic-tac-toe.  Because the mind is limited, no one can say the same about chess.  The question of a forced win in chess doesn't even matter, because if it is possible, then a 'perfect' player could still lose as black. just as a 'perfect' tic-tac-toe player would lose to someone who got to take the first two moves ( a forced win).

So... in conclusion... if chess perfection is attainable, then it is meaningless :)


Although that's a strangely explained argument, it does actually contain logical sense: what verticle is saying (I think) is that the perfect player's rating will be bound by the fact that his opponents won't be able to have ELOs above about 4000 (in the case of the strongest computers - this is their real ELO), so practically speaking it will take too many games for the perfect player to reach his true ELO (e.g. he'll take 10,000 games to reach 5000, because that's how many wins the difference takes - to reach 6000 would take 10^8 games, too many to practically achieved in one life time by far). This also has other complications - the player would theoretically win every game, meaning that if he was theoretically able to play 10^200 games, he would get a rating of 54,000 (vastly inflated).

Basically, what I was doing was calculating a theoretical rating based not on the amount that the player would be able to win but on his true comparative strength based on search depth and skill of moves.

Elroch

It seems to me that the details of how the rating is verified is of far less interest than other issues. You can assume the existence of many players of intermediate strength for determining the rating.

 

I am not that happy with my estimation of the perfect rating, despite the fact that this is considerably lower than would be calculated with the assumption that doubling processor speed increases rating by a fixed amount until no further processing power is useful. Perhaps my guess that the returns from extra processing power diminish even more rapidly is more plausible. Or perhaps our intuition based on existing players and computers gives us no clue of how strong the perfect player would be. Chess is a big, difficult game.

waffllemaster

Interesting elroch, it's certainly a... what to call it, a rigorous guess? :)  8000 is surprisingly high, and a model using search depth certainly seems logical.  One problem might be that current computer elo also has to do with the strength of it's evaluation and (I might be wrong) I think they hide their true search depth?  Because any evaluation without intuition will be worthless on the search horizon (e.g. ignoring a mate in 1 because it's over the horizon) so a computer may calculate out to 30, cut it at 20, and render an evaluation.  That may leave some error in there?

Like you said though someone could run some engine tests to test it out.

Elroch

Yes, I think more hard evidence is needed, in principle not too difficult to get. Especially since 8000 seems ludicrously high. You could run computers against each other with different amounts of time to think, or different search depths, and record the evaluations of all of the positions as well as the results of the games. This data would provide a better founded model of the relationship between the speed, depth and strength.

The horizon effect is always there for computers. This is when something major happens just after the end of a critical branch of the search tree, the result being that the evaluation of that branch (and the original position, if the branch was critical) is completely wrong. Of course computers try to be thorough in branches where there is danger, but they always have to stop somewhere, so the horizon problem never goes away.

One intuitive idea that may be a reason for diminishing returns (and a much lower perfect rating), is that in most positions the first part of the tree has most of the information that influences a decision. When I analyse, usually I'm not trying to get a definitive answer about the position, I'm trying to see which line looks promising because an initiative is developing and/or positional gains being made. Perhaps in some way, deep chess analysis is a bit like predicting the behaviour of a complex system (such as weather), where you can spot a pattern developing (a weather front moving) and from there you can guess it will keep on moving without doing the rest of the simulation. Weather would be a great analogy if it turned out that an analogy of the "noise" in the system, meant that shorter term predictions have most of the information. But there is a fundamental difference that perfect analysis is possible with a chess game, while with weather there are types of noise in the system ("butterfly effect") that mean that long term detailed predictions are fundamentally impossible.

Niven42
waffllemaster wrote: ...Like you said though someone could run some engine tests to test it out..."

 Except we really don't know what perfect play is, yet.  Chess is still unsolved.  You might run Rybka (or such) and have it win every game, but then someone finds a combo that Rybka can't see the end of, and suddenly it has one loss.

DavidMertz1

One thing you have to take into account:  If there was a player known to be "perfect", that player's games would be studied intensely.  His own lines from previous games would be thrown at him repeatedly, until someone found a draw by force (or win, if that's what chess turns out to be.)  He could attempt to mix it up to SOME degree, sure, but there's only 20 possible opening moves.  It wouldn't be long before he'd have to start repeating one move, then another... until one of the "perfect" lines is known far enough that people could just figure out the rest.

Even without that effect, there's a finite number of moves possible, so enough googles of chess-playing monkeys would, eventually, get a draw.  (And actual chessplayers would do even better, since they'd throw out most of the moves that would obviously drop pieces or allow mate in 1.)  So the answer is not infinity.

waffllemaster
DavidMertz1 wrote:

One thing you have to take into account:  If there was a player known to be "perfect", that player's games would be studied intensely.  His own lines from previous games would be thrown at him repeatedly, until someone found a draw by force (or win, if that's what chess turns out to be.)  He could attempt to mix it up to SOME degree, sure, but there's only 20 possible opening moves.  It wouldn't be long before he'd have to start repeating one move, then another... until one of the "perfect" lines is known far enough that people could just figure out the rest.

Even without that effect, there's a finite number of moves possible, so enough googles of chess-playing monkeys would, eventually, get a draw.  (And actual chessplayers would do even better, since they'd throw out most of the moves that would obviously drop pieces or allow mate in 1.)  So the answer is not infinity.


Oh.

Elubas

An interesting question; quite hard to answer. Certainly, the highest levels of human chess today (2800s) are very good, but as humans they are of course flawed all over the place. Being able to see say 25 moves ahead with impeccable accuracy is quite unbelievable compared to just being able to see maybe 10 moves and leave anything else to strong intuition. As good as we are, I think a perfect chess player could be SO much better.

I'm guessing more like 4000-5000 elo, though as a guess, it really could be much larger or smaller than that. Sure it's hard to imagine Gary Kasparov losing to some 4500 player; beating him as handily as a 1700 would a 0, in both cases there being a 1700 difference; but it's all relative! If a person could evaluate variations that well (which is, of course, too magical for any human to ever dream of!), his advantage would be monstrous against any mortal!

(Note that I am talking about pure strength; how well this player would do in some 100 game match for example. I am not saying this rating would necessarily be achieved -- obviously, if nobody can even have a rating of 3000, it will take lots of wins to get to this 4000 mark!)

And as drawish as today's chess has become, I still don't think that means you can survive so easily against someone so ridiculously punishing -- think how a computer punishes errors, times like 20. You don't see "grandmaster draws" so much because the positions are "obviously drawn"; you see them because their results matter and neither guy wants to risk a loss or feels like fighting.

@TheMouse: rofl, good point! Laughing

Brazda
Dear Elroch, do not discard your first model (#65). As usual - when one checks his equations - please, try a few D values, and you will see the absurdity of the second equation (#71). I would suggest D=1 for one. A pretty stupid mashine (or player!), I accept, but definitely deserves positive Elo. So, I believe in your first calculation: the upper 3000-4000 range, and the diminishing advantage of further computer power (as the computer's horizon approaches the huge, but finite variability of the game).
Elroch

Looking back at this discussion after a long pause, I don't feel so reluctant to believe that the rating of a perfect player is indeed extremely high (thousands of points above current best). Our intuition can be very misleading, when based on very limited information. We know that people have ratings below 2900 so far and in recent years find that computers are getting a few hundred points higher. A reason this progress has been fairly slow is the fact that the speed of computers increases so slowly. Recall doubling processor speed only adds 40-60 rating points, and this happens less than once a year. These days, the main computer competitions have limits on the numbers of cores, which slow progress dramatically (computing speed is increasingly about numbers of cores, since the individual cores are reaching the limits of the technology). Regardless of these minor issues, there is no reason to believe that a computer a million times faster than current ones would not get near 100% score against them, and justify a rating heading for 4000. And a computer another million times faster would be how strong? And so on ...

VLaurenT

There is another factor to take into account, I think : chess is very probably a draw on perfect play. And the stronger the players, the highest the % of draws, making it difficult to widen the gap with the competition.

Elroch

hicetnunc, even if chess is a draw with perfect play, this does not mean that a player who is much stronger than another cannot usually get a win, regardless of how strong the second player is (in fact that is virtually the definition of being much stronger). While the percentage of draws does increase with the strength of the players, it is interesting that white's plus score over black also increases at the highest levels of human play. It is not that players at the highest level find it more difficult to get a plus score, it is that it is more a matter of whether they win or give up a draw, rather than whether they win or lose. Hence this fact does not provide evidence of a low threshold on the strength of players, merely that the variance of the results between two very strong players is lower than the variance of the results between two weaker players. Perhaps an example can help make it clear. If you have two strong players, one might win 50% of the time, draw 40% of the time and lose 10% of the time. If you had two weak players with the same difference in strength, one might win 70% of the time, draw 20% of the time and lose 30% of the time.  In both cases the stronger player scores 70% (hence the same difference in rating, we can assume), but there are a lot more draws in the first case.  So the increase in the number of draws with strength does not put a ceiling on the strength of players.

Elubas

2800 players may make a lot more mistakes than we think, even if it seems that they are nearly good enough to, say, always secure a draw with white.

Elubas

"The Elo of a perfect player would never settle, it would just keep going up and up and up. Anyone who doesn't realize this is an idiot."

With your logic, a perfect player's strength is as powerful as infinity.

The problem with your reasoning is that, if you played millions, maybe billions of games, in one sense you would be making what seems to be never ending progress. One problem -- you may draw the odd game somewhere. It seems hard to imagine, again, a 1700 drawing or losing to a 0, yet if a billion or trillion games were played, a draw might still happen, even if it'd be a very incredible occurence. Same as a 2800 managing a few draws against a 4500 now and then -- it would certainly hamper the 4500s progress. Indeed, if we postulate that 4500 is "perfect," then indeed, this would imply that among a huge amount of games, carlsen may get a few draws despite facing perfect play.

In other words, you seem to assume that to be a perfect player, strength wise, means you must be able to win every single time against anyone and everyone you play -- of course, not true. In the case of a perfect player, his results would depend solely on how his opponent plays.

But again, what we, or I, really mean when it comes to strength, is the strength at which the person would consistently perform, which could be determined by many long game matches. For example, scoring xx against carlsen out of 100 games.

fyy0r
uhohspaghettio wrote:

The Elo of a perfect player would never settle, it would just keep going up and up and up. Anyone who doesn't realize this is an idiot.

I agree that at some point the rating would only go up at +0.0000001 for a win, but it is still going up. I reckon it would reach 3300 fairly fast, be much slower to get to 3800 and then after a few million games settle at something like 4200 if it were able to play 2800 players, and it would take millions of games more for it to get to 4201.

Of course if it were able to play 3200 players it would reach a rating about 400 points higher.


Draws would kill a "perfect" player.  Since he is "perfect" we assume he cannot lose, but if any super GM manages to draw with such a player, due to his ELO difference (3400 vs 2800's), his points would even out quite fast.  Take Bobby Fischer for example.  Due to his massive candidates winnings vs Larsen, Taimanov and Petrosian, his FIDE was at 2785.  Spassky was at 2660.  When they played, even though Fischer won, his rating dropped because he didn't win "enough" to offset the rating difference.  The same would be true for a "perfect" player, but in this case, the rating difference is so high that he would be killed by draws.