Chess will never be solved, here's why

Avatar of llama51
tygxc wrote:

#2106
"why error rate is a linear function of time"
++ I did not assume a linear dependence.
On the contrary, I assumed a logarithmic dependence:
at 1 s/move: 1 error / 679 positions
at 1 min/move: 1 error / 3478 positions
at 1 h/move: 1 error / (3478 * 3478 / 679) positions = 1 error / 17815 positions
at 60 h/move: 1 error / (17815 * 17815 / 3478) positions = 1 error / 91251 positions

 "this could only be a minimum error rate "
++ As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

"this could only be a relative error rate"
++ By the generally accepted hypothesis that chess is a draw, each decisive game must contain at least 1 absolute error: a move that turns a drawn position into a lost one.

Eh, I don't know why I said that. Yeah, the rule you used is: when the input is ×60, the output is ×5... that's obviously not linear... but I still don't know the logic for why that works, other than that you had 2 data points and are just playing with ratios.

 

As for draws, oh I see, you're saying draws that happen after zero errors are much more likely than draws that happen after 2, so we can just ignore games with multiple errors.

Eh... isn't this making a lot of assumptions? For example, let's say the errors per game are not in the single digits for either engine, and the winner is routinely committing several fewer errors than the loser. Why is this scenario unlikely? Current SF is something like 300 points stronger (link below) than the version that played AZ (it was SF8). Is it really sensible that the error rate was so low 5 years ago, when today it doesn't take hundreds of games to determine which engine is best for e.g. the CCRL 40/40 list?

https://www.chessprogramming.org/images/0/04/SfElo.png
https://ccrl.chessdom.com/ccrl/4040/

 

Avatar of tygxc

#2109

"you had 2 data points"
++ Yes, I only have 2 and try to make best use of those.

"let's say the error rate is not in the single digits for either engine"
++ Well, the error rate is that low, and it gets lower with more time.

"today it doesn't require 100s of games to determine which engine is best"
In the TCEC superfinals they have to impose slightly unbalanced openings to avoid all draws.
They also give all information about evaluation, nodes/s, and 7-man endgame tablebase hits.
https://tcec-chess.com/ 

Avatar of MARattigan
tygxc wrote:

#2106
"why error rate is a linear function of time"
++ I did not assume a linear dependence.
On the contrary, I assumed a logarithmic dependence:
at 1 s/move: 1 error / 679 positions
at 1 min/move: 1 error / 3478 positions
at 1 h/move: 1 error / (3478 * 3478 / 679) positions = 1 error / 17815 positions
at 60 h/move: 1 error / (17815 * 17815 / 3478) positions = 1 error / 91251 positions

 "this could only be a minimum error rate "
++ As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

"this could only be a relative error rate"
++ By the generally accepted hypothesis that chess is a draw, each decisive game must contain at least 1 absolute error: a move that turns a drawn position into a lost one.

As I pointed out here, SF14's recommendations after 50 moves in any phase will be purely random, because its evaluation of the position after any move will be exactly 0.00 (unless the move happens to be mate).

So the error rate will then be (moves available - perfect moves available) / (moves available), where "perfect moves" refers to the perfect moves in your no-50-move-rule game.
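Expressed as code, under the stated assumption that the engine then picks uniformly at random among its 0.00-scored moves (a minimal sketch, nothing more):

```python
def random_pick_error_rate(moves_available: int, perfect_moves: int) -> float:
    """Chance that a uniformly random pick misses every perfect move,
    where 'perfect' is defined by the no-50-move-rule game."""
    return (moves_available - perfect_moves) / moves_available
```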

Apart from not proving your generally accepted hypothesis (part of the point of the exercise), you have taken "at least 1" to mean "at most 1" (not to mention what you do with it after that).

It should tell you something that trying this out in a situation where the error rate can be accurately determined (as I did here) gives an average error rate of 1 error / 32 positions, with a general tendency to increase with think time. Rather a far cry from 1 error / 91251 positions.

Avatar of Elroch
tygxc wrote:

#2109

"you had 2 data points"
++ Yes, I only have 2 and try to make best use of those.


You seem to miss the fact that the best you can do with inadequate data is to fail to draw a conclusion. More "trying" doesn't change that.

Avatar of haiaku
llama51 wrote:

Eh... isn't this making a lot of assumptions?

"A lot" really underestimates the number of assumptions @tygxc makes to support his claims. happy.png

Avatar of Elroch

There is another fact which makes all this nonsense invalid. It is that (especially if the true value of chess is a draw) real-world chess strength (measured by Elo) is not merely about playing accurately according to absolute chess truth (a 32-piece tablebase). Accuracy would suffice in winning positions, but in drawing positions strength is about giving an imperfect opponent the opportunity to make mistakes.

To demonstrate this, observe that to a tablebase there are no preferred drawing moves in a drawing position: they are all equally correct.

So consider the following strategy for a player with a 32-piece tablebase: in a theoretically drawing position, including perhaps the starting position, pick a move which is tablebase-drawing but which has the lowest evaluation according to some engine. Now play a weak player, telling him to play cautiously, avoid leaving pieces hanging, and keep his king safe.

The result is likely to be a draw, the weak player being able to find a move that doesn't turn a more active drawing position into a losing one.
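As a minimal sketch of that selection rule: the `Tablebase` and `Engine` interfaces below are hypothetical stand-ins, not real libraries; only the selection logic follows the strategy described above.

```python
from typing import Protocol

class Tablebase(Protocol):
    def is_draw(self, position_after_move: str) -> bool: ...

class Engine(Protocol):
    def evaluate(self, position_after_move: str) -> float: ...  # score from the mover's side

def trappy_drawing_move(candidates: dict[str, str], tb: Tablebase, engine: Engine) -> str:
    """Among moves that preserve the tablebase draw, pick the one the engine
    rates worst: equally 'correct' by the tablebase, but the one that poses
    the most practical problems for a fallible opponent."""
    drawing = {move: pos for move, pos in candidates.items() if tb.is_draw(pos)}
    return min(drawing, key=lambda move: engine.evaluate(drawing[move]))
```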

The relevance to the consideration of error rates is that the error rate is highly dependent on the opponent rather than being fixed for a player. A strong player gives a weak player opportunities to make mistakes rather than merely preserving a theoretical draw.

If you have never played a 4000-rated player, you can't assume your error rate against it will be the same as against a 3300-rated player. And no one has played a 4000-rated player.

Avatar of haiaku

I agree. The WDL percentages that an evaluation function returns do not measure the probability that the game-theoretic value is a win rather than a draw or a loss, imo. They measure how difficult the position is for a player to handle in practice. I don't think we can infer the value of the game from that.

Avatar of Elroch

A good example of this is that the Berlin defense has been viewed as an impenetrable dead draw at the highest level of human chess, producing games in which one side likely never reaches a winning position. @tygxc might guess an error rate near zero in these games, and he might be right.

Yet if a world champion plays the Berlin defense against Stockfish 14 and tries for a draw, he is going to lose the large majority of games.

Avatar of haiaku
tygxc wrote:

As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

What is that, conditional probability? You mean:

P(2nd error | 1st error) = P(1st error and 2nd error) / P(1st error), and thus

P(1st error and 2nd error) = P(2nd error | 1st error) * P(1st error)?

But if you consider the errors statistically independent and of equal probability, there is no need for conditional probability:

P(1st error and 2nd error) = P(2nd error) * P(1st error) = P²(1st error). On the other hand, if the errors are not statistically independent, how can you say that P(2nd error | 1st error) ~= P(1st error)? I don't quite follow.
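As a numerical check of the approximation under discussion: a minimal sketch, assuming (as the argument implicitly does) independent errors at a fixed per-position rate over an 80-position game, using the thread's 1 min/move figure.

```python
from math import comb

def p_at_least(k: int, n: int, p: float) -> float:
    """P(at least k errors) in n independent positions, error probability p each."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 80        # a 40-move game = 80 positions, the thread's conversion
p = 1 / 3478  # 1 error / 3478 positions, the 1 min/move figure
p1, p2 = p_at_least(1, n, p), p_at_least(2, n, p)
print(f"P(>=1) = {p1:.5f}, P(>=2) = {p2:.2e}, P(>=1)^2 = {p1**2:.2e}")
# -> P(>=2) ~ 2.6e-4 vs P(>=1)^2 ~ 5.2e-4: same order of magnitude, not equal
```

Under these assumptions "negligible relative to P(1 error)" holds, even though the exact P(1 error)^2 identity does not.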

tygxc wrote:

At 1 s/move: 88.2% draw = 11.8% error / game = 1 error / 679 positions
At 1 min/move: 97.7% draw = 2.3% error / game = 1 error / 3478 positions

I infer that by extrapolation from the peer reviewed Alpha Zero paper. I know the published error rate at 1 s/move and 1 min/move [ . . . ]

Can you please post a reference to the exact paper, page and line, where you read those error rates?

Avatar of playerafar


Again regarding the forum topic - an idea.
Not what 'solved' means or should mean, but what it could mean. Multiple possibilities.
Some of those possibilities have concrete answers.

Will any human ever reach a point where somebody shows him or her a position and, without any computer assistance, the person could instantly know and say with total accuracy whether the position is a win (and for whom) or a draw?

The answer is No.
There are under 32 million seconds in a year and under ten billion seconds in a human lifetime.
Even Capablanca wouldn't be up to it!
Unless you could reincarnate him and then make him immortal.
Then it would take him some time, and would he be interested?
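The arithmetic behind those figures, as a quick check (the 100-year lifetime is my round number):

```python
seconds_per_year = 365.25 * 24 * 3600          # ~31.6 million: under 32 million
seconds_per_lifetime = 100 * seconds_per_year  # ~3.16 billion: well under ten billion
print(f"{seconds_per_year:,.0f} s/year, {seconds_per_lifetime:,.0f} s per 100-year lifetime")
```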

So the opening post refers to computers, then.
But there are still many possibilities for what the forum title could refer to.
1) It could refer to 'any chess position'. 
2) Or - to the opening position and its 20 options for White.

1) Could a point be reached where any chess position is shown to a computer and it can instantly pronounce whether a win is available or not, and to which player?
2) Is that completely equivalent to: the 20 possible opening moves for White have each been determined by computers to be wins, losses, or no win available to either player?

Are 1) and 2) above equivalent to each other? Maybe.
But for the two statements to be equivalent, the computers would also have to have taken care of positions that may or may not be legally reachable from the opening position.

But many people might prefer one or the other.
Or might argue that if either of them is established, then the other one is too.

But another interpretation is that you couldn't have (2) without (1):
at least one of the situations needs the other.
That looks much more to the point.

So now - next prototype of things the forum topic might refer to intentionally or unintentionally:

Prototype:
Chess is 'solved' by computers if a situation is reached in which, shown any chess position, the computers could and would instantly and accurately:
1) indicate whether a forced win is available from that position, and to which of the two players, with proof available;
2) qualify, in each winning case, whether that win takes more or fewer than 50 moves with no captures or pawn moves;
3) indicate, again with proof, if there is no forced win available from that position to either player;
4) and qualify whether the position could have arisen legally from the original 32-piece opening position or not.
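Spelled out as an interface, the prototype amounts to demanding an oracle with roughly this signature. This is only one formalization of the list above, and nothing like this `solve` function exists or is feasible:

```python
from dataclasses import dataclass
from typing import Literal, Optional

@dataclass
class StrongVerdict:
    value: Literal["white wins", "black wins", "no forced win"]
    within_50_moves: Optional[bool]  # for wins: achievable without 50 quiet moves?
    proof: object                    # the proof backing the verdict
    legally_reachable: bool          # reachable from the 32-piece opening position?

def solve(position: str) -> StrongVerdict:
    """Hypothetical strong-solving oracle; no such computation is feasible today."""
    raise NotImplementedError
```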

So far I have not found a way to qualify it properly in under 12 lines.
It takes into account many of the 2000+ posts in the forum so far.

Many might say: "that is only a definition of 'strong' solving.
The topic could refer to 'weak' solving."
If the topic hypothetically refers to 'weak' solving,
then it could be argued that chess is already solved.
So the topic couldn't be referring to that.

It's the strong solving that hasn't happened and may never happen.
Nobody in the forum is claiming that the strong solving could happen in the foreseeable future. (Is there such a thing?)
The term 'foreseeable future' might need 'work'.

Avatar of Elroch

This paper would probably be the one:

https://deepmind.com/research/publications/2020/mastering-atari-go-chess-and-shogi-by-planning-with-a-learned-model

Avatar of tygxc

#2115
"Can you please post a reference to the exact paper, page and line, where you read those error rates?"
As I wrote before, but you probably did not read:
https://arxiv.org/pdf/2009.04374.pdf
figure 3 page 7
Besides, for all the critics here: those are 2 data points, but in fact they are 4 data points: 0 errors at infinite time and infinitely many errors at 0 time. I do not see what else I could fit through those 4 data points.
Maybe I extract too much information from too few data,
but most here seem unable to extract any information from what is available and then jump to their own conclusions without any facts or figures to support them at all.

Avatar of playerafar

@MARattigan
Was going back over posts ...
Saw this from you in post #1907.  Regarding castling rights:

"This would probably add about 12000 new tablebases to the 3000 or so existing tablebases for 3-7 men"

That looks disproportionate.
For castling to be theoretically possible, at least one king would have to be on its home square, combined with at least one of its rooks on that rook's home square.
But those two factors in combination would be present in far under 1% of 7-piece and smaller positions.
So the leap from 3,000 to 15,000 tablebases because of one factor present in less than 1% of positions doesn't look valid to me.
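One distinction worth making: the 12000 figure counts tables, not positions. A tablebase is indexed by material (plus, here, a castling-rights combination), so the rarity of qualifying positions does not by itself reduce the number of tables. A back-of-envelope sketch under a deliberately simple model of my own (one extra table per realizable rights combination, not MARattigan's stated method):

```python
def rights_subsets(rooks: int) -> int:
    """Realizable castling-rights sets for one side with `rooks` rooks:
    0 rooks -> only 'no rights'; 1 rook -> none, K-side, or Q-side
    (one rook cannot stand on a1 and h1 at once); 2+ rooks -> all four."""
    return 1 if rooks == 0 else 3 if rooks == 1 else 4

def extra_tables(white_rooks: int, black_rooks: int) -> int:
    # every combination of both sides' rights, minus the existing no-rights table
    return rights_subsets(white_rooks) * rights_subsets(black_rooks) - 1

print(extra_tables(1, 0))  # KR-vs-K-type material: 2 extra tables
print(extra_tables(1, 1))  # KR-vs-KR-type material: 8 extra tables
```

Whether that averages out to ~12000 across the ~3000 signatures depends on how many of them contain rooks, which this sketch does not settle.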

Avatar of haiaku
tygxc wrote:

As I wrote before, but you probably did not read:
https://arxiv.org/pdf/2009.04374.pdf
figure 3 page 7 [ . . . ]

You mean figure 2, page 7? But where do you read that those are error rates? They are results of self-play games. Why do you infer the error rate from the won or lost games? Errors might have occurred in all games, independently of the result. Besides, you are implicitly assuming that the game-theoretic value is a draw; if the game were a win for White or for Black, following your reasoning we should think that at least one error occurred in all the drawn games.

Avatar of playerafar

In computer self-play games, many of the errors would be repeated by both sides.
Or just inferior moves.
More likely positionally inferior than tactically inferior.
Or just moves the computer erroneously assigns as 'superior' where there are in fact many comparable moves.
Computers detecting their own errors?
Basing the error percentage on the computers' own reports of same?

Regarding 'facts and figures': people disagreeing with posts can use the same facts and figures already presented in the posts they're disagreeing with.
There's no 'imperative obligation' to obtain one's own 'peer-reviewed' facts and figures.
Idea: present one's own logical arguments instead of parroting stuff from the net.

Avatar of tygxc

#2120
"You mean figure 2, page 7?" 
Yes, figure 2, my bad.

"you are implicitly assuming that the game-theoretic value is a draw"
++ Yes, I wrote that above: I assume the generally accepted hypothesis that chess is a draw to be true. From that it follows that any decisive game contains an odd number of errors, at least 1. The number of decisive games gives the error rate in absolute terms. We do not know which move was the error, but we know it is there. The fact that the draw rate goes up with more time even supports the hypothesis.
To the mathematical nitpickers: I fit
error = a / time^b
and estimate the 2 parameters a and b from the 2 data points; that is mathematically sufficient but not redundant. In plain English: multiplying the time by 60 divides the errors by 5.6.

"if the game was a win for White or for Black, following your reasoning we should think that at least one error occurred in all the drawn games"
++ Yes, that is right: if chess were a win for white or even for black, then all the drawn games would contain an odd number of errors, at least 1. That would lead to the odd outcome that more time = more errors.

"Errors might have occurred in all games, independently from the result."
++ That is right, under the hypothesis that chess is a draw a drawn game might contain an even number of errors, at least 2. However at 1 error per 10^5 positions at 60 h/move, the probability of 2 errors would be 1 in 10^10 << 10^5.

#2115
"if errors are not statistically independent, how can you say that P(2nd error | 1st error) ~= P(1st error)?"
++ There is hardly any statistical dependence. If the 1st error allows a checkmate in 1, then that ends the game and there can be no 2nd error, but that is rare. For all practical purposes
P(2 errors) = P(1 error)^2.
I wrote the exact formula with conditional probability
P(error 1 & error 2) = P(error 2|error 1) * P (error 1)
only to avoid mathematical hair splitting.


Avatar of tygxc

#2121

"In computer self play games - many of the errors would be repeated by both sides."
++ There are not that many errors, and the error rate goes down with more time:
time * 60 = errors / 5.6

"Or - just inferior moves. "
++ There are no 'inferior moves' a move is either an error or not: it either changes the game state draw / win / loss or it does not change it.

"More likely positionally inferior than tactically inferior."
++ Positionally inferior = tactically inferior at greater depth.

"the computer erroneously assigns as 'superior' where there are in fact many comparable moves"
++ There are no 'superior moves': all moves that do not change the game state (draw/win/loss) are objectively equally good.

"Computers detecing their own errors? 
Basing the error percentage on the computers' own reports of same?"
Under the generally accepted hypothesis that chess is a draw each decisive game must contain an odd number of errors at least one. So from the number of decisive games we can infer the error rate.

"Regarding 'facts and figures' people disagreeing with posts can use the same facts and figures already presented in the posts they're disagreeing with."
++ Yes, I would wellcome if people backed up their claims with facts and figures. Now it is more like I cannot conclude what I conclude from the facts and figures. Then they jump to their own conclusions without any backing.

"There's no 'imperative obligation' to obtain one's own 'peer-reviewed' facts and figures."
++ That is correct: there is no obligation, and neither is there any ownership of the facts and figures I present. Of course people are free to base their arguments on the facts and figures I present, or on facts and figures they can bring from other reputable sources.

"present one's own logical arguments instead of parroting stuff from the net."
++ I do present my own logical arguments and back them up with facts and figures from reputable sources. If I were to use facts and figures without quoting the sources, then people would question those facts and figures. There are even those here who demand page and line for each source I quote.

Avatar of haiaku
tygxc wrote:

"you are implicitly assuming that the game-theoretic value is a draw"
++ Yes, I wrote that above: I assume the generally accepted hypothesis that chess is a draw to be true. From that it follows that any decisive game contains an odd number of errors, at least 1. The number of decisive games gives the error rate in absolute terms. We do not know which move was the error, but we know it is there. The fact that the draw rate goes up with more time even supports the hypothesis.
To the mathematical nitpickers: I fit
error = a / time^b
and estimate the 2 parameters a and b from the 2 data points; that is mathematically sufficient but not redundant. In plain English: multiplying the time by 60 divides the errors by 5.6.

How can you say the error rate is that function of time? And those two formulations are not consistent with each other. Besides, you admit that the games may contain any odd number of errors, but then you make your calculations assuming there is exactly one, not at least one. In other words, if errors by both sides compensate each other, that parameter "a" can be any number.

Avatar of goodwitch13

In any math problem you are solving for something... chess is a series of moves... not unlike algebra or fractions. You raise a value, you remove a value, in hopes of solving for X, or calculating the next half-move to count for the whole move you intend to make next. Chess isn't solved by who wins and loses. Chess is solved by each piece being a number that is part of infinite equations, each one having its own solution. E.g., opening knight g1 to f3 puts me in position to take the pawn at e5. If I am solving for how to take the pawn with the knight, I know in 2 moves I am able to do this. As a problem it could be written as (g1+f3)(f3-e5)=x(pawn) or as (2+2)(2-2)...(+4)(-4)=0(pawn). Another example is the opening move (QPD2+QPD3)=-1qp: by moving forward one you lost the space behind you for the pawn. Followed by QBC1-KPE5=4(pawn) or (-5+1)=X...(-4)=X: take away 5 spaces, gaining a new position of 1. By adding 4 to both sides, X solves for the 4 spaces to move to take the pawn. Yes, chess can be solved. It just depends on how you wish to play: as a strategist or as a mathematician.

Avatar of tygxc

#2140
"How can you say the error rate is that function of time?"
At infinite time the error rate is 0. At zero time the error rate is infinite.
So the simplest monotone function that satisfies those 2 asymptotic boundary conditions is

error = a / time^b

or, if you prefer,

log(error) = log(a) - b * log(time)

"And those two equations are not coherent." ++ What do you mean?

"the games may contain any odd number of errors, but then you make your calculations assuming there is only one"
++ P(5 errors) << P(3 errors) << P(1 error) 

"if errors by both sides do compensate each other, that parameter "a" can be any number."
We are talking about 11.8% decisive games @ 1 s/move and 2.1% decisive games @ 1 min/move. That justifies to neglect the occurence of 3, 5, 7, 9... errors in 1 game, even more so at 60 h/move.
It were different if you took the results of 2 beginners: that would lead to much more decisive games because of more errors per game.
All this relates to the 1st step: inferring the error rate from the autoplay game results.
Estimating the parameters a & b comes from the 2nd step: extrapolating from 1 s/move and 1 min/move to 60 h/move. That is independent from neglecting 3, or 5, or 7 errors and only assuming 1 error as there are only 2.1% decisive games @ 1 min/move.
The 3rd step is converting error/game to error per position, assuming games of 40 moves i.e. 80 positions.
The 4th step is then to infer from that that the top 4 engine moves only produce 1 error in 10^20 positions and thus suffice as only 10^17 positions need consideration.
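For readers who want to check the arithmetic of steps 1 to 3, here is a sketch that reproduces the chain of numbers under tygxc's own assumptions (a power-law fit through the two data points, 80 positions per game). Reproducing the arithmetic is of course not an endorsement of the extrapolation, which is exactly what is in dispute.

```python
import math

# Step 1: decisive-game rates quoted in the thread, taken as errors per game
e1, t1 = 0.118, 1.0   # 11.8% decisive at 1 s/move
e2, t2 = 0.023, 60.0  # 2.3% decisive at 1 min/move

# Step 2: fit error = a / time^b through the two points
b = math.log(e1 / e2) / math.log(t2 / t1)  # ~0.40
a = e1 * t1**b

# Step 3: errors/game at 60 h/move, converted to positions per error (80 positions/game)
t = 60 * 3600  # 60 h in seconds
errors_per_game = a / t**b
print(f"b = {b:.3f}; at 60 h/move: 1 error per {80 / errors_per_game:,.0f} positions")
# -> roughly 1 error per 91-92 thousand positions, matching the thread's 1/91251 up to rounding
```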