Chess will never be solved, here's why

tygxc

#2098
"How much does one have to secretly pay 'peer reviewers' to get them to favorably 'peer-review' you ?"
++ That is not possible. Only the editor knows to which peers he anonymously sends the manuscript for review; the author does not know. The reviewers see only the paper; they do not know who the author is. After the review the author still does not know who reviewed his paper, though some reviewers leave hidden hints in their comments.
If I were to submit a paper "On the number of sensible chess positions", then the editor presumably would send it to Labelle, Tromp, and Gourion for a review.
If I were to submit a paper "Chess is solved", then the editor presumably would send it to van den Herik, Allis, Allen, Walker, Schaeffer, and Gasser for a review.

llama51
playerafar wrote:

This next one might be a better example.
Still not ideal - but maybe a lot 'closer'.  
Dr. Joseph Mercola - perhaps the biggest quack in all of human history -
even he - has been supported by 'peer review' ...

http://marktaliano.net/peer-reviewed-manuscript-concludes-that-cdc-massively-inflates-covid-19-case-and-death-numbers-with-creative-statistics/

How much does one have to secretly pay 'peer reviewers' to get them to favorably 'peer-review' you ?
And as I read further about peer reviews - there are even so-called 'anonymous' peer reviews.

I mean... there are other metrics involved.
In which journal was it published? How many people have cited it? How many papers has this person published? How many citations does he have?

And even with Jan Hendrik Schön, science is self-correcting. That's the whole point. If you want blind trust in authority, visit a local church.

Javiisern

lol

llama51
tygxc wrote:

#2086

"As the error rate is what low?"
++ At 1 s/move: 88.2% draw = 11.8% error / game = 1 error / 679 positions
At 1 min/move: 97.7% draw = 2.3% error / game = 1 error / 3478 positions
Extrapolating: at 60 h/move: 1 error / 10^5 positions

What's the argument for why error rate is a linear function of time (from 1s/move all the way to 60h/move no less)?

Intuitively, thinking time has diminishing returns. The deeper the search, the more opportunity for error, particularly when the software is not tuned or tested for months-long games.

Also, this could only be a minimum or relative error rate, correct? Without knowing one player's error rate it should be impossible to estimate the other's based purely on the draw rate.

tygxc

#2106
"why error rate is a linear function of time"
++ I did not assume a linear dependence.
On the contrary, I assumed a logarithmic dependence:
at 1 s/move:    1 error / 679 positions
at 1 min/move:  1 error / 3478 positions
at 1 h/move:    1 error / (3478 * 3478 / 679) positions = 1 error / 17815 positions
at 60 h/move:   1 error / (17815 * 17815 / 3478) positions = 1 error / 91251 positions

 "this could only be a minimum error rate "
++ As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

"this could only be a relative error rate"
++ By the generally accepted hypothesis that chess is a draw, each decisive game must contain at least 1 absolute error: a move that turns a drawn position into a lost one.

tygxc

#2104
Yes, that is correct.

Every editor of a mathematics journal receives some proofs of the Riemann Hypothesis.
Most get stopped by the editor himself or by the peer reviewers.

Some made it to publication, but then some reader pointed out a flaw and the paper had to be officially retracted, which reflects badly on the author, on the reviewers, and on the editor.
The famous paper on cold nuclear fusion was also published and had to be retracted.

On the other hand, the PhD dissertation of Einstein was originally rejected.
Einstein published his most famous paper on relativity in the relatively (no pun intended) obscure journal Annalen der Physik to bypass incompetent reviewers and editors of the bigger journals.

llama51
tygxc wrote:

#2106
"why error rate is a linear function of time"
++ I did not assume a linear dependence.
On the contrary, I assumed a logarithmic dependence:
at 1 s/move:    1 error / 679 positions
at 1 min/move:  1 error / 3478 positions
at 1 h/move:    1 error / (3478 * 3478 / 679) positions = 1 error / 17815 positions
at 60 h/move:   1 error / (17815 * 17815 / 3478) positions = 1 error / 91251 positions

 "this could only be a minimum error rate "
++ As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

"this could only be a relative error rate"
++ By the generally accepted hypothesis that chess is a draw, each decisive game must contain at least 1 absolute error: a move that turns a drawn position into a lost one.

Eh, I don't know why I said that. Yeah, the rule you used is: when the input is ×60, the output is ×5... that's obviously not linear... but I still don't know the logic for why that works, other than that you had 2 data points and are just playing with ratios.

 

As for draws, oh I see: you're saying draws that happen after zero errors are much more likely than draws that happen after two, so we can just ignore games with multiple errors.

Eh... isn't this making a lot of assumptions? For example, let's say the error rate is not in the single digits for either engine, and the winner routinely commits several fewer errors than the loser. Why is this scenario unlikely? Current SF is something like 300 points stronger (link below) than the version that played AZ (SF8 played AZ). Is it really sensible that the error rate was so low 5 years ago, when today it doesn't require hundreds of games to determine which engine is best for e.g. CCRL 40/40?

https://www.chessprogramming.org/images/0/04/SfElo.png
https://ccrl.chessdom.com/ccrl/4040/
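
For context on the 300-point figure, the standard Elo expectation formula (textbook background, not from this thread) says such a gap is already very lopsided:

    # Standard Elo expected-score formula.
    def expected_score(elo_diff: float) -> float:
        return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

    print(expected_score(300))   # ~0.849: the stronger engine scores ~85%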

 

tygxc

#2109

"you had 2 data points"
++ Yes, I only have 2 and try to make the best use of those.

"let's say the error rate is not in the single digits for either engine"
++ Well, the error rate is that low, and it gets lower with more time.

"today it doesn't require 100s of games to determine which engine is best"
++ In the TCEC superfinals they have to impose slightly unbalanced openings to avoid all draws.
They also give all information about evaluation, nodes/s, and 7-man endgame tablebase hits:
https://tcec-chess.com/ 

MARattigan
tygxc wrote:

#2106
"why error rate is a linear function of time"
++ I did not assume a linear dependence.
On the contrary, I assumed a logarithmic dependence:
at 1 s/move:    1 error / 679 positions
at 1 min/move:  1 error / 3478 positions
at 1 h/move:    1 error / (3478 * 3478 / 679) positions = 1 error / 17815 positions
at 60 h/move:   1 error / (17815 * 17815 / 3478) positions = 1 error / 91251 positions

 "this could only be a minimum error rate "
++ As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

"this could only be a relative error rate"
++ By the generally accepted hypothesis that chess is a draw, each decisive game must contain at least 1 absolute error: a move that turns a drawn position into a lost one.

As I pointed out here, SF14's recommendations after 50 moves in any phase will be purely random because its evaluation of the position after any move will be exactly 0.00 (unless the move happens to be mate).

So the error rate will then be (moves available - perfect moves available) / (moves available), where the perfect moves are the perfect moves in your no-50-move-rule game.
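
A one-liner to make that formula concrete (the move counts below are made up purely for illustration):

    # Error rate of a random choice when every evaluation reads 0.00.
    moves_available, perfect_moves = 30, 5     # hypothetical counts
    print((moves_available - perfect_moves) / moves_available)   # ~0.83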

Apart from not proving your generally accepted hypothesis (part of the point of the exercise), you have taken "at least 1" to mean "at most 1" (not to mention what you do with it after that).

It should tell you something that trying it out in a situation where the error rate can be accurately determined (as I did here) gives an average error rate of 1 error / 32 positions, with a general tendency to increase with think time. Rather a far cry from 1 error / 91251 positions.

Elroch
tygxc wrote:

#2109

"you had 2 data points"
++ Yes, I only have 2 and try to make the best use of those.


You seem to miss the fact that the best you can do with inadequate data is to fail to draw a conclusion. More "trying" doesn't change that.

haiaku
llama51 wrote:

Eh... isn't this making a lot of assumptions?

"A lot" really underestimates the number of assumptions @tygxc makes to support his claims. happy.png

Elroch

There is another fact which makes all this nonsense invalid. It is that, especially if the true value of chess is a draw, real-world chess strength (measured by Elo) is not merely about playing accurately according to absolute chess truth (a 32-piece tablebase). Accuracy would suffice in winning positions, but in drawing positions strength is about giving an imperfect opponent the opportunity to make mistakes.

To demonstrate this, observe that to a tablebase there are no preferred drawing moves in a drawing position: they are all equally correct.

So consider the following strategy for a player with a 32-piece tablebase: in a theoretically drawn position, perhaps including the starting position, pick a move which is tablebase-drawing but which has the lowest evaluation according to some engine. Now play a weak player, telling him to play cautiously, avoid leaving pieces hanging, and keep his king safe.

The result is likely to be a draw, the weak player being able to find moves that don't turn such an unambitious drawn position into a lost one.

The relevance to the consideration of error rates is that the error rate is highly dependent on the opponent rather than being fixed for a player. A strong player gives a weak player opportunities to make mistakes rather than merely preserving a theoretical draw.

If you have never played a 4000-rated player, you can't assume your error rate against it will be the same as against a 3300-rated player. And no one has played a 4000-rated player.

haiaku

I agree. The WDL percentages that an evaluation function returns do not, in my opinion, measure the probability that the game-theoretic value of the position is a win rather than a draw or a loss. They measure how difficult the position is for a player to handle in practice. I don't think we can infer the value of the game from that.

Elroch

A good example of this is that the Berlin defense has been viewed as an impenetrable dead draw at the highest level of human chess, producing games in which one side likely never reaches a winning position. @tygxc might guess an error rate near zero in these games, and he might be right.

Yet if a world champion plays the Berlin defense against Stockfish 14 and tries for a draw, he is going to lose the large majority of games.

haiaku
tygxc wrote:

As the error rate is that low, the occurrence of two or more errors can be neglected.
P(2 errors) = P(2 errors | 1 error) * P(1 error) ~= P(1 error)^2 << P(1 error)

What is that, conditional probability? You mean:

P(2nd error | 1st error) = P(1st error and 2nd error)/P(1st error) and thus

P(1st error and 2nd error) = P(2nd error | 1st error) * P(1st error)?

But if you consider the errors statistically independent and of equal probability, there is no need for conditional probability:

P(1st error and 2nd error) = P(2nd error) * P(1st error) = P(1st error)^2. On the other hand, if the errors are not statistically independent, how can you say that P(2nd error | 1st error) ~= P(1st error)? I don't understand exactly.
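
A quick numeric check of the identities in question, reusing tygxc's 1 s/move figure purely for illustration:

    # The chain rule always holds: P(1st and 2nd) = P(2nd | 1st) * P(1st).
    # Setting P(2nd | 1st) equal to P(1st) is exactly the independence
    # assumption being questioned here.
    p1 = 0.118            # P(1st error), tygxc's 1 s/move figure
    p2_given_1 = p1       # the independence assumption
    print(p2_given_1 * p1)   # ~0.0139 = P(1st error)**2, << 0.118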

tygxc wrote:

At 1 s/move: 88.2% draw = 11.8% error / game = 1 error / 679 positions
At 1 min/move: 97.7% draw = 2.3% error / game = 1 error / 3478 positions

I infer that by extrapolation from the peer-reviewed AlphaZero paper. I know the published error rate at 1 s/move and 1 min/move [ . . . ]

Can you please post a reference to the exact paper, page and line, where you read those error rates?

playerafar


Again regarding the forum topic - an idea -
which is not what 'solved' means or should mean -
but what it could mean. Multiple possibilities.
Some of those possibilities have concrete answers.

     Will any human ever reach a point where somebody shows him or her a position and then, without any computer assistance, the person could instantly know and say with total accuracy whether the position is a win (and for whom) or a draw?

The answer is No. 
There are under 32 million seconds in a year and under ten billion seconds in a human lifetime.
Even Capablanca wouldn't be up to it ! 
Unless you could reincarnate him and then make him immortal.
Then - it would take him some time, and would he be interested?
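
The back-of-envelope numbers (the legal-position count is Tromp's published estimate of roughly 4.8 * 10^44; the lifetime is a generous century):

    # Seconds available to a human versus positions to know.
    seconds_per_year = 60 * 60 * 24 * 365    # 31,536,000: 'under 32 million'
    lifetime = 100 * seconds_per_year        # ~3.2e9: 'under ten billion'
    legal_positions = 4.8e44                 # Tromp's estimate
    print(legal_positions / lifetime)        # ~1.5e35 positions/second needed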

So - the opening post refers to computers then.  
But then there's still lots of possibilities of what the forum title could refer to.
1) It could refer to 'any chess position'. 
2) Or - to the opening position and its 20 options for White.

1) Could a point be reached where any chess position is shown to a computer and it can instantly pronounce whether a win is available, and to which player?
2) Is that completely equivalent to this: the 20 possible opening moves for White would each have been determined by computers to be a win, a loss, or no win available to either player?

Are 1) and 2) above equivalent to each other? Maybe.
But that would mean the computers would also have to have taken care of positions that may or may not be legally possible to arise from the opening position, in order for the two statements to be equivalent.

But many people might prefer one or the other.
Or might argue that if either of them is established - then the other one is too.

But another interpretation is that you couldn't have (2) without (1),
with at least one of the situations requiring the other.
That looks much more to the point.

So now - next prototype of things the forum topic might refer to intentionally or unintentionally:

Prototype:
Chess is 'solved' by computers if a situation is reached in which, shown any chess position, the computers could and would instantly and accurately indicate: whether a forced win is available from that position, and to which of the two players; whether such a win takes more than 50 moves without a capture or a pawn move; or whether instead no forced win is available to either player; and also whether the subject position could have arisen legally from the original 32-piece opening position - with proof available in each case.
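
A sketch of that definition as a hypothetical program interface; every name below is invented for illustration, and nothing remotely like it exists:

    # Hypothetical 'strong solving' oracle matching the definition above.
    from dataclasses import dataclass
    from typing import Literal, Optional

    @dataclass
    class Verdict:
        value: Literal["white wins", "black wins", "draw"]
        win_within_50_move_rule: Optional[bool]  # None when the value is a draw
        legally_reachable: bool   # reachable from the 32-piece opening position?
        proof: object             # some independently checkable certificate

    def strong_solve(position_fen: str) -> Verdict:
        """Instantly return the exact game-theoretic verdict for any position."""
        raise NotImplementedError("this is precisely what has not happened")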

So far - I have not found a way to properly qualify it in under 12 lines.
That takes into account many of the 2000+ posts so far in the forum.

Many might say: "That is only a definition of 'strong' solving.
The topic could refer to 'weak' solving."
If the topic hypothetically refers to 'weak' solving,
then it could be argued that chess is already solved.
So the topic couldn't be referring to that.

It's the strong solving that hasn't happened and that may never happen.
There's nobody in the forum claiming that the strong solving could happen in the foreseeable future. (Is there such a thing as that?)
The term 'foreseeable future' might need 'work'.

Elroch

This paper would probably be the one:

https://deepmind.com/research/publications/2020/mastering-atari-go-chess-and-shogi-by-planning-with-a-learned-model

tygxc

#2115
"Can you please post a reference to the exact paper, page and line, where you read those error rates?"
As I wrote before, but you probably did not read:
https://arxiv.org/pdf/2009.04374.pdf
figure 3 page 7
Besides, for all the critics here: there are 2 data points, but in fact there are 4 data points: 0 errors at infinite time and infinitely many errors at 0 time. I do not see what else I could fit through those 4 data points.
Maybe I extract too much information from too little data,
but most here seem unable to extract any information from what is available, and then jump to their own conclusions without any facts or figures to support them at all.
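
For what it's worth, one family of curves that does pass through all four of those points is a power law in time; a sketch of that fit (my reconstruction, not tygxc's stated method):

    # e(t) = c * t**(-a): e -> infinity as t -> 0, e -> 0 as t -> infinity,
    # fitted to the two real data points.
    import math
    t1, e1 = 1.0, 1 / 679      # errors per position at 1 s/move
    t2, e2 = 60.0, 1 / 3478    # errors per position at 1 min/move
    a = math.log(e1 / e2) / math.log(t2 / t1)   # ~0.40
    c = e1 * t1 ** a
    print(c * (60 * 3600) ** (-a))   # 60 h/move: ~1.1e-5 errors per position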

playerafar

@MARattigan
Was going back over posts ...
Saw this from you in post #1907.  Regarding castling rights:

"This would probably add about 12000 new tablebases to the 3000 or so existing tablebases for 3-7 men"

That looks disproportionate.
For castling to be theoretically possible, at least one king would have to be on its home square, combined with at least one of its rooks on that rook's home square.
But those two factors in combination would be present in far under 1% of 7-piece and smaller positions.
So the leap from 3,000 to 15,000 tablebases, because of one factor present in less than 1% of positions, doesn't look valid to me.

haiaku
tygxc wrote:

As I wrote before, but you probably did not read:
https://arxiv.org/pdf/2009.04374.pdf
figure 3 page 7 [ . . . ]

You mean figure 2, page 7? But where do you read that those are error rates? They are the results of self-play games. Why do you infer the error rate from the won or lost games? Errors might have occurred in all games, independently of the result. Besides, you are implicitly assuming that the game-theoretic value is a draw; if the game were a win for White or for Black, then following your reasoning we should think that at least one error occurred in all the drawn games.