Chess will never be solved, here's why

tygxc

#2499
“Can you post a game where SF14 plays from one of your tabiya to the 7 man tablebases using only ply count 0 positions, please?”
++ 99% of ICCF WC draws are ideal games with optimal moves. None of those games reach 50 moves without a capture or a pawn move. The rules of the game are such that optimal play compels a trade or a pawn move within 50 moves, except when the men are sparse, as in some endgame tablebase positions like KNN vs. KP.

“If you look at the comment I inserted at the bottom of the first example in #2485, the roundish thing after "Ply count =" was meant to represent 0.”
++ That position counts 5 men, which is well within the strongly solved 7-men endgame table base and far from the uncharted middle game territory of 8 – 26 men.
The position probably can result from a reasonable game with > 50% accuracy e.g. from an endgame KNNP vs. KBBP where the stronger side sacrificed both bishops to queen its pawn.
As the position is a forced win, it cannot result from an ideal game with optimal moves.
The position probably cannot even result from a reasonable game with > 50% accuracy.
How long did you run your desktop for depth 29? Your desktop is 1000 times slower than a cloud engine of 10^9 nodes/s. Multiplying the time by 60 gives 5.6 times less error. For the same time your desktop errs 5.6 * log(1000) / log(60) ≈ 9.4 times more than the cloud engine.
SF14 apparently misjudges knight pairs. KNN vs. K is a draw, but it is +6.
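The 9.4 figure above is just arithmetic on the poster's own assumptions (the 5.6 factor and the log-ratio error model are his, not established results); a one-line check of the arithmetic only:

```python
import math

# Poster's model: for equal wall-clock time, a machine 1000x slower
# errs 5.6 * log(1000) / log(60) times more. Verify the arithmetic only.
ratio = 5.6 * math.log10(1000) / math.log10(60)
print(round(ratio, 1))  # 9.4
```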

“Here's another one for you”
++ Yes, this is a better example. It has 7 men, the boundary between the uncharted middle game of 8 – 26 men and the strongly solved 7-men endgame table base.
The position probably can result from a reasonable game with > 50% accuracy.
As the position is a forced win, it cannot result from an ideal game with optimal moves.
How long did you run your desktop for depth 33? Your desktop is 1000 times slower than a cloud engine of 10^9 nodes/s. Multiplying the time by 60 gives 5.6 times less error. For the same time your desktop errs 5.6 * log(1000) / log(60) ≈ 9.4 times more than the cloud engine.
SF14 apparently misjudges knight pairs. KNN vs. K is a draw, but it is +6.

The most representative examples are drawn KRPP vs. KRP. That counts 7 men, at the boundary between the strongly solved 7-men endgames and the uncharted 8 – 26 men middle games. Rook endings are the endgames that occur most often. If the position is a draw, then it can result from an ideal game with optimal moves. The side a pawn down must play accurately to avoid losing.
How to draw is not obvious like in say KNNP vs. KNN: sacrifice a knight for the pawn.


tygxc

#2382
"After spending 60 hours you add to the search only 4 candidate moves for White and one child for each of them (so you said). So after five years, how many nodes will have been searched?"
++ I already answered that, but you think my answer was wrong.
If you know the answer, then why do you ask? Are you a teacher?
I now give you 2 wrong answers.

Wrong answer 1:
If we branch b = 4 for m = 39 moves deep, then you could think we need
1 + b + b² + b³ + ... + b^39 = (b^40 - 1) / (b - 1) = (4^40 - 1) / 3 ≈ 10^23 nodes.
This is wrong for 2 reasons:
1) It is clearly too high, as there do not exist 10^23 legal, sensible, reachable, and relevant positions.
2) Chess has many transpositions
e.g. 1 e4 e5 2 Nf3 Nc6 3 Bc4 = 1 e4 Nc6 2 Nf3 e5 3 Bc4 = 1 e4 e5 2 Bc4 Nc6 3 Nf3

Wrong answer 2:
If, to account for transpositions, we assume all white moves can be permuted, then we get
1 + b + b²/2! + b³/3! + ... + b^m/m! ≈ exp(b) = exp(4) ≈ 55
This is wrong for 2 reasons:
1) 55 is clearly too low
2) Not all moves can be permuted. E.g. in the above example e4 must precede Bc4.

Two wrongs make one right.
The number of nodes you ask for lies between 55 and 10^23.
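Both bounds can be reproduced with a few lines of Python (a sketch of the arithmetic above; b = 4 and m = 39 are the figures from the post):

```python
from math import factorial

b, m = 4, 39  # branching factor and depth used in the post

# Wrong answer 1: full b-ary tree, no transpositions
upper = (b ** (m + 1) - 1) // (b - 1)

# Wrong answer 2: all moves freely permutable, giving the truncated
# exponential series 1 + b + b^2/2! + ... + b^m/m!
lower = sum(b ** k / factorial(k) for k in range(m + 1))

print(f"upper bound ~ 10^{len(str(upper)) - 1}")  # about 10^23
print(round(lower))                               # 55, i.e. exp(4)
```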

tygxc

#2496
"You posted this link several times http://library.msri.org/books/Book29/files/schaeffer.pdf.
It's an MSRI publication on the subject “Checkers can be solved” (written before a solution was claimed). You could try them."

++ Yes, they might accept a paper. That paper was kind of a progress report, written after years of work by Schaeffer. I feel no urge to publish: I no longer work at the university, and I never worked in the field of game theory.

MARattigan

@tygxc

#2501

We all know you're superwhizz at wrong answers.

We were waiting for a right answer.

(Hint: If 1 man takes 60 hours to dig 1 trench, how many trenches can 3 men dig in 5 years? As in all these primary-school questions, you can assume they don't need to sleep, eat, etc.)

tygxc

#2504
No, that is not right.
As we know from how checkers was solved each node in the proof tree required as many positions as there are nodes in the proof tree.
During 60 hours many positions are visited, not just one.

haiaku
tygxc wrote:

#2382
"After spending 60 hours to add to the search only 4 candidate moves for white and one child for each of them (so you said). So after five years how many nodes will have been searched?"
++ I already answered that, but you think my answer was wrong. [ . . . ]


tygxc wrote:

10^18 relevant positions (each pawn move and each capture renders huge numbers of positions irrelevant), reduced to 10^17 positions by tackling only the necessary openings, take 10^8 seconds on 1 cloud engine of 10^9 nodes/s to weakly solve chess.
GM Sveshnikov was right: 3 cloud engines can weakly solve chess in 5 years.
Exhausting all 10^17 relevant positions is proof.

So

tygxc wrote:

"b) If 60 hours are needed to produce the 4 white nodes and their 4 child black nodes (one for each white node), how many hours are needed to produce the entire tree?"
++ about 5 years

is wrong, unless you think you can search fewer than 10¹⁷ nodes at the maximum. You don't know exactly how many nodes will be searched, fine, but at 60 hours to add only 4 candidate moves for White and one child for each of them, what's the greatest number of nodes that can be added to the search tree in 5 years? I think it's better if you give the answer yourself.

tygxc wrote:

As we know from how checkers was solved each node in the proof tree required as many positions as there are nodes in the proof tree.
During 60 hours many positions are visited, not just one.

The proof tree is one thing, the search tree is another. The proof tree is determined after the search, not before. After 60 hours you have added 8 nodes (4 nodes for White and one reply each for Black) to the search tree.

MARattigan
tygxc wrote:

#2504
No, that is not right.
As we know from how checkers was solved each node in the proof tree required as many positions as there are nodes in the proof tree.
During 60 hours many positions are visited, not just one.

What is the total number of moves you will have made in five years at 60 hours per move? (You're not saving the moves that SF14 considers internally in its evaluations.)

Not very hard to work out. If you can't do it, just say, and I'll give you the answer.

After that we can look at the number of moves that will need to appear in your solution and the number of moves you will need to make in arriving at your solution.

tygxc

#2506
"Exhausting all 10^17 relevant positions is proof"
++ 10^9 positions/s/engine * 3 engines * 3600 s/h * 24 h/d * 365.25 d/a * 5 a = 4.7 * 10^17 positions
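The product can be checked directly (plain arithmetic with the post's figures; whether 10^9 positions/s per engine is attainable is a separate question):

```python
# 3 engines at 10^9 positions/s each, running for 5 years
positions = 3 * 1e9 * 3600 * 24 * 365.25 * 5
print(f"{positions:.2e}")  # 4.73e+17
```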

#2507
"You're not saving the moves that SF14 considers internally in its evaluations"
++ Of course an engine does not start from scratch each move, but stores some variations.
It does not store everything as it prunes.

By the way, my preferred embodiment would not be to grow the whole tree from its 26-men root towards its top, but rather to start from the trunk, e.g. a drawn ICCF WC game, and then prove all white moves optimal from the top down to the root,
first adding a 2nd branch to each node in the 1st verification pass,
then adding a 3rd branch to each node in a 2nd verification pass,
and finally adding a 4th branch to each node in the 3rd verification pass.

haiaku

That does not save you from determining which those candidates are, in the first place. You cannot know which is the best candidate without knowing the other 3: you must have ordered them beforehand. So, 60 hours for White's 4 candidates...

MARattigan

@tygxc

#2509

Does all that mean you found the sum I set too hard?

tygxc

#2509
"That does not save you from determining which those candidates are, in the first place."
Yes, that is right: 60 hours for 1 tree node with 4 candidate moves for White.
3 engines * 5 a/engine * 365.25 d/a * 24 h/d / 60 h/tree node = 2191 tree nodes.
Each tree node looking at 4.7*10^17 / 2191 = 2*10^14 positions.
Looks like few tree nodes and many positions per node.
In solving checkers, tree nodes = positions per tree node.
Maybe I should shorten the time per move.
I extrapolated 60 h/move from AlphaZero. AlphaZero probably did not run at 10^9 nodes/s; I have to check whether it is in the paper.
I also thought of ICCF WC: they play at 5 days/move, i.e. 2 * 120 h = 240 h/move. They probably do not rent cloud engines of 10^9 nodes/s but run some multicore engines in their basement.
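The node budget follows from the same figures (a sketch; the 60 h per node and the 4.7·10^17 total are the poster's assumptions):

```python
engine_hours = 3 * 5 * 365.25 * 24      # 3 engines running for 5 years
tree_nodes = engine_hours / 60          # at 60 hours per tree node
per_node = 4.7e17 / tree_nodes          # positions examined per tree node

print(int(tree_nodes))       # 2191
print(f"{per_node:.1e}")     # 2.1e+14
```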

haiaku
tygxc wrote:

#2509
"That does not save you from determining which those candidates are, in the first place."
Yes, that is right: 60 hours for 1 tree node with 4 candidate moves for White.
1) 3 engines * 5 a/engine * 365.25 d/a * 24 h/d / 60 h/tree node = 2191 tree nodes.
2) Each tree node looking at 4.7*10^17 / 2191 = 2*10^14 positions.
3) Looks like few tree nodes and many positions per node.
In solving checkers, tree nodes = positions per tree node.
Maybe I should shorten the time per move.
4) I extrapolated 60 h/move from AlphaZero. AlphaZero probably did not run at 10^9 nodes/s; I have to check whether it is in the paper.
I also thought of ICCF WC: they play at 5 days/move, i.e. 2 * 120 h = 240 h/move. They probably do not rent cloud engines of 10^9 nodes/s but run some multicore engines in their basement.

1) Finally! Let's say they are some thousands.

2) Whaaat? "Each node looking at [ . . . ]": when should that "looking" be performed? If the engine visits 2*10¹⁴ nodes during the search for those 4 candidate moves, that doesn't mean it has solved the node: it visits those 2*10¹⁴ nodes to find the order of the 4 best children and prune the others; then, if the node is not solved, the search must go on, expanding each of those children in 60 more hours. So in 5 years the search tree would contain only those few thousand mostly unsolved nodes.

3) Positions at each ply are nodes in the search tree. The proof tree is obviously smaller than the search tree. Forget about the proof tree; you have to perform the search first.

4) Of course A0 does not run at that speed. In the match against SF it ran about 1000 times slower than SF.

Antichess has been solved by visiting 10¹⁶ nodes and you dream of doing the same for chess, but antichess is a capture-compulsory variant where:

"Approximately 87% of the internal nodes with black-to-move have only one legal move, which is perhaps typical for a game with long sequences of forced captures." ¹

It's an impossible parallel. It's probably necessary to visit far more than 10¹⁶ nodes to weakly solve chess, especially if the game value is a draw.

¹ http://magma.maths.usyd.edu.au/~watkins/LOSING_CHESS/ICGA2016.pdf

MARattigan
tygxc wrote:

#2509
"That does not save you from determining which those candidates are, in the first place."
Yes, that is right: 60 hours for 1 tree node with 4 candidate moves for White.
3 engines * 5 a/engine * 365.25 d/a * 24 h/d / 60 h/tree node = 2191 tree nodes.
Each tree node looking at 4.7*10^17 / 2191 = 2*10^14 positions.
Looks like few tree nodes and many positions per node.
In solving checkers, tree nodes = positions per tree node.
Maybe I should shorten the time per move.
I extrapolated 60 h/move from AlphaZero. AlphaZero probably did not run at 10^9 nodes/s; I have to check whether it is in the paper.
I also thought of ICCF WC: they play at 5 days/move, i.e. 2 * 120 h = 240 h/move. They probably do not rent cloud engines of 10^9 nodes/s but run some multicore engines in their basement.

You did eventually reach 2191 (V.G.). That would be a quarter of the number of moves that could potentially be included in your solution (with a tiny probability for each, I would guess, but I'll ignore that for the moment).

In 60 hours you extract only 4 moves. So the number of moves your program considers after 5 years would be 8764. (The number of moves SF14 would consider is, of course, much greater, but they don't get included in your solution.)

What would you estimate to be the smallest possible number of moves in what you regard as a complete weak solution from the starting position constructed by your proposed method ? 

(Hint: Schaeffer estimates the number of positions in his solution from the 24 man start of checkers to the 10 man tablebases to be 10^14. You might expect a few more in chess from the 32 man start to the 7 man tablebases.)
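The 8764 above follows directly: 4 moves extracted per 60-hour node times the 2191 nodes computed earlier (arithmetic only; both inputs are figures from the thread):

```python
tree_nodes = 2191        # 60-hour tree nodes available in 5 years (from above)
moves_per_node = 4       # candidate moves extracted per node
print(tree_nodes * moves_per_node)  # 8764
```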

4KW

Welcome!!!

playerafar


And there it was again - unable to hide in those posts .
The Immortal:
' 10^9 Cloud computers '
Will forever be pushed.

But @Elroch had posted something constructive about defining what a node is.
But to describe the relevant math right - it has to connect to the actual practicality and impracticality.
The actual number of computer Operations needed to do whatever with a position.
And 'consider' has to be properly defined.

If somebody knows what's really going on with that - it should be describable with no hiding behind 'nodes per second' and 'consider'.
Should represent directly - how many Ops needed to even find out what moves are available in a position.
And how that number of ops needed increases according to how many pieces are on the board and what pieces they are.
And how many ops needed to even 'examine' calculations just two ply deep.  
But there's going to be hiding behind cosmetic terms.

Elroch
MARattigan wrote:

[snip]

(Hint: Schaeffer estimates the number of positions in his solution from the 24 man start of checkers to the 10 man tablebases to be 10^14. You might expect a few more in chess from the 32 man start to the 7 man tablebases.)

"a few more".

MARattigan
tygxc wrote:

#2499
“Can you post a game where SF14 plays from one of your tabiya to the 7 man tablebases using only ply count 0 positions, please?”
++ 99% of ICCF WC draws are ideal games with optimal moves. None of those games reach 50 moves without a capture or a pawn move.

Do you not mean 0%?

SF14 didn't manage any ideal games from the much simpler starting position here. The moves in the ICCF games are probably all by either SF14 or something that would usually lose to SF14.

You don't say what percentage of the ICCF games consisted solely of ply count 0 positions.

The rules of the game are such that optimal play compels a trade or a pawn move within 50 moves, except when the men are sparse, as in some endgame tablebase positions like KNN vs. KP.

Which specific rules would those be?

If you consider positions that are closely matched in material, specifically within one pawn notional value, there are no winning positions under basic rules that are not also won under competition rules until there are at least 6 men on the board.

With 6 men on the board 0.86% of such basic rules winning positions cannot be won under competition rules.

With 7 men on the board 1.76% of such basic rules winning positions cannot be won under competition rules. 

With your unique gift you can extrapolate from those two data points to prove the figure is 0% with 8 or more men on the board.

For my part I wouldn't be so bold, but I would be surprised if the figure doesn't increase with extra men, at least for 8 men and, I would guess, as long as Haworth's law holds good.

“If you look at the comment I inserted at the bottom of the first example in #2485, the roundish thing after "Ply count =" was meant to represent 0.”
++ That position counts 5 men, which is well within the strongly solved 7-men endgame table base and far from the uncharted middle game territory of 8 – 26 men.

I have to choose positions that are tablebased, otherwise I can't prove the SF14 choices are errors. Why is the number of men significant? SF14 does worse with more men. (It's sh*t hot with K+R vs. K or any three-man position.)
The position probably can result from a reasonable game with > 50% accuracy e.g. from an endgame KNNP vs. KBBP where the stronger side sacrificed both bishops to queen its pawn. As the position is a forced win, it cannot result from an ideal game with optimal moves. The position probably cannot even result from a reasonable game with > 50% accuracy.

If you say so. I'll believe you when you've successfully completed my challenge in this post. Totally irrelevant anyway.

How long did you run your desktop for depth 29? Your desktop is 1000 times slower than a cloud engine of 10^9 nodes/s. Multiplying the time by 60 gives 5.6 times less error. For the same time your desktop errs 5.6 * log(1000) / log(60) ≈ 9.4 times more than the cloud engine.
SF14 apparently misjudges knight pairs. KNN vs. K is a draw, but it is +6.

I didn't time it. I just ran each till nothing was changing. With a logarithmic increase in search depth with time I didn't fancy staring at an unchanging screen indefinitely. 

All but the first of my examples are drawn under the competition rules for which SF14 is designed so there would be no reason for it to change its mind.

By all means try it on your own desktop, or your supercomputer when you get it (but try to get it right this time).

“Here's another one for you”
++ Yes, this is a better example. It has 7 men, the boundary between the uncharted middle game of 8 – 26 men and the strongly solved 7-men endgame table base.
The position probably can result from a reasonable game with > 50% accuracy.
As the position is a forced win, it cannot result from an ideal game with optimal moves.

I'll believe that too when you complete the challenge referred to above.
How long did you run your desktop for depth 33? Your desktop is 1000 times slower than a cloud engine of 10^9 nodes/s. Multiplying the time by 60 gives 5.6 times less error. For the same time your desktop errs 5.6 * log(1000) / log(60) ≈ 9.4 times more than the cloud engine.
SF14 apparently misjudges knight pairs. KNN vs. K is a draw, but it is +6.

See above. 

With the possible exception of the first, SF14 is not misjudging knight pairs in the examples I gave (though none of the SFs can do any mates in 50 in KNNKP from ply count 0). 

It's "misjudging" them under your new rules, but that's mainly because it's not designed for your new rules.

The most representative examples are drawn KRPP vs. KRP. That counts 7 men, at the boundary between the strongly solved 7-men endgames and the uncharted 8 – 26 men middle games. Rook endings are the endgames that occur most often. If the position is a draw, then it can result from an ideal game with optimal moves. The side a pawn down must play accurately to avoid losing.

Representative of what? Games at your local chess club or perfectly played games?

If the starting position is a win for one side, as seems likely, perfect play could easily be a game of millions of moves. You probably don't see many of those at your local chess club either.

I gave you a KRPP vs. KRP position anyway.
How to draw is not obvious like in say KNNP vs. KNN: sacrifice a knight for the pawn.

So what would be the obvious moves to do that in this position I posted earlier?

Black to play and mate in 60


And what's the relevance to solving chess anyway?


Elroch

@MARattigan, do you think minimax pathology can be avoided using a probabilistic formulation?

As someone who has spent a lot of time on machine learning / AI, it is far more natural to me to think of an evaluation function as the expected score of a game. This is conducive to learning in a reinforcement learning paradigm where the reward is the score at the end of a game (it is also possible to complicate this slightly when the probabilities are of all three possible results and the learned model can apply a "contempt" parameter to play as if draws are worth more or less than 0.5).

In this model there is certainly still a risk of misleading results if the probabilities are viewed as definitive indications of better or worse moves rather than merely being indicative of whether it is likely that one move is better.

Comparing two moves is much simpler in this paradigm. For example, suppose there are two moves, one with an evaluation of 0.75 and the other with an evaluation of 0.25. Pretend for simplicity that draws don't exist. Then the first evaluation says there is a 1 in 4 chance the move leads to a loss, while the second says there is a 1 in 4 chance the move leads to a win. It is possible the second move is better. Unfortunately, it is not clear whether the uncertainties are independent in general. To deal with interactions you would need to have 4 probabilities instead of 2 (the probabilities of all combinations of results from the two moves). This increases in complexity exponentially with the number of moves.
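Under the independence assumption (a strong one, as noted), the chance that the nominally worse move is actually better is easy to compute; a toy sketch with the 0.75/0.25 figures from the text, ignoring draws:

```python
p_first_wins = 0.75    # evaluation of the first move, read as a win probability
p_second_wins = 0.25   # evaluation of the second move

# Assuming the two outcomes are independent and draws don't exist,
# the second move is strictly better exactly when the first move
# loses and the second move wins.
p_second_better = (1 - p_first_wins) * p_second_wins
print(p_second_better)  # 0.0625
```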

One thing that seems important to me is to allow for the fact that the probabilities from less explored branches don't have the same significance as those from well-explored branches. For example, for one move you might just have a bare evaluation of 0.75, and for another you might have a tree with a thousand nodes which leads to a minimax value of 0.8. Does that make the second move better or does the random noise in the model outweigh the slight numerical superiority?  I really should know how to systematically deal with that!

playerafar

Related to that: the actual ratio of winning/won positions to drawn positions in the Nalimov tablebases.

It's probably published. If it's not, it should be.

MARattigan
playerafar wrote:

Related to that: the actual ratio of winning/won positions to drawn positions in the Nalimov tablebases.

It's probably published. If it's not, it should be.

Mostly available on the syzygy-tables.info site.

The 6+1 and 5+1 DTM figures are absent. I don't think those tablebases have been generated; they're also absent from the ICGA stats.

For longest/average mate lengths you need to use the ICGA stats (https://icga.org/icga/games/chess/endgames.php "download the spreadsheet").

Here is a basic rules graph of what you're asking about for positions with a material difference of 1 pawn or less.

The four points on the left are Nalimov values for 4 to 7 men. (I omitted 3 men because that includes only the endgame KPK and is misaligned.)

The value at 32 men is chosen to maximise the adjusted R² value for the 5 points.

I chose an exponential fit because the curve must flatten out if it's not to give a negative percentage of draws beyond about 10 men.

The value at 32 men must obviously be taken with a large pinch of salt, but in the absence of other data would be my best guess.