Chess will never be solved, here's why

Elroch

First, I acknowledge that the actual results of games between imperfect players from given positions don't tell you what the tablebase value is. But I conjecture that for a lot of the sorts of positions where chess experience is most relevant, they are likely to be some sort of indication. People lose dead-lost openings a lot. Top, evenly matched players probably draw theoretically drawn positions a lot. Not proof, but it seems most likely (in the same way that the opening position being a draw is a reasonable conjecture based on White's 54% score, or whatever it is, in master play).

Leela has learnt to estimate the expected result in positions reached in games, and an informal observation is that a one-pawn advantage according to Stockfish corresponds to somewhere around a 70% expectation.

A random example is not so far from that:

[chess diagram]
Stockfish evaluation: -1.23 pawns (depth 32 ply)
Leela Zero evaluation: 24.3% (3 million nodes), i.e. 75.7% for Black
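That informal pawn-to-expectation mapping can be sketched with a toy curve. To be clear, the logistic shape and the scale constant below are assumptions chosen only so that a one-pawn advantage maps to about a 70% expected score; no engine publishes this exact formula.

```python
import math

# Illustrative scale constant (centipawns), chosen so that a +100 cp
# (one pawn) evaluation maps to a 70% expected score. An assumption
# for this sketch only, not any engine's published formula.
K = 100.0 / math.log(7.0 / 3.0)  # about 118 cp

def expected_score(cp):
    """Map a centipawn evaluation to an expected score in (0, 1)
    with a logistic curve."""
    return 1.0 / (1.0 + math.exp(-cp / K))

print(round(expected_score(100), 3))   # one pawn up for the mover
print(round(expected_score(-123), 3))  # the Stockfish eval above
```

Plugging in the -1.23 evaluation above gives the mover roughly a quarter of a point, in the same ballpark as the Leela number.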

 

Elroch

[This post replied to a now deleted post]

I am not and I acknowledge that you are right to question this. It's based on a loose notion of randomness of the errors producing results. This notion can be quantified.

Relative to a tablebase, you can think of there being only 3 classes of error, all of which change the value. There are of course no errors for a losing player, one class of error for a player in a drawing position, and two classes of errors for a player in a winning position.

With some reasonable assumptions about the statistics of these errors, you will find it implausible that a class of losing positions usually ends up as draws. (The statistics needed to achieve this would be that errors turning a win into a draw are much more common than those turning a draw into a loss. If the frequencies of different types of errors are comparable, you would find that if you have enough errors to turn most losses into draws, you would have quite a few wins for the other side too).

Of course, this also provides a loose argument that the large number of draws in top level games does indeed mean the value of chess is a draw. Elsewhere you will find me pointing out that this argument leaves uncertainty, so it doesn't tell us the value for sure. I am acknowledging this uncertainty here as well.

playerafar
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

[chess diagram]

How could Stockfish assign an advantage of +9 there?
Chess diagrams don't copy - but in the quoted post the White king simply keeps Black's king immobile - which paralyzes Black's material advantage of two rooks and a bishop.

The strongest engines are stronger than top GMs these days.
But for solving projects and chess research, would engines be used that make that kind of mistake - failing to weigh position against material?

If engines are still that primitive (hard to believe) - and that persists - then indeed chess will never be solved on this planet.
If aliens who have a 'solution' come from this galaxy or from another (or from another Big Bang) ... then what could programmers' reaction to the aliens' algorithm be?
What could they offer the alien?
Alien: "Don't worry. You've got more troubles with global warming and too many nukes and other problems. If we offered you the chess algorithms - that's not going to motivate you to do anything about those other things."

So - another thing coming out of the discussion -
that Stockfish could be so inaccurate ...
@Elroch's contribution to be added to that of @MARattigan's about the analysis button indicating a win where White can draw.

Elroch

The reason Stockfish is not good at evaluating anomalous positions like that is that no-one has prioritised it in Stockfish development. Playing better in any position that could arise in a game where the result is not yet clear is worth investing time in for the main purpose: being a strong player.

Elroch

Regarding probabilistic evaluation and the exciting feature of Leela that probabilities of all 3 results can be predicted (and how that changes the view of a game as it progresses), here is a very interesting blog post:

https://lczero.org/blog/2020/04/wdl-head/
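The WDL head described in that post outputs three probabilities that sum to 1; collapsing them back to the single number a traditional evaluation reports is then just the scoring rule of chess. A minimal sketch (the function name and example probabilities are mine, not from the Leela codebase):

```python
def wdl_to_expected_score(w, d, l):
    """Collapse (win, draw, loss) probabilities into a single expected
    score, counting a draw as half a win per the scoring rules."""
    assert abs(w + d + l - 1.0) < 1e-9, "WDL must sum to 1"
    return w + 0.5 * d

# Two positions with equal expected score but different character;
# the three-way split is exactly the information one number hides.
print(round(wdl_to_expected_score(0.20, 0.60, 0.20), 3))  # quiet, drawish
print(round(wdl_to_expected_score(0.45, 0.10, 0.45), 3))  # double-edged
```

Both example positions collapse to the same expected score, which is why exposing the full WDL split changes how a game in progress reads.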

playerafar
Elroch wrote:

Regarding probabilistic evaluation and the exciting development in Leela where probabilities of all 3 results can now be predicted (and how that changes the view of a game as it progresses), here is a very interesting blog post:

https://lczero.org/blog/2020/04/wdl-head/

I clicked on that. Read some of it.
'Probability' versus material imbalance ...
I would say today's chess computers do factor some positional considerations into their evals.
The idea of an advantage of +1 corresponding to one pawn up (or the material equivalent) is basically flawed right away? Probably.
But it's apparently kept because people 'can relate to that.'

Noting in the article that AlphaZero used 'probability'.
There was quite a stir when AlphaZero first came out.

I think people can relate to odds and probability -
but many chess players may not like that.
And many players prefer to see chess as removed from luck and money odds.
But for chess research projects - why not use probability?
In the current eval schemes - like on Stockfish -
what would be the difference between +30 and +50?

pfren
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

[chess diagram]

This is after half a second.

[engine analysis screenshot]

Stockfish is quite slower- it needs 1.3 seconds.

[engine analysis screenshot]

This means that your 99-ply Stockfish is most probably either drunk, or extremely poorly configured.

Elroch

The reason evaluations on traditional engines are in centipawns is that they started with the standard piece values and then added a load of fudge factors for positional features. It's a horrible mess, tolerated because it was possible and there was no obvious alternative that was practical.

The reason evaluations on neural networks are in probabilities is because the rules of chess say the aim is to win and that a draw is of intermediate value (usually half a win).  It's just the obvious way to do it, and closely related to what is done in the most common and best developed branches of machine learning.

playerafar

When AlphaZero first came out - it crushed!
Apparently not only building on probability instead of material values, but also organizing its calculations better.
That's one of the marks of a better player:
knowing which calculations to do early.
To do that right, observation needs to be distinguished from calculation.

playerafar

Looks like a postaround there.
That's one of the good things about @tygxc - and the people responding to him or otherwise posting - they are neither deterred nor baited.

tygxc

#2776

"Relative to a tablebase, you can think of there being only 3 classes of error, all of which change the value. There are of course no errors for a losing player, one class of error for a player in a drawing position, and two classes of errors for a player in a winning position."
++ Yes, that is all correct

"With some reasonable assumptions about the statistics of these errors, you will find it implausible that a class of losing positions usually ends up as draws. (The statistics needed to achieve this would be that errors turning a win into a draw are much more common than those turning a draw into a loss."
++ Yes, indeed, that is right.
Even more: when the probability of error is low, say 1 error in 100 games as in ICCF WC draws, then a slight difference in the error rates - say 0.65 errors per 100 games from draw to loss and 0.35 errors per 100 games from won to draw - cannot explain the observed data.
Also, as the initial position is a draw, an error from won to draw requires a previous error from draw to loss. So errors from draw to won can occur alone, or together with an error from won to draw or a blunder from won to lost.
My take is that error rate from draw to won = error rate from won to draw,
and blunder rate from won to lost = (error rate from draw to lost)².
If the error rate is low, like 1 error in 100 games as in ICCF WC draws, then this holds in itself. Even if it is not entirely right, the result is still correct if the error rate is low enough.

"If the frequencies of different types of errors are comparable, you would find that if you have enough errors to turn most losses into draws, you would have quite a few wins for the other side too)."
++ Yes, that is right.
That is what I based my calculation of the error rate E from the decisive rate D on.
E = sqrt(1 + 1/(2D)²) - 1/(2D)
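For what it's worth, the quoted relation is easy to tabulate. The formula is tygxc's own, not an established result; the sketch below just evaluates it and notes the limiting behaviour that for small decisive rates D the implied error rate E approaches D itself.

```python
import math

def error_rate(decisive_rate):
    """Evaluate tygxc's relation E = sqrt(1 + 1/(2D)^2) - 1/(2D),
    the per-game error rate E he infers from a decisive-game rate D."""
    x = 1.0 / (2.0 * decisive_rate)
    return math.sqrt(1.0 + x * x) - x

for d in (0.50, 0.10, 0.01):
    print(f"D = {d:.2f} -> E = {error_rate(d):.4f}")
# For small D, sqrt(1 + x^2) - x ~= 1/(2x) = D, so E approaches D.
```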

"Of course, this also provides a loose argument that the large number of draws in top level games does indeed mean the value of chess is a draw."
++ This is indeed one of the arguments for the game-theoretic value of chess being a draw.
It is not only the large number of draws, but also that the draw rate rises each year in ICCF WC games. ICCF games allow tablebase win claims that exceed 50 moves, so ICCF chess is a more decisive game than standard chess.
Another argument is that the draw rate goes up with more time/move in AlphaZero autoplay to approach all draws. Even if the rules are changed to stalemate = win, the same holds true for the resulting more decisive game.
The results cannot be attributed to random fluctuations, as calculations for 4 unsound openings (Chigorin Defence, Dutch Defence, Alekhine Defence, King's Gambit) show.
Another argument is that TCEC approaches all draws, so they had to impose slightly unbalanced openings to avoid all draws. Even with the slightly unbalanced openings the draw rate is still high. This is reminiscent of checkers, where in human competition they imposed openings considered slightly unbalanced to avoid all draws, long before checkers was proven a draw. So checkers was known to be a draw before it was proven.

haiaku
tygxc wrote:

This is reminiscent of checkers, where in human competition they imposed openings considered slightly unbalanced to avoid all draws, long before checkers was proven a draw. So checkers was known to be a draw before it was proven.

Your unscrupulous use of the word "known" might be applied to your beloved antichess too, where 1. e3 was "known" to be a draw, and has been proven to be a win.

tygxc

#2787
"1. e3 was "known" to be a draw, and proven to be a win"
++ Losing Chess was not played or analysed as intensively as Checkers or Chess.
"many people (including chess grandmasters who had made some study of the game) told me that they had thought that 1. e3 would only draw, particularly against c5 or b6"
"my whole involvement in the project was a sort of “gamble” that White would win in the end,
as else there would be little chance of completion"
So many people thought it a draw,
but at least one person gambled that White wins - and he solved the game.
http://magma.maths.usyd.edu.au/~watkins/LOSING_CHESS/ICGA2016.pdf 

haiaku

Indeed, because what for others is "thought", for you is "known". Can you provide, as I asked before, papers on chess where they say "we know that chess is a draw" (without coupling "know" with "probably/likely", etc.)?

pfren

Even the solved Draughts-64 variant (International Draughts has not been solved) is still played regularly, and a World Cup will be held in Tashkent, Uzbekistan in a few weeks.

The reason is pretty clear: No human being can ever memorize the solution, not even a chunk which could guarantee a high draw percentage. Actually the percentage of draws in ICCF events is way higher.

tygxc

#2789
Fermat's Last Theorem and the Four Color Theorem were also known to be true before these were proven. 'Provability is a higher degree of truth' - that is a phrase I got from an article in Scientific American.
Fischer said "chess is a draw". Kramnik, Adorjan, Spassky, Capablanca, Lasker, Steinitz all said the same thing with more words. Adding a few weasel words 'probably', 'likely' does not make it more true. It is based on centuries of competitive chess, millions of games, and many lifetimes of analysis.
People who thought Losing Chess to be a draw did not spend nearly as much time and effort on Losing Chess as has been spent on Chess.

haiaku
tygxc wrote:

Fermat's Last Theorem and the Four Color Theorem were also known to be true before these were proven. 'Provability is a higher degree of truth' - that is a phrase I got from an article in Scientific American.

Then post the references you used for both those statements, so we can examine them.

tygxc wrote:

Fischer said "chess is a draw". Kramnik, Adorjan, Spassky, Capablanca, Lasker, Steinitz all said the same thing with more words.

It is called "anecdotal evidence". It may be good for a court of law, not for mathematics.

tygxc wrote:

Adding a few weasel words 'probably', 'likely' does not make it more true.

Indeed, it makes it less true.

MARattigan
pfren wrote:
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

[chess diagram]

This is after half a second.

[engine analysis screenshot]

Stockfish is quite slower- it needs 1.3 seconds.

[engine analysis screenshot]

This means that your 99-ply Stockfish is most probably either drunk, or extremely poorly configured.

Are you using SF8 there? @tygxc apparently plans to use SF14 which can give significantly different numbers.

Be that as it may, what does it give in this similar example?

White to play, pc=0

[chess diagram]

pfren
MARattigan wrote:
pfren wrote:
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

[chess diagram]

This is after half a second.

[engine analysis screenshot]

Stockfish is quite slower- it needs 1.3 seconds.

[engine analysis screenshot]

This means that your 99-ply Stockfish is most probably either drunk, or extremely poorly configured.

Are you using SF8 there? @tygxc apparently plans to use SF14 which can give significantly different numbers.

Be that as it may, what does it give in this similar example?

White to play, pc=0

[chess diagram]

I use a recent devel version from GitHub, much newer than SF 14.

Crystal is the latest build (from last December) and has NNUE enabled.

I do not expect any of these engines to "solve" the above any time soon, as no anti-fortress code can handle this - White's king has quite a few similarly "bad" choices, while in the first example White's moves are absolutely forced.

Elroch
pfren wrote:
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

[chess diagram]

This is after half a second.

[engine analysis screenshot]

Stockfish is quite slower- it needs 1.3 seconds.

[engine analysis screenshot]

This means that your 99-ply Stockfish is most probably either drunk, or extremely poorly configured.

I used the chess.com online analysis tool on maximum-depth analysis - try it yourself. chess.com configured it, not me. I was using the default configuration of Stockfish, which (surprisingly) is not NNUE - I had thought that was the only version in recent years.

It would be interesting to determine the reason for any discrepancy.

Here is a screen grab of an earlier stage of a similar use of the analysis tool. Note that the number -2.71 is a lot less than the -9.xx in the earlier analysis, which came after hours of running; that number only appeared late in the analysis. Don't ask me why.

[screenshot]

Chess.com's Stockfish 14.1 NNUE gives a lower evaluation initially, but again it might change later: