Chess will never be solved, here's why

Avatar of MARattigan
Elroch wrote:

Statistically, the claim that a win of a pawn wins a game is not only unsupportable; it is dubious whether the win of a pawn suffices to make the expectation of the game greater than 75% (which would be the expectation if the probability of a win was the same as the probability of a draw).

Strong evidence for this comes from Stockfish evaluations collated with results and with neural network expectations. My impression is that a 1 pawn advantage gives an expectation of about 0.7. Note that a 1 pawn advantage from Stockfish is what you have if you are 1 pawn up but it evaluates the positional factors to be balanced (positional factors adjust the material balance indicated in the evaluation).

Certainly this needs to be checked empirically in a more systematic way.

How would such a check tell you anything about perfect play?

If you try a statistical check on the results of KQKNN positions, whether played by SF14 with NNUE or by humans, it will tell you they're generally drawn under basic rules, but Nalimov says they're 98% won by the queen.

I think all you get from looking at practical play is information about practical play.
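
The 75% figure in the quoted post is plain arithmetic: scoring a win as 1, a draw as 0.5 and a loss as 0, a player who wins half the time and draws the other half expects 0.75. A minimal Python sketch follows. The function name `expected_score` and the 40%/60% split used for the 0.7 case are illustrative assumptions, not figures from the thread.

```python
def expected_score(p_win: float, p_draw: float, p_loss: float = 0.0) -> float:
    """Expected game score with win = 1, draw = 0.5, loss = 0."""
    total = p_win + p_draw + p_loss
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return p_win + 0.5 * p_draw


# Win and draw equally likely, no losses: the 75% case from the quoted post.
equal_win_draw = expected_score(0.5, 0.5)

# One hypothetical split that gives 0.7: 40% wins, 60% draws, no losses.
mostly_draws = expected_score(0.4, 0.6)
```

If losses are negligible, an expectation of about 0.7 can only be reached when draws outnumber wins.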

Avatar of Elroch
tygxc wrote:

#2766
"My impression is that a 1 pawn advantage gives an expectation of about 0.7"
Expectation is a notion of imperfect play, linked to probability of error.

No, probability of result. There is no guarantee that there is merely one error. If errors can occur, multiple errors surely can too, for either or both sides.
A position is either a draw, a win, or a loss.

There are no intermediates.

It's nice that some of your statements are true. The above is an example.

I feel you have entirely missed a crucial fact. Suppose it is the case that a position is evaluated +1 pawns by a super-duper computer (and let's assume one side is actually a pawn up). Then suppose that if you play two super-duper computers against each other in this position (or a set of such positions), the score is 70%, with most of the results being draws and almost all the rest being wins for the side with the extra pawn.

You claim the mixed results are due to inaccuracy. This is correct from a game theoretic point of view if there are mixed results for a single position. But you also claim the true result is a win. This is not only unjustifiable, it is also likely to be wrong a lot of the time. If the true result was a win, why does the winning side make so many blunders to give away the draw, while the side with a pawn less makes fewer cancelling blunders? 

You can be very confident that a lot of +1.00 positions are draws. The evidence is that it is most of them, but regardless of this the notion that all of them are wins is absurd.

Avatar of MARattigan
SylvesterPSmythe wrote:
...

No, not infinitely variable. That's why we have the 50 move rule.

We don't have the 50 move rule any more under basic rules, but even when we did it wasn't compulsory to claim.

We do have a limit on games governed by competition rules since 2017. There is an automatic draw if the ply count (counted from the last capture or pawn move) reaches 150, or if positions with the same player to move, pieces of the same kind and colour occupying the same squares, and the same possible moves for all the pieces of both players occur five times.
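
The two automatic-draw conditions can be sketched as a simple check. This is only an illustration under assumptions: it reads the "ply count" as plies since the last capture or pawn move (the 75-move rule), it expects the caller to track both counters, and the name `is_automatic_draw` is made up for this example. Detecting repeated positions, and the exception when the last move gives checkmate, are left out.

```python
def is_automatic_draw(plies_since_capture_or_pawn_move: int,
                      occurrences_of_current_position: int) -> bool:
    """Return True if either automatic-draw condition is met.

    Assumption: both counters are maintained elsewhere. A "position" here
    means same side to move, same pieces on the same squares and the same
    possible moves, as described in the post.
    """
    # 75 moves by each player without a capture or pawn move = 150 plies.
    seventy_five_moves = plies_since_capture_or_pawn_move >= 150
    fivefold_repetition = occurrences_of_current_position >= 5
    return seventy_five_moves or fivefold_repetition
```

Unlike the 50-move and threefold-repetition claims, neither condition requires a player to claim the draw.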

Avatar of MARattigan
Elroch wrote:
...

You can be very confident that a lot of +1.00 positions are draws. The evidence is that it is most of them, but regardless of this the notion that all of them are wins is absurd.

Evidence?

Avatar of Elroch

First, I acknowledge that the actual results of games between imperfect players from positions don't tell you what the tablebase value is. But I conjecture that for a lot of the sort of positions where chess experience is most relevant, they are likely to be some sort of indication. People lose dead lost openings a lot. Top evenly matched players probably draw theoretically drawn positions a lot. Not proof, but it seems most likely (just as the opening position being a draw is a reasonable conjecture based on white's 54% score in master play, or whatever it is).

Leela has learnt to estimate the expected result in positions reached in games and an informal observation is that a pawn according to Stockfish is somewhere around a 70% expectation.

A random example is not so far from that:

 
Stockfish evaluation: -1.23 pawns (depth 32 ply)
Leelazero evaluation: 24.3% (3 million nodes), i.e. 65.7% for black

 

Avatar of Elroch

[This post replied to a now deleted post]

I am not and I acknowledge that you are right to question this. It's based on a loose notion of randomness of the errors producing results. This notion can be quantified.

To a tablebase you can think of there only being 3 classes of error, all of which change the value. There are of course no errors for a losing player, one class of error for a player in a drawing position, and two classes of errors for a player in a winning position.  With some reasonable assumptions about the statistics of these errors, you will find it implausible that a class of losing positions usually end up as draws. (The statistics needed to achieve this would be that errors turning a win into a draw are much more common than those turning a draw into a loss. If the frequencies of different types of errors are comparable, you would find that if you have enough errors to turn most losses into draws, you would have quite a few wins for the other side too). Of course, this also provides a loose argument that the large number of draws in top level games does indeed mean the value of chess is a draw. Elsewhere you will find me pointing out that this argument leaves uncertainty so doesn't tell us the value for sure. I am acknowledging this uncertainty here as well.
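
The three error classes can be turned into a small probability model to experiment with. This is a sketch under assumptions that are not in the post: both sides err with the same fixed per-move probabilities, errors are independent of the position, and side A moves first. The function name `value_distribution` and the rates in the usage line are hypothetical.

```python
def value_distribution(plies, p_win_to_draw, p_draw_to_loss, p_win_to_loss,
                       start=(1.0, 0.0, 0.0)):
    """Probabilities (win, draw, loss) of the game-theoretic value for side A
    after `plies` half-moves, with A moving on even plies.

    The side to move can only damage its own value: win -> draw,
    draw -> loss, or win -> loss. A losing side has no errors available.
    """
    w, d, l = start
    for ply in range(plies):
        if ply % 2 == 0:
            # A to move: A's errors move probability towards "loss for A".
            w, d, l = (
                w * (1 - p_win_to_draw - p_win_to_loss),
                d * (1 - p_draw_to_loss) + w * p_win_to_draw,
                l + d * p_draw_to_loss + w * p_win_to_loss,
            )
        else:
            # B to move: a loss for A is a win for B, so B's errors help A.
            w, d, l = (
                w + d * p_draw_to_loss + l * p_win_to_loss,
                d * (1 - p_draw_to_loss) + l * p_win_to_draw,
                l * (1 - p_win_to_draw - p_win_to_loss),
            )
    return w, d, l


# Hypothetical rates: A starts with a theoretical win, 60 plies are played.
w, d, l = value_distribution(60, 0.03, 0.03, 0.001)
```

Comparing `d` with `l` for different rate choices is one way to explore the claim that enough errors to turn most wins into draws also produce a noticeable share of wins for the other side.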

Avatar of playerafar
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

 

 

How could Stockfish assign an advantage of +9 there?
Chess diagrams don't copy - but in the quoted post the white king simply keeps black's king immobile - which paralyzes black's material advantage of two rooks and a bishop.

The strongest engines are stronger than top GMs these days.
But for solving projects and chess research, would engines be used that make that kind of mistake of failing to factor in position versus material?

If engines are still that Primitive (hard to believe) - and that persists - then Indeed chess will never be solved on this planet.
If aliens come from this galaxy or from another (or from another Big bang) who have a 'solution' ...  then programmers' reaction to the aliens' algorithm:  could be what?
What could they offer the alien?
Alien:  "Don't worry.  You've got more troubles with global warmings and too many nukes and other problems.  If we offered you the chess algorithms - that's not going to motivate you to do anything about those other things."

So - another thing coming out of the discussion -
that Stockfish could be So Inaccurate ...
@Elroch's contribution can be added to @MARattigan's about the analysis button indicating a win where white can draw.

Avatar of Elroch

The reason Stockfish is not good at evaluating anomalous positions like that is that no-one has prioritised it in Stockfish development. Playing better in any position that could arise in a game without the result being clear is worth investing time in for the main purpose of being a strong player.

Avatar of Elroch

Regarding probabilistic evaluation and the exciting feature of Leela that probabilities of all 3 results can be predicted (and how that changes the view of a game as it progresses), here is a very interesting blog post:

https://lczero.org/blog/2020/04/wdl-head/

Avatar of playerafar
Elroch wrote:

Regarding probabilistic evaluation and the exciting development in Leela where probabilities of all 3 results can now be predicted (and how that changes the view of a game as it progresses), here is a very interesting blog post:

https://lczero.org/blog/2020/04/wdl-head/

I clicked on that.  Read some of it.
'Probability' versus material imbalance ...
I would say today's chess computers do factor in some position into their evals.
The idea of advantage of +1 corresponding to one pawn up or material equivalent is basically flawed right away?  Probably.
but it's apparently factored in because people 'can relate to that.'

Noting in the article that AlphaZero used 'probability'.
There was quite a stir when AlphaZero first came out.

I think people can relate to odds and probability -
but many chessplayers may not like that though.
And many players prefer to see chess as removed from luck and money odds.
But for chess research projects - why not use probability?
In the current eval schemes - like on Stockfish -
what would be the difference between +30 and +50?

Avatar of pfren
Elroch wrote:

Here's a legal position with 9.19 pawn advantage for black according to 99-ply Stockfish 14.1 analysis. White's drawing strategy is not difficult.

 

 

 

This is after half a second.

 

 

 

Stockfish is quite a bit slower: it needs 1.3 seconds.

 

 

 

This means that your 99-ply Stockfish is most probably either drunk, or extremely poorly configured.

Avatar of Elroch

The reason evaluations on traditional engines are in centipawns is because they started with the standard piece values then added on a load of fudge factors for positional factors. It's a horrible mess, adopted because it was possible and there was no obvious practical alternative.

The reason evaluations on neural networks are in probabilities is because the rules of chess say the aim is to win and that a draw is of intermediate value (usually half a win).  It's just the obvious way to do it, and closely related to what is done in the most common and best developed branches of machine learning.

Avatar of playerafar

When AlphaZero first came out - it crushed!
Apparently not only building on probability instead of material values but also organizing its calculations better.
That's one of the marks of a better player.
Which calculations to do early. 
To do that right - observation needs to be distinguished from calculation.

Avatar of Optimissed
playerafar wrote:
Elroch wrote:

Regarding probabilistic evaluation and the exciting development in Leela where probabilities of all 3 results can now be predicted (and how that changes the view of a game as it progresses), here is a very interesting blog post:

https://lczero.org/blog/2020/04/wdl-head/

I clicked on that.  Read some of it.
'Probability' versus material imbalance ...
I would say today's chess computers do factor in some position into their evals.
The idea of advantage of +1 corresponding to one pawn up or material equivalent is basically flawed right away?  Probably.
but it's apparently factored in because people 'can relate to that.'

Noting in the article that AlphaZero used 'probability'.
There was quite a stir when AlphaZero first came out.

I think people can relate to odds and probability -
but many chessplayers may not like that though.
And many players prefer to see chess as removed from luck and money odds.
But for chess research projects - why not use probability?
In the current eval schemes - like on Stockfish -
what would be the difference between +30 and +50?

I think icecream in Heaven is a metablast.
But is it ... let us think again.
How to evaluate it.
There was quite a stir when it was first made.
Everyone thinking "me next please" and writing
In black verse.


Avatar of playerafar

Looks like a postaround there.
That's one of the good things about @tygxc - and the people responding to him or otherwise posting - they are neither deterred nor baited.

Avatar of tygxc

#2776

"To a tablebase you can think of there only being 3 classes of error, all of which change the value. There are of course no errors for a losing player, one class of error for a player in a drawing position, and two classes of errors for a player in a winning position."
++ Yes, that is all correct

"With some reasonable assumptions about the statistics of these errors, you will find it implausible that a class of losing positions usually end up as draws. (The statistics needed to achieve this would be that errors turning a win into a draw are much more common than those turning a draw into a loss."
++ Yes, indeed, that is right.
Even more, when the probability of error is low, say 1 error in 100 games, like in ICCF WC draws, then a slight difference in the error rate, say 0.65 errors in 100 games from draw to loss and 0.35 errors in 100 games from won to draw, cannot explain the observed data.
Also as the initial position is a draw, an error from won to draw requires a previous error from draw to loss. So errors from draw to won can occur alone, or together with an error from won to draw or a blunder from won to lost.
My take is that error rate from draw to won = error rate from won to draw,
and blunder rate from won to lost = (error rate from draw to lost)².
If the error rate is low like 1 error in 100 games, like in ICCF WC draws, then this is right in itself. Even if it is not entirely right, then the result is correct if the error rate is low enough.

"If the frequencies of different types of errors are comparable, you would find that if you have enough errors to turn most losses into draws, you would have quite a few wins for the other side too)."
++ Yes, that is right.
That is what I based my calculation of the error rate E from the decisive rate D on.
E = Sqrt(1 + 1 / (2*D)²) - 1 / (2*D)
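
Squaring the formula shows that E is the positive root of E² + E/D − 1 = 0, which is the same as D = E / (1 − E²). That inverse relation is derived here from the formula as written, not quoted from the post. A short Python check follows; the 10% decisive rate is a hypothetical input, not a figure from the thread.

```python
import math


def error_rate(decisive_rate: float) -> float:
    """E = sqrt(1 + 1/(2D)^2) - 1/(2D), as written in the post (needs D > 0)."""
    half_inverse = 1.0 / (2.0 * decisive_rate)
    return math.sqrt(1.0 + half_inverse ** 2) - half_inverse


def decisive_rate_from_error(error: float) -> float:
    """Inverse relation D = E / (1 - E^2), obtained by rearranging the formula."""
    return error / (1.0 - error ** 2)


# Hypothetical input: 10% of games are decisive.
e = error_rate(0.10)
round_trip = decisive_rate_from_error(e)  # should recover the input D
```

For small D the two rates are nearly equal, since D = E + E³ + E⁵ + ... and the higher powers are tiny.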

"Of course, this also provides a loose argument that the large number of draws in top level games does indeed mean the value of chess is a draw."
++ This is indeed one of the arguments for the game-theoretic value of chess being a draw.
It is not only the large number of draws, but also that the draw rate rises each year in ICCF WC games. ICCF games allow table base win claims that exceed 50 moves, so ICCF is a more decisive game than standard chess.
Another argument is that the draw rate goes up with more time/move in AlphaZero autoplay to approach all draws. Even if the rules are changed to stalemate = win, the same holds true for the resulting more decisive game.
The results cannot be attributed to random fluctuations, as calculations for 4 unsound openings (Chigorin Defence, Dutch Defence, Alekhine Defence, King's Gambit) show.
Another argument is that TCEC approaches all draws, so they had to impose slightly unbalanced openings to avoid all draws. Even with the slightly unbalanced openings the draw rate is still high. This is reminiscent of checkers, where in human competition they imposed openings considered slightly unbalanced to avoid all draws, long before checkers was proven a draw. So checkers was known to be a draw before it was proven.

Avatar of haiaku
tygxc wrote:

This is reminiscent of checkers, where in human competition they imposed openings considered slightly unbalanced to avoid all draws, long before checkers was proven a draw. So checkers was known to be a draw before it was proven.

Your unscrupulous use of the word "known" might be applied to your beloved antichess too, where 1. e3 was "known" to be a draw, and has been proven to be a win.

Avatar of tygxc

#2787
"1. e3 was "known" to be a draw, and proven to be a win"
++ Losing Chess was not played or analysed as intensively as Checkers or Chess.
"many people (including chess grandmasters who had made some study of the game) told me that they had thought that 1. e3 would only draw, particularly against c5 or b6"
"my whole involvement in the project was a sort of “gamble” that White would win in the end,
as else there would be little chance of completion"
So many people thought it a draw,
but at least one gambled that white wins and he solved the game.
http://magma.maths.usyd.edu.au/~watkins/LOSING_CHESS/ICGA2016.pdf 

Avatar of haiaku

Indeed, because what for others is "thought", for you is "known". Can you provide, as I asked before, papers on chess where they say "we know that chess is a draw" (without coupling "know" with "probably/likely", etc.)?

Avatar of pfren

Even the solved Draughts-64 variation (International Draughts has not been solved) is still played regularly, and a World Cup will be held in Tashkent, Uzbekistan in a few weeks.

The reason is pretty clear: No human being can ever memorize the solution, not even a chunk which could guarantee a high draw percentage. Actually the percentage of draws in ICCF events is way higher.