Chess will never be solved, here's why

Avatar of playerafar
tygxc wrote:

#2599
"Again - 'table base' versus 'top four moves'."
++ It is not versus, it is top four moves until the table base is reached.

The way it was worded did not suggest that -

Again:
"The fact that you carry on to "calculate" that SF14 will find errors in its top four moves in only 1 in 100,000,000,000,000,000,000 positions and I've given you four already ought to tell you something." ++ You have calculated neither on a 10^9 nodes/s cloud engine nor for 60 hours/move. Your positions are not representative. Try KRPP vs. KRP and I predict even your desktop with short calculation time will find the table base correct move as its top 1 move."

I suggest you carefully qualify the use of the ++ there.

A claim of a one in 10^20 error rate for your four moves - followed closely by a mention of a desktop and 'table base'.
Read it.

Suggestion 2:
Word the 'explanation' so that almost anyone just entering the conversation for the first time - would find it quite crystal clear.
Or qualify which is your text and which isn't ...
Idea:  may as well do it right.

Avatar of tygxc

#2604
"LCZero considers around 1000x fewer nodes per second than Stockfish, but it applies a much stronger evaluation method. In other words, it prioritizes quality over quantity. And despite the difference in the algorithms of Stockfish and Lc0, they will show almost identical results in a given position. Note how Lc0 has reached 2x less depth and 1000x less NPS."
https://chessify.me/blog/nps-what-are-the-nodes-per-second-in-chess-engine-analysis 

I could not find the nodes/s used in the AlphaZero autoplay paper.
https://arxiv.org/pdf/2009.04374.pdf

I used figure 2 of that paper to derive 60 h/move for 1 error in 10^20 positions.

So presumably 60 h/move is too much: the same 1 error in 10^20 positions for the 4 top white moves can be reached with less time on a 10^9 nodes/s cloud engine. I have to think how much smaller it could be.

"Stockfish operates with the so-called "thin" nodes (little evaluation for a much bigger number of nodes), while Leela Chess Zero operates with "thick" nodes (better evaluation for a smaller number of nodes)." - from above reference.
To weakly solve chess I would prefer thin nodes of Stockfish. The aim is to calculate as deeply as possible so as to hit the 7-men endgame table base.

I also should revise the 60 h/move in a way that the number of nodes in the solution tree is about the same as the number of positions searched for each node, like it turned out in the solution of checkers, i.e. for each node of the solution tree about a tree of the same size is pruned away.

The main question remains the number of legal, sensible, reachable, and relevant positions.
That is the number of nanoseconds on the 10^9 nodes/s cloud engine, by whatever method.
legal 10^44, sensible 10^38 - 10^32, reachable more than the square root of sensible, relevant about 10% of reachable.
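
As a rough worked example of the arithmetic above: at 10^9 nodes/s one position costs about one nanosecond, so a position count converts directly into wall-clock time. The sketch below uses the figures from this post (sensible 10^38 to 10^32, relevant about 10% of reachable). It takes reachable as exactly the square root of sensible, although the post says "more than", so the results are lower bounds under that assumption. The function name and the year length are my own choices.

```python
import math

NODES_PER_SECOND = 1e9                  # cloud engine speed quoted in the post
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed year length


def search_time_seconds(sensible_positions, relevant_fraction=0.10):
    """Time to visit each 'relevant' position once at NODES_PER_SECOND.

    Assumes reachable = sqrt(sensible), which the post treats as a lower
    bound, and relevant = relevant_fraction * reachable.
    """
    reachable = math.sqrt(sensible_positions)
    relevant = relevant_fraction * reachable
    return relevant / NODES_PER_SECOND


for sensible in (1e38, 1e32):
    seconds = search_time_seconds(sensible)
    print(f"sensible={sensible:.0e}: {seconds:.1e} s "
          f"= {seconds / SECONDS_PER_YEAR:.2g} years")
```

Under these assumptions the two ends of the range come out at roughly 10^9 s (about 32 years) and 10^6 s (about 12 days) on a single such engine.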

The most practical way would be to solve 1 tabiya of 1 ECO code e.g. C67.
That would give exactly how many positions had to be searched.

Avatar of tygxc

#2606

"it seemed to try to relate the two via somebody's desktop. "
++ The discussion was to test my prediction that the table base exact move is within the top 4 engine moves at 60 h/move on a 10^9 cloud engine and thus take a representative 7 men position like KRPP vs. KRP and verify that the exact table base move is within the top 4 engine moves on a desktop of say 10^6 nodes/s at say 1 h/move. If the table base exact move is within the top 4 engine moves at a desktop at 1 h/move, then it surely is the case on a more powerful 10^9 nodes/s cloud engine at more time e.g. 60 h/move.

Avatar of playerafar
tygxc wrote:

#2606

"it seemed to try to relate the two via somebody's desktop. "
++ The discussion was to test my prediction that the table base exact move is within the top 4 engine moves at 60 h/move on a 10^9 cloud engine and thus take a representative 7 men position like KRPP vs. KRP and verify that the exact table base move is within the top 4 engine moves on a desktop of say 10^6 nodes/s at say 1 h/move. If the table base exact move is within the top 4 engine moves at a desktop at 1 h/move, then it surely is the case on a more powerful 10^9 nodes/s cloud engine at more time e.g. 60 h/move.

That doesn't look like it would follow.
Doesn't look like a justification of a claim of a one in 10^20 error rate.

Avatar of playerafar

You're trying to compare the speeds of the two computers (in nodes - let's say that part is OK for now)  ...
and the time taken - and argue that the validity of the table base would justify a claim of one error per 10^20 positions on the 'four moves' ... ??
If that is in fact what you're doing - suggest you so state.

Avatar of Elroch
tygxc wrote:

#2606

"it seemed to try to relate the two via somebody's desktop. "
++ The discussion was to test my prediction that the table base exact move is within the top 4 engine moves at 60 h/move on a 10^9 cloud engine and thus take a representative 7 men position like KRPP vs. KRP and verify that the exact table base move is within the top 4 engine moves on a desktop of say 10^6 nodes/s at say 1 h/move. If the table base exact move is within the top 4 engine moves at a desktop at 1 h/move, then it surely is the case on a more powerful 10^9 nodes/s cloud engine at more time e.g. 60 h/move.

To establish your prediction to adequate statistical reliability will take checking it for at least all the moves in an appropriate tablebase. Being wrong once is adequate to make empirical reasoning wrong, so checking say a tenth of the tablebase would be unreliable.

How many positions are there in the tablebase?

I can't see the value in this particular unreliable guessing.  Haven't we already seen a tablebase position where Stockfish doesn't rank the only good move highly?

Avatar of playerafar

It seems to be a claim that the accuracy, speed and time taken by the tablebase - where the tablebase considers all possibilities ...
would justify a claim of a one in 10 to the 20th power error rate -
for 'four moves' in positions with far more pieces - where not all moves and possibilities are considered either ....

Avatar of tygxc

#2610
I extrapolated the 1 error per 10^20 positions where the optimal move is not within the top 4 engine moves at 60 h/move from figure 2 of the AlphaZero autoplay paper.
https://arxiv.org/pdf/2009.04374.pdf 

Some people dispute that and present artificially constructed positions where their desktop running for a few minutes does not have the table base exact move within its top 4 moves.
I counter that and say they should use a common 7-men position KRPP vs. KRP and predict even on a desktop running at 1 h/move the table base exact move will be within the top 4 engine moves. I verified it for one KRPP vs. KRP position and my desktop at 10^6 nodes/s even in a few minutes finds the table base exact move as its top 1 move.

We also know from the above AlphaZero autoplay paper that time * 60 gives error / 5.6.

Avatar of tygxc

#2611

"To establish your prediction to adequate statistical reliability will take checking it for at least all the moves in an appropriate tablebase."
++ No, not all. The aim of statistics is to draw conclusions about a whole set by measurements on a sufficiently large subset.

"Being wrong once is adequate to make empirical reasoning wrong, so checking say a tenth of the tablebase would be unreliable."
++ No, not at all. For example Tromp sampled only 10,000 of his 7728772977965919677164873487685453137329736522 possible positions,
found 538 of them legal and thus arrived at his 4.5 * 10^44 legal positions.
https://github.com/tromp/ChessPositionRanking 
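
A minimal sketch of the sampling estimate described above, with its binomial standard error. The function is generic. The total and the counts are the ones quoted in this post and are not independently checked here, so the printed estimate depends entirely on them. The function name is hypothetical.

```python
import math

# Figures as quoted in the post; not independently verified.
TOTAL_RANKED = 7728772977965919677164873487685453137329736522
SAMPLE_SIZE = 10_000
LEGAL_IN_SAMPLE = 538


def estimate_from_sample(total, sample_size, hits):
    """Estimate how many items in a population of size `total` have a
    property, from a uniform random sample in which `hits` items had it.

    Returns (estimate, one binomial standard error of the estimate).
    """
    p = hits / sample_size
    std_err = math.sqrt(p * (1.0 - p) / sample_size)
    return total * p, total * std_err


estimate, std_err = estimate_from_sample(TOTAL_RANKED, SAMPLE_SIZE, LEGAL_IN_SAMPLE)
print(f"estimate = {estimate:.2e} +/- {std_err:.1e} (one standard error)")
print(f"relative standard error = {std_err / estimate:.1%}")
```

The relative standard error comes out near 4%, which is why a 10,000-position sample is enough for an order-of-magnitude figure. This works because a proportion of about 5% is being estimated. Bounding a very rare event from a sample is a different problem.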

"How many positions are there in the tablebase?"
There are 423,836,835,667,331 positions in the 7-men table base. Some of these are illegal, some are insensible, most are not reachable by a game with > 50% accuracy, many are irrelevant. I propose to look at KRPP vs. KRP only as it is most representative.
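
One standard way to quantify how much a sample with no counterexample can show is the zero-failure binomial bound. It is not something any poster here proposed. If n randomly chosen positions are all handled correctly, the error rate p is bounded at a given confidence by solving (1 - p)^n = 1 - confidence. The sketch assumes independent, uniformly random sampling. The function names and the 95% level are my own choices.

```python
import math


def zero_failure_upper_bound(n_checked, confidence=0.95):
    """Upper bound on the error rate after n_checked random positions
    were all found correct: solves (1 - p)**n = 1 - confidence for p."""
    return -math.expm1(math.log(1.0 - confidence) / n_checked)


def samples_needed(target_rate, confidence=0.95):
    """Error-free random checks needed to bound the rate by target_rate."""
    return math.log(1.0 - confidence) / math.log1p(-target_rate)


TABLEBASE_POSITIONS = 423_836_835_667_331   # 7-men count quoted above

print(f"{zero_failure_upper_bound(TABLEBASE_POSITIONS):.1e}")
print(f"{samples_needed(1e-20):.1e}")
```

On these assumptions, even an error-free sample as large as the whole 7-men tablebase bounds the rate only near 7 * 10^-15. Bounding it at 10^-20 takes about 3 * 10^20 error-free random checks.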

"Haven't we already seen a tablebase position where Stockfish doesn't rank the only good move highly?"
++ No, not really.
There are some KNN vs. KP, but Stockfish is known to mishandle that if not given enough time to calculate to the checkmate.
There has also been an artificially constructed KRPP vs. KRP, where a large amount of bad shuffling has occurred before, so that the correct move would trigger a 50-moves or a 3-fold repetition draw.
I still invite all to try and find one KRPP vs. KRP where the table base correct move is not the engine top move on a desktop with reasonable time.

Avatar of alexandermatrone-eleleth

I need some one to teach me

Avatar of alexandermatrone-eleleth

Plz I need some one to coach me for free I’m so bad 

 

Avatar of MARattigan
Elroch wrote:
tygxc wrote:

#2606

"it seemed to try to relate the two via somebody's desktop. "
++ The discussion was to test my prediction that the table base exact move is within the top 4 engine moves at 60 h/move on a 10^9 cloud engine and thus take a representative 7 men position like KRPP vs. KRP and verify that the exact table base move is within the top 4 engine moves on a desktop of say 10^6 nodes/s at say 1 h/move. If the table base exact move is within the top 4 engine moves at a desktop at 1 h/move, then it surely is the case on a more powerful 10^9 nodes/s cloud engine at more time e.g. 60 h/move.

To establish your prediction to adequate statistical reliability will take checking it for at least all the moves in an appropriate tablebase. Being wrong once is adequate to make empirical reasoning wrong, so checking say a tenth of the tablebase would be unreliable.

How many positions are there in the tablebase?

I can't see where the value is in this particular unreliable guessing.  Haven't we already seen a tablebase position where Stockfish doesn't rank the only good move highly?

I've posted four such positions myself. 

Two of them depend on whether the 50 move rule is in effect. @tygxc is of the opinion that the 50 move rule makes no difference (but reserves the right to change the ply count in any such positions posted for some reason) and is ambivalent about whether the 50 move rule will be included in the rules of the game he attempts to solve.

But in any case he rejects any examples that haven't been run for 60 hours on an engine that can generate 10^9 nodes per second. This protects his assertion that SF14's top four moves will all be errors in only 1 in 100,000,000,000,000,000,000 positions by rendering it impracticable to provide counterexamples.

The real point is that if he plans to consider only SF14's top four moves whenever he does a takeback, nothing gets proved anyway - he needs to consider all alternative moves, so the question of what would be the frequency of SF14's top four moves being errors is irrelevant.

SF14's error rate is not irrelevant to the time it would take for a forward searching solution to complete. If it can quickly find forced mates, that could cut aeons off the time. Indeed if the starting position is actually a forced mate in 20 that everyone's overlooked a systematic forward search could solve chess in reasonable time if its error rate were low enough.

Avatar of tygxc

#2617
"I've posted four such positions myself" ++ No, none of these is valid.

"Two of them depend on whether the 50 move rule is in effect."
++ No, if the 50 moves rule is valid or not does not matter, but a position where the 50-moves rule is close to being triggered because of previous shuffling with bad moves is not relevant.

"@tygxc is of the opinion that the 50 move rule makes no difference"
++ Yes, that is right: the 50 moves rule is in practice never invoked with > 8 men.

"but reserves the right to change the ply count in any such positions posted"
++ Yes, that is right: let us discuss positions right after a capture or a pawn move,
so that the 50-moves counter and the 3-fold repetition counter are reset to 0.

"is ambivalent about whether the 50 move rule will be included in the rules of the game he attempts to solve"
++ No, I say the 50-moves rule does not matter as it is never invoked in positions of > 7 men.
I know no grandmaster or ICCF game where the 50-moves rule was invoked > 7 men.
Most grandmaster and ICCF games are already over before move 50: average is 39 moves.

"he rejects any examples that haven't been run for 60 hours on an engine that can generate 10^9 nodes per second"
++ No, I even predict that for a KRPP vs. KRP position with 50-moves and 3-fold repetition counters reset to 0 the engine top 1 move at 10^6 nodes/s and say 60 min/move will match the table base exact move. Try and find one counterexample.

"The real point is that if he plans to consider only SF14's top four moves whenever he does a takeback, nothing gets proved anyway"
++ If the 4 good moves cannot win a position, then the bad moves will not win it either.

"he needs to consider all alternatives"
++ No, if 1 e4, 1 d4, 1 c4, 1 Nf3 are all proven draws,
then 1 a4 will not win either and needs no consideration.

Avatar of Elroch
MARattigan wrote:

[snip]

But in any case [tygxc] rejects any examples that haven't been run for 60 hours on an engine that can generate 10^9 nodes per second.

[snip]

It's worth emphasising that this is essentially the fallacy that if something hasn't been checked, you can assume it is true.

Avatar of tygxc

#2619
"if something hasn't been checked, you can assume it is true"
++ I arrived at 1 case per 10^20 positions where the exact move is not within the top 4 engine moves at 60 h/move by extrapolating from the AlphaZero autoplay paper figure 2.

You are free to check by other means, e.g. to compare the engine top move at lower time to the table base for KRPP vs. KRP. For the one position thus posted the top engine move on a desktop exactly matches the table base unless the position is twisted by previous bad moves that are close to triggering the 50 moves rule.

Avatar of Elroch

Extrapolation is entirely unreliable without a known relationship that is known to extend far enough. This is a general truth.

Avatar of tygxc

#2621
"Extrapolation is entirely unreliable without a known relationship that is known to extend far enough."
++ Yes, that is true.
However, extrapolating from 1 s/move and 1 min/move to 60 h/move seems feasible,
especially as no precise result is needed, only an order of magnitude.

On the other hand extrapolating from 7 men to 32 men is entirely unreliable,
that is why Haworth's law is no law at all
https://www.researchgate.net/publication/304271294_Haworth's_Law 

Avatar of MARattigan
tygxc wrote:

#2617
"I've posted four such positions myself" ++ No, none of these is valid.

"Two of them depend on whether the 50 move rule is in effect."
++ No, if the 50 moves rule is valid or not does not matter, but a position where the 50-moves rule is close to being triggered because of previous shuffling with bad moves is not relevant.

I repeat: Two of them depend on whether the 50 move rule is in effect.

The second and third positions I posted are drawn with the 50 move rule in effect but won under basic rules.

You seem to have a nonstandard meaning of the word "no" as well as "solve". How can you say the positions don't depend on whether the 50 move rule is in effect and then in the same sentence say you're not prepared to consider them because they're close to triggering the 50 move rule?

In all four positions I gave under basic rules (or the first and last under competition rules) there is only one move that is not an error. When you kick SF off again with the other three you will be shuffling with bad moves. If you plan to actually prove anything you have to kick SF off again with all the other moves.

If you're not doing a takeback you have to consider all possible opponent responses.

The result is your process will spend most of its time shuffling with bad moves.

How do you justify your assertion that the positions your process will actually reach are irrelevant?

"@tygxc is of the opinion that the 50 move rule makes no difference"
++ Yes, that is right: the 50 moves rule is in practice never invoked with > 8 men.

A solution needs to provide perfect play. What happens in practice is not relevant.

Practical players cannot usually prosecute mates of depth 50 or more unless their opponent is weaker or the mates have been analysed to that depth and they are conversant with the analysis.

This is already the case with five men on the board. It applies more as the number of men increases.

Some people don't believe such positions exist with 8 or more men on the board even when they're shown examples.

"but reserves the right to change the ply count in any such positions posted"
++ Yes, that is right: let us discuss positions right after a capture or a pawn move,
so that the 50-moves counter and the 3-fold repetition counter are reset to 0.

OK I gave you two of those, but they will obviously form a very small percentage of the positions in either your solution or your search for a solution.

Let us discuss the other positions too.

"is ambivalent about whether the 50 move rule will be included in the rules of the game he attempts to solve"
++ No, I say the 50-moves rule does not matter as it is never invoked in positions of > 7 men.

I.e. ambivalent about whether the 50 move rule will be included in the rules of the game he attempts to solve.
I know no grandmaster or ICCF game where the 50-moves rule was invoked > 7 men.
Most grandmaster and ICCF games are already over before move 50: average is 39 moves.

Your procedure won't be playing ICCF games; not even the same rules unless you've got another variation in your proposed game.

"he rejects any examples that haven't been run for 60 hours on an engine that can generate 10^9 nodes per second"
++ No, I even predict that for a KRPP vs. KRP position with 50-moves and 3-fold repetition counters reset to 0 the engine top 1 move at 10^6 nodes/s and say 60 min/move will match the table base exact move. Try and find one counterexample.

I already gave you two in different endgames. You can definitely assume your computation won't spend all its time in KRPP v KRP.

But if you're trying to prove a solution the question of SF14's top 4 moves is irrelevant.

So I've spent as much time as I'm going to on it. If you find the subject interesting look for your own examples. They're easy enough to find.

"The real point is that if he plans to consider only SF14's top four moves whenever he does a takeback, nothing gets proved anyway"
++ If the 4 good moves cannot win a position, then the bad moves will not win it either.

There need not be more than one good move and none of the good moves need be in SF14's list. You couldn't base a proof on that assumption even if I couldn't find counterexamples in a few minutes.

"he needs to consider all alternatives"
++ No, if 1 e4, 1 d4, 1 c4, 1 Nf3 are all proven draws,
then 1 a4 will not win either and needs no consideration.

So when Black looks up 1.a4 in your solution what will he see?

 

Avatar of haiaku
tygxc wrote:

I could not find the nodes/s used in the AlphaZero autoplay paper.
https://arxiv.org/pdf/2009.04374.pdf

https://arxiv.org/pdf/1712.01815.pdf , page 5 second paragraph. There, they report 80,000 positions per second for AlphaZero vs. 7 × 10⁷ for Stockfish, and in another paper 60,000 vs. 6 × 10⁷, but the ratio is nonetheless roughly 1/1000, as for Lc0 compared to SF on the same hardware.

tygxc wrote:

I also should revise the 60 h/move in a way that the number of nodes in the solution tree is about the same as the number of positions searched for each node, like it turned out in the solution of checkers, i.e. for each node of the solution tree about a tree of the same size is pruned away.

The nodes were pruned after the search to save storage space. I do not remember if I have already posted this:

   "The stored proof tree is "only" 10⁷ positions. Saving the entire proof tree, from the start of the game so that every line ends in an endgame database position, would require many tens of terabytes, resources that were not available. Instead only the top of the proof tree, the information maintained by the manager, is stored on disk. When a user queries the proof, if the end of a line of play in the proof is reached, then the solver is used to continue the line into the database. This dramatically reduces the storage needs, at the cost of re-computing (roughly two minutes per search). [ . . . ]
How much computation was done in the proof? Roughly speaking, there are 10⁷ positions in the stored proof tree, each representing a search of 10⁷ positions [ . . . ]. Hence, 10¹⁴ is a good ballpark estimate of the forward search effort." ¹

tygxc wrote:

We also know from the above AlphaZero autoplay paper that time * 60 gives error / 5.6.

You also used time * 60 = 5.6 times fewer decisive games, and time * 60 = error / 5.6

I realized many posts ago that you do not use the above as real equalities. Instead, you mean that under your assumptions, given 60 times more time, the error rate is reduced by a factor of 5.6. Your original identity was:

error = a / timeᵇ

From two points you can obtain its parameters, and therefore the error rate for any time. Please, for the sake of clarity, do not use mathematical formulae loosely if you intend to reuse the concept you are trying to express in later calculations.
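
To make the two-point fit concrete, here is a minimal sketch that solves error = a / timeᵇ for a and b and then evaluates it at another time. The anchor point (error 1.0 in arbitrary units at 1 s/move) is hypothetical. The second point encodes the "time * 60 gives error / 5.6" relation from the thread. Whether the power law holds over the whole range is exactly the assumption under dispute, so the output only shows what the formula implies.

```python
import math


def fit_power_law(t1, e1, t2, e2):
    """Return (a, b) such that error = a / time**b passes through
    the two points (t1, e1) and (t2, e2)."""
    b = math.log(e1 / e2) / math.log(t2 / t1)
    a = e1 * t1 ** b
    return a, b


def predict_error(a, b, time):
    """Error rate implied by the fitted power law at the given time."""
    return a / time ** b


# Hypothetical anchor: error 1.0 (arbitrary units) at 1 s/move.
# Second point: 60 times the time gives the error divided by 5.6.
a, b = fit_power_law(1.0, 1.0, 60.0, 1.0 / 5.6)
print(f"b = {b:.3f}")

# 60 h = 216,000 s = 60**3 s, i.e. three successive factors of 60.
reduction = predict_error(a, b, 1.0) / predict_error(a, b, 216_000.0)
print(f"error reduction from 1 s/move to 60 h/move: {reduction:.1f}x")
```

With these inputs b comes out near 0.42, and going from 1 s/move to 60 h/move divides the error by 5.6³, about 176.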

Anyway, imo you should use another formula, but before you can even start, you should prove that errors are really statistically independent as you say.

¹ https://www.researchgate.net/publication/231216842_Checkers_Is_Solved

Avatar of MARattigan

@haiaku

If by "errors are statistically independent" you mean that the probability of an error on a given move is constant that's obviously false. If you mean the probability of an error on a given move is independent of the probability on other moves that's also obviously false. (In each case the probability for AZ with a given think time).

How would the probability for AZ determine the probability for SF14? I think @tygxc plans to use SF14.