Chess will never be solved, here's why

Avatar of Elroch
tygxc wrote:

@11945

"Karpov and Kasparov"
++ were not as strong as engines, and engines are not as strong as ICCF WC finalist + engines

Totally inadequate reasoning. You could have said the same about the engines of 10 years ago, which would get demolished by those of today. Indeed, I am sure that if there had been a lot of draws between them you WOULD have said that. You would have had no realization that future engines would be stronger, just as you do not know now.

And we KNOW Stockfish is far from perfect. It sometimes makes 6 blunders in a row in tablebase positions. What do you think it does in really difficult positions with many more pieces? At the moment we have nothing to check it with, except in tablebase positions.

and they played 3 minutes/move, not 5 days/move

"sequences of 17 and 14 draws" ++ Here we have no sequence of 17 or 14, but 112.
Statistics on 112 are stronger than on 17.

Statistics are uncertain. Always. This includes when the empirical data is from a single class. It most certainly does NOT indicate all the data will be of that class. It is merely strong (but uncertain) evidence that most of it will be.
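
As a rough illustration of how much a run of 112 draws can and cannot say, here is a minimal sketch. It assumes the games are independent trials with one fixed chance of being decisive, which is itself a debatable simplification, and the function name is just made up for the example. It bounds the rate of decisive games only; it says nothing about whether the drawn games contain errors.

```python
import math

def upper_bound_zero_events(n_trials: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on an event rate after 0 events in n_trials.

    Assumption (hypothetical model): trials are independent with a fixed rate p.
    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_trials)

n_games = 112  # all draws, no decisive game observed
bound = upper_bound_zero_events(n_games)
rule_of_three = 3.0 / n_games  # common quick approximation to the same bound

print(f"95% upper bound on decisive-game rate: {bound:.4f}")
print(f"rule-of-three approximation:           {rule_of_three:.4f}")
```

Under those assumptions a decisive-game rate of up to roughly 2.6% is still compatible with seeing 112 draws out of 112, so the data is strong evidence that most games will be drawn, not that all of them will be.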

"If you had seen that sequence of 17 draws"
++ But in the whole 1984-1985 match there were 8 decisive games and 40 draws.
If all ongoing 24 games would be decisive, then we need to reconsider.

You seem not to understand that the cube of a small number is a small number, not zero. And the same goes for higher powers.

"The correct answer is don't know"
++ That is the answer to everything by an agnostic. Will the Sun rise tomorrow? don't know!

Deductive proofs are certain. 

Avatar of Elroch
Kotshmot wrote:
tygxc wrote:

@11945

"Karpov and Kasparov"
++ were not as strong as engines, and engines are not as strong as ICCF WC finalist + engines
and they played 3 minutes/move, not 5 days/move

"sequences of 17 and 14 draws" ++ Here we have no sequence of 17 or 14, but 112.
Statistics on 112 are stronger than on 17.

"If you had seen that sequence of 17 draws"
++ But in the whole 1984-1985 match there were 8 decisive games and 40 draws.
If all ongoing 24 games would be decisive, then we need to reconsider.

"The correct answer is don't know"
++ That is the answer to everything by an agnostic. Will the Sun rise tomorrow? don't know!

A chance certainly exists that the sun won't rise tomorrow, although considering current scientific evidence and our past experience it's pretty low. Probability for those 112 games not being perfect is quite a bit higher. Not sure what this comparison really showed us.

Yes, it would be incorrect to reject the possibility of, say, a high-speed object from outside the solar system destroying the Earth before sunrise tomorrow. Such objects exist, for sure, and we don't have any knowledge of where the smaller examples are. The probability of being hit by one on any given day is thankfully small.

Avatar of tygxc

@11950
"Probability for those 112 games not being perfect is quite a bit higher."
++ That is not precise. What probability do you mean?

'What will the weather be tomorrow?' "The correct answer is don't know."
It is also the most useless.

A true mathematician would say: I have gathered the data of 112 weather stations, solved the Navier-Stokes equations on my 2 servers, and arrived at:
cloudy, but dry, 20°C, weak wind from the northeast. That is useful to farmers, pilots, tourists...

Avatar of tygxc

@11952

"You could have said the same about the engines of 10 years ago"
++ 10 years ago the ICCF WC finals had 35 decisive games, now none.

"And we KNOW Stockfish is far from perfect." ++ At blitz speed.

"It sometimes makes 6 blunders in a row in tablebase positions."
++ In irrelevant positions and at blitz speed.

"What do you think it does in really difficult positions with many more pieces?"
++ With 5 days per move it can hold the draw.

"Statistics are uncertain"
++ Tromp used statistics to arrive at (4.82 +- 0.03) * 10^44 legal chess positions.

Avatar of tygxc

@11956

Weather forecaster: "The data suggests it will rain" 
General Eisenhower: "So it will rain?"
Weather forecaster:
"I don't know whether it will rain, but based on that data it's more likely than not."
General Eisenhower: "Not good enough. I need to know to proceed with D-day or not. Thousands of lives depend on it. Be more precise!"

"a perfect game is a draw" ++ Yes.

"Everyone already understands this" ++ There are still a few who believe otherwise.

"confuse top level ICCF games for perfect games simply because they're high level draws"
++ Not because they are high level draws, but because all 112 out of 112 games of the strongest chess on the planet (5 days/move, ICCF WC finalist + 2 servers 90*10^6 positions/s) now end in draws.

"what differentiated imperfect ICCF draws from perfectly played ICCF draws"
++ We cannot know by inspecting the games: the most skilled inspectors: ICCF WC finalists with twin servers 90*10^6 positions/s at average 5 days/move could not find any imperfection.

If you want to talk probability, then assume game 113 is decisive, and a clean win, no clerical error, or hospitalised player. That would leave a probability of 1/113 for a single error, or a probability of (1/113)² = 8*10^-5 of a double error, or a probability of 99.92% for the 112 draws to be perfect games with optimal play from both sides.
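
A quick check of that arithmetic, taking the stated assumptions at face value (errors independent, each with probability 1/113). The variable names are only for illustration, and this checks the numbers rather than the model behind them.

```python
n = 113                   # 112 draws plus one hypothetical decisive game
p_single = 1 / n          # assumed probability of a single error
p_double = p_single ** 2  # probability of a double error, if errors are independent
p_no_double = 1 - p_double

print(f"single error:    {p_single:.5f}")
print(f"double error:    {p_double:.2e}")
print(f"no double error: {p_no_double:.5%}")
```

On these assumptions the double-error probability is about 7.8*10^-5, and its complement comes out nearer 99.992% than 99.92%.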

Avatar of MEGACHE3SE

"A true mathematician would say: I have gathered the data of 112 weather stations, on my 2 servers solved the Navier-Stokes equations and arrived at:
clouded, but dry, 20°C, weak wind from northeast. That is useful to farmers, pilots, tourists"

thats why a mathematician isnt a weather scientist. a mathematician deals with perfect deduction, not estimates.

tygxc, ive had my statements on mathematical rigor literally verified BY mathematicians.

they all chewed me out for wasting my time on someone as illogical as you.

why arent you addressing this fact?

Avatar of MEGACHE3SE

"If you want to talk probability, then assume game 113 is decisive, and a clean win, no clerical error, or hospitalised player. That would leave a probability of 1/113 for a single error, or a probability of (1/113)² = 8*10^-5 of a double error, or a probability of 99.92% for the 112 draws to be perfect games with optimal play from both sides."

if the 112 draws are proof then its proof regardless of any other game outcome. thats what basic deduction means.

Avatar of MEGACHE3SE

'"Statistics are uncertain"
++ Tromp used statistics to arrive at (4.82 +- 0.03) * 10^44 legal chess positions."

uhhh did you notice how tromp said it was an estimate with a probability of being wrong?

in fact the ± quantity is literally based on the standard deviation of his wrongness.

also dont think that everyone hasnt noticed how you failed to address even a single one of llama's posts.

Avatar of MEGACHE3SE

lets put this into perspective here:

tygxc, in response to someone asking for mathematical rigor, responded with the claim that a mathematician working at a weather station wouldnt deal in absolutes and would just give the estimate of the weather.

he really thinks that math is just a bunch of guesses that we think are accurate. LMFAO

Avatar of MaetsNori

Correct me if I'm wrong, but having all draws in the ICCF WC isn't a solution to chess.

A solution to chess would be knowing the exact number of moves (and the precise sequence of moves) to a win/loss/draw from any given position, no?

IOW: a 32-man tablebase ...

Avatar of Kotshmot
tygxc wrote:

If you want to talk probability, then assume game 113 is decisive, and a clean win, no clerical error, or hospitalised player. That would leave a probability of 1/113 for a single error, or a probability of (1/113)² = 8*10^-5 of a double error, or a probability of 99.92% for the 112 draws to be perfect games with optimal play from both sides.

I don't buy these probability calculations at all. Once again you assume all errors are similar in likelihood. In reality, errors are not similar at all, and they're not independent events either, meaning that a first error occurring could make the second error more likely than the first one was.

After an error there can be 100 winning lines or there can be 1. In the second case another error is far more likely to follow. It would also make sense, logically, that there are more drawing lines in the starting position than there would be winning lines after a subtle error by a strong engine. This makes it far more difficult to estimate how many errors there actually are in these games, or what the likelihood is of these games being perfect.
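
To make the dependence point concrete, here is a minimal sketch. The conditional probabilities are made-up illustrative values, not measurements from any games, and the function name is hypothetical.

```python
def p_double_error(p_first: float, p_second_given_first: float) -> float:
    """P(two errors) = P(first error) * P(second error | first error)."""
    return p_first * p_second_given_first

p = 1 / 113  # single-error probability taken from the quoted estimate
independent = p_double_error(p, p)  # independence: the conditional equals p

# Hypothetical cases where a first error makes a second one more likely.
for conditional in (p, 0.1, 0.5):
    both = p_double_error(p, conditional)
    print(f"P(second | first) = {conditional:.4f} -> P(double) = {both:.2e} "
          f"({both / independent:.1f}x the independent figure)")
```

If the second error really is more likely once the first has happened, the (1/113)² figure understates the chance of a double error, possibly by a wide margin.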

Avatar of DiogenesDue
MaetsNori wrote:

Correct me if I'm wrong, but having all draws in the ICCF WC isn't a solution to chess.

You're definitely not wrong.

Avatar of BigChessplayer665
llama_l wrote:
tygxc wrote:

Not because they are high level draws, but because the strongest chess on the planet now end in draws.

"Strongest chess on a planet" AKA stockfish on my PC... why would I be surprised my engine draws itself?

... seriously, I just looked at a few ICCF games... my PC liked every move (most moves were the #1 pick, a few were not #1 whenever moves were equivalent).

But ok, let's check with something reproducible, I'll have lichess analyze it.

Michel Lecroq vs Jon Edwards from here:
https://www.iccf.com/event?id=85042

-

-

[Event "WC32/final"]
[Site "ICCF"]
[Date "2020.06.20"]
[Round "?"]
[White "Lecroq, Michel"]
[Black "Edwards, Jon"]
[Result "1/2-1/2"]
[ECO "B33"]
[WhiteElo "2568"]
[BlackElo "2525"]
[PlyCount "114"]
[EventDate "2020.??.??"]

1. e4 c5 2. Nf3 Nc6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 e5 6. Ndb5 d6 7. Nd5 Nxd5 8.
exd5 Nb8 9. a4 Be7 10. Bd2 O-O 11. a5 Nd7 12. Be2 f5 13. O-O a6 14. Na3 f4 15.
Nc4 Rf6 16. Bb4 Rh6 17. Re1 Bf8 18. Ra3 Qg5 19. Bf1 Nf6 20. Bxd6 e4 21. Bxf8
Bg4 22. Qd4 Rxf8 23. Rxe4 Nxe4 24. Qxe4 Bf5 25. Qd4 Bxc2 26. Rc3 Bf5 27. d6 Qf6
28. Qd5+ Qf7 29. Qd2 Rd8 30. h3 Rf6 31. Be2 Kh8 32. Bf3 Qf8 33. Bxb7 Be6 34.
Bd5 Bd7 35. b3 Bb5 36. Rd3 h6 37. Kh2 f3 38. Bxf3 Bxc4 39. bxc4 Rdxd6 40. Rxd6
Qxd6+ 41. Qxd6 Rxd6 42. Bd5 Rf6 43. Kg3 g5 44. f3 Kg7 45. Kf2 Kf8 46. Ke3 Ke7
47. Kd4 Kd7 48. g3 Rf8 49. Bb7 Kc7 50. Be4 Kd7 51. Ke5 Rb8 52. f4 gxf4 53. gxf4
Rb3 54. f5 Ra3 55. c5 Rxa5 56. Kd5 Ke7 57. f6+ Kxf6 1/2-1/2

-

Wow, look at that, a 10 second analysis agrees with every move... meaning the humans added absolutely nothing.

ICCF is a joke. I wouldn't lose a single game either. Maybe 10 years ago that wouldn't be true, but I have no reason to believe it's not true today.

I did the same thing; Stockfish drew itself every game.

Avatar of BigChessplayer665
llama_l wrote:

Not only Stockfish vs itself, but any modern engine vs engine match is almost entirely draws.

For example, this link shows Stockfish, look at the "results" column to see its performance against other engines. You'll see stats like 30 draws out of 32 games.

https://computerchess.org.uk/ccrl/4040/cgi/engine_details.cgi?print=Details&each_game=0&eng=Stockfish%2020230613%2064-bit%204CPU#Stockfish_20230613_64-bit_4CPU

And there's the fact that it doesn't think ahead in endgames and sometimes doesn't understand when opposite-colored bishop endgames are winning.

Avatar of DiogenesDue
BigChessplayer665 wrote:

I did the same thing; Stockfish drew itself every game.

When the dev team wants to make a new release, how do they test it?

- Does the new engine beat the old engine over 50% of the time?

- Does the new engine beat other popular engines more often than the old engine did?

- Does the new engine draw itself consistently?

These are the characteristics that produce a TCEC winner. TCEC winning engines and their handful of close competitors survive, all other engines go on the scrap heap.

The very process of "evolution" for computer engines ensures that they are tuned heavily towards their own current playing ability, with no regard to the future, only the past, and certainly no thought towards perfect play or solving chess. They are designed to win most often against other engines, but more importantly to not lose against other engines. An engine that does not draw itself the vast majority of the time is an engine that is vulnerable to losses.

Now, add the other very important factor here:

Stockfish is the king of the heap, but is also open source. Stockfish's competitors can always steal from it in their own development (indirectly if not openly lifting code line for line if the other engine is a commercial one), and if Stockfish does not draw itself at an even higher rate than most engines, then the opposing dev teams can figure out why, and improve their own engines to exploit blind spots in Stockfish's playing. The *only* advantage that Stockfish maintains is the privacy of their current beta version whose source has not been released yet. So, each release *must* change in some way, even if mostly a lateral change in playing ability, to stay ahead of the open source issue.

This creates a neverending cycle where engines play only the same handful of opponents and are tuned to (1) never lose, first priority, and (2) to have as big a shot at winning as they can manage without compromising (1).

So, is it a surprise that engines draw more and more frequently? Is it a surprise that they draw amongst each other? Not at all. Engines have evolved into an incestuous little band of siblings, and this affects their possibilities of approaching "perfect play" a lot more than people think.

Avatar of DiogenesDue
llama_l wrote:

Several good points in here.

Yes, if a new release plays 40% of positions worse than the previous version, then the dev team considers that a success because it's playing 60% of them better so it's a net gain in rating. They've said so themselves in interviews (paraphrased here of course since most positions are played the same).

For this and other reasons you mention, it's very much incorrect to assume top engines play perfect chess, or are even moving towards it.

Consider the following scenario:

Take 6 top players...Carlsen, Caruana, Nakamura, Nepo, Gukesh, Pragg. Now remove them from FIDE and have them play each other, and effectively only each other, for 20 years. Give them a cutthroat win dynamic...top player gets rich, 2 gets a living wage, 3/4/5 get bread and water, 6 lives in solitary confinement. They have a powerful incentive to learn to claw their way higher by any means, and learn to play each other very, very well.

Now...what is your confidence level that they remain the top players in the world upon re-integration with FIDE at the end of the experiment? What is your confidence level that their play does not ultimately suffer from the severely reduced playing pool?

This is where the top engines are right now.

Avatar of moxnix22

I mean, assume we had a tablebase for all 32 pieces, and assume the result is a draw. The new eval would then be just draw or win in any given position, with no numbers outside of forced mate. In that case, isn't a perfect game any game where the eval never swings to a loss/win? With that in mind, they could be playing perfect games already at slow time controls with good hardware. In fact, I would imagine that's the most likely. So until we get future tech and have that 32-piece tablebase it's not proven, but I would assume many perfect games have been played, as the best guess we have now is that it starts as a draw and you need to make mistakes to swap from draw to loss, and technically anything inside those bounds is perfect. You can make an engine as strong as you want; there are still only 20 moves on turn 1, and I personally can't see a future where in 3024 some guy finishes the tablebases and goes, "Aha, sorry, the correct move was 1. e4, White has a forced mate in 408 moves; sorry, 1. d4 was always a draw."

Avatar of Thee_Ghostess_Lola

Ty's smart. very smart. theyre looking at stuff in practice. like 2. Ba6. i mean come on...4real ?? whos dum enuf to say this isnt a completely lost position (w/out compensation !) from move 2 ??...but then it must be some ppl here needta do a delineation. and i say fine. if it makes ur proof gawz happy. but being dn a full bishop THAT early is NOT like saying a lottery tix wont win.

so quit wimping on Ty cuz hes taking the practical side. hes making excellent points that are most likely gonna stand any test. they just needta be confirmed w/a final silicon go-thru. AND U KNOW IT !

Avatar of Elroch

Interesting fact, of the n-piece tablebases, the 27 and 28 piece ones are the biggest (source: Peter Österlund). I am not sure which of the two is the absolute biggest.

Avatar of Elroch
Thee_Ghostess_Lola wrote:

Ty's smart. very smart. theyre looking at stuff in practice. like 2. Ba6. i mean come on...4real ?? whos dum enuf to say this isnt a completely lost position (w/out compensation !) from move 2 ??...but then it must be some ppl here needta do a delineation. and i say fine. if it makes ur proof gawz happy. but being dn a full bishop THAT early is NOT like saying a lottery tix wont win.

so quit wimping on Ty cuz hes taking the practical side. hes making excellent points that are most likely gonna stand any test. they just needta be confirmed w/a final silicon go-thru. AND U KNOW IT !

Guessing is easy, and everyone is free to do it. @tygxc just uses all the wrong terms for his guesses.