Studies to Troll your engines with

drdos7

Position #16

The engine troll factor on this one is about 9.8. I found one engine that can solve this (Lc0) but no others.

White to play and win:

EndgameEnthusiast2357

Lots of interesting ways Queens can get trapped. I wonder what engines' take on this type of stuff is:

Are there guaranteed ways to win a queen in such positions (not specifically this one but in general), or will the other side always manage to interfere with that in time?

MARattigan
drdos7 wrote:
MARattigan wrote:
drdos7 wrote:

Position #3

The engine troll factor on this one is about 9.9/10

White to play and win:

 

As you say, the black queen is a spectator. But White's queen's side pieces other than the king have to remain rooted to maintain that. Your only comment on the final position (shown) is "1-0", but how does White go about winning from there?

Here you go, all roads lead to mate:

Sorry to be pedantic, but you've shown only that four roads of a very large number lead to mate.

I would guess that all do (notwithstanding that I managed to discover one that didn't at my first attempt). But that's different from proving that all do.

The SF mate declarations are not an infallible indication that there is a forced mate. SF will sometimes change its mind about that.

Of course that's a general problem with trying to find a solution to a position that's not tablebased using an engine. Even if the engine has genuinely found a forced mate it doesn't tell you what it is. It just gives you an incomplete selection of examples.

drdos7
MARattigan wrote:
drdos7 wrote:
MARattigan wrote:
drdos7 wrote:

Position #3

The engine troll factor on this one is about 9.9/10

White to play and win:

 

As you say, the black queen is a spectator. But White's queen's side pieces other than the king have to remain rooted to maintain that. Your only comment on the final position (shown) is "1-0", but how does White go about winning from there?

Here you go, all roads lead to mate:

Sorry to be pedantic, but you've shown only that four roads of a very large number lead to mate.

I would guess that all do (notwithstanding that I managed to discover one that didn't at my first attempt). But that's different from proving that all do.

The SF mate declarations are not an infallible indication that there is a forced mate. SF will sometimes change its mind about that.

Of course that's a general problem with trying to find a solution to a position that's not tablebased using an engine. Even if the engine has genuinely found a forced mate it doesn't tell you what it is. It just gives you an incomplete selection of examples.

How about if you tell me where you think black could do better and I'll refute it. Doesn't your engine show a clearly winning score after any of the selected Black moves?

BTW, those mates ARE correct, if you vary from the lines given (on the black side) the mate is shorter or equal.

And if you would like to play it out against me from anywhere in the PGN (including the beginning), I can do that, with me as White and you as Black.

drdos7
MARattigan wrote:
drdos7 wrote:
MARattigan wrote:
drdos7 wrote:

Position #3

The engine troll factor on this one is about 9.9/10

White to play and win:

 

As you say, the black queen is a spectator. But White's queen's side pieces other than the king have to remain rooted to maintain that. Your only comment on the final position (shown) is "1-0", but how does White go about winning from there?

Here you go, all roads lead to mate:

Sorry to be pedantic, but you've shown only that four roads of a very large number lead to mate.

I would guess that all do (notwithstanding that I managed to discover one that didn't at my first attempt). But that's different from proving that all do.

The SF mate declarations are not an infallible indication that there is a forced mate. SF will sometimes change its mind about that.

Of course that's a general problem with trying to find a solution to a position that's not tablebased using an engine. Even if the engine has genuinely found a forced mate it doesn't tell you what it is. It just gives you an incomplete selection of examples.

In case you were talking about earlier moves that might have been possible, I finally found the original PGN after looking for a long time on my old computer.

 

Let me know if this isn't convincing enough

MARattigan

As I said, I would guess all roads lead to mate. I believe I could do the same with colours reversed.

I still maintain that you can't use an engine to prove that the final position is won unless it's in an appropriate tablebase and the engine has access to the tablebase. After all, look at your title.

What's a clearly winning score?

17.91?

or -0.16?

And when you say, "BTW, those mates ARE correct, if you vary from the lines given (on the black side) the mate is shorter or equal", how did you arrive at that? You can't rely on the mate announcements or the mate lengths from SF. (The latter are not actually relevant, but are probably less reliable than the former.)

Playing the game out of course proves nothing. There the relative strength of the players is a significant factor.

It probably is practicable to show that the final position in your example is a white win, but I think the engines would be useful only in finding flaws in the human arguments that do that.

drdos7

This looks pretty convincing to me after the first 3 moves, and the score is still climbing. The reason you get -0.16 is because the engines don't understand the root position, but after you make a few moves they finally start to see the light.

MARattigan
drdos7 wrote:
MARattigan wrote:

As I said, I would guess all roads lead to mate. I believe I could do the same with colours reversed.

I still maintain that you can't use an engine to prove that the final position is won unless it's in an appropriate tablebase and the engine has access to the tablebase. After all, look at your title.

What's a clearly winning score?

17.91?

or -0.16?

And when you say, "BTW, those mates ARE correct, if you vary from the lines given (on the black side) the mate is shorter or equal", how did you arrive at that? You can't rely on the mate announcements or the mate lengths from SF. (The latter are not actually relevant, but are probably less reliable than the former.)

Playing the game out of course proves nothing. There the relative strength of the players is a significant factor.

It probably is practicable to show that the final position in your example is a white win, but I think the engines would be useful only in finding flaws in the human arguments that do that.

If that is your standard then nothing is winning until it reaches a tablebase position if I understand you correctly.

No, that's not correct.

I can give a human proof that the first position in #131 is a draw. It's not tablebased. The engine will only tell you White has a score of 17.91 (whatever that is supposed to mean).

This Otto Blathy position is winning, for example; also not tablebased. I can give you a human argument to prove that too, but SF will only tell you White has a score of 0.00.

When I was alive you didn't even have engines or tablebases and people still produced puzzles.

What you can say, I think, is that it's difficult to give a satisfactory proof of the validity of many examples taken from games. (But as I said, I think it may be practicable for the position in question; a human proof, that is.)

drdos7
MARattigan wrote:
drdos7 wrote:
MARattigan wrote:

As I said, I would guess all roads lead to mate. I believe I could do the same with colours reversed.

I still maintain that you can't use an engine to prove that the final position is won unless it's in an appropriate tablebase and the engine has access to the tablebase. After all, look at your title.

What's a clearly winning score?

17.91?

or -0.16?

And when you say, "BTW, those mates ARE correct, if you vary from the lines given (on the black side) the mate is shorter or equal", how did you arrive at that? You can't rely on the mate announcements or the mate lengths from SF. (The latter are not actually relevant, but are probably less reliable than the former.)

Playing the game out of course proves nothing. There the relative strength of the players is a significant factor.

It probably is practicable to show that the final position in your example is a white win, but I think the engines would be useful only in finding flaws in the human arguments that do that.

If that is your standard then nothing is winning until it reaches a tablebase position if I understand you correctly.

No, that's not correct.

I can give a human proof that the first position in #131 is a draw. It's not tablebased. The engine will only tell you White has a score of 17.91 (whatever that is supposed to mean).

This Otto Blathy position is winning, for example; also not tablebased. I can give you a human argument to prove that too, but SF will only tell you White has a score of 0.00.

When I was alive you didn't even have engines or tablebases and people still produced puzzles.

What you can say, I think, is that it's difficult to give a satisfactory proof of the validity of many examples taken from games. (But as I said, I think it may be practicable for the position in question; a human proof, that is.)

Well, it looks to me like you are still alive (grin), and I'm not young myself. I was around as an adult before engines and tablebases too, and people still do produce puzzles.

There are positions that engines don't understand, and that is the point of this thread. However, they eventually do understand the positions when you play them out to a certain point. I'll have to respectfully disagree with you about the mate announcements for Stockfish, because I let it think for a while to make sure it doesn't change its mind. It might not have the shortest mate, but if it announces a mate and holds it for several ply then there is definitely a mate there.

drdos7

ShashChess shows a win for white in the Otto Blathy after 27 seconds:

MARattigan
drdos7 wrote:

This looks pretty convincing to me after the first 3 moves, and the score is still climbing. The reason you get -0.16 is because the engines don't understand the root position, but after you make a few moves they finally start to see the light.

It ain't necessarily so.

Here is SF15.1 sans NNUE v Rybka/Nalimov, white mate in 60 within the 50 move rule. Its evaluation climbs steadily from 1.59 after move 1 to 11.26 after move 18, then drops to 0.00 after move 19. It's drawn by repetition on move 39. So it starts off promisingly but finally sees the dark.

 
MARattigan
drdos7 wrote:

ShashChess shows a win for white in the Otto Blathy after 27 seconds:

Interesting. I don't think I've ever had a program that could.

But we've had several other examples where it depends not only on the engine but also on the GUI and configuration.

It also depends on the allotted think time, but on a different thread several examples were shown where increasing the think time increased the blunder rate. I think that could be fairly prevalent in "difficult" positions.

drdos7
MARattigan wrote:
drdos7 wrote:

ShashChess shows a win for white in the Otto Blathy after 27 seconds:

Interesting. I don't think I've ever had a program that could.

But we've had several other examples where it depends not only on the engine but also on the GUI and configuration.

It also depends on the allotted think time, but on a different thread several examples were shown where increasing the think time increased the blunder rate. I think that could be fairly prevalent in "difficult" positions.

I'm playing your endgame position against the syzygy right now with the latest development version of Stockfish WITH NNUE and sans tablebases. So far it is playing all of the best moves.

MARattigan
drdos7 wrote:
...I'll have to respectfully disagree with you about the mate announcements for Stockfish, because I let it think for a while to make sure it doesn't change its mind. It might not have the shortest mate, but if it announces a mate and holds it for several ply then there is definitely a mate there.

I have to say I haven't seen SF announce mate and change its mind after more than two moves (I think). I wouldn't consider that as a proof that SF has definitely found a mate if the announcement continues to hold though.

I have seen cases where SF's evaluations actually deteriorate with increased think time. E.g. here. (User @cobra91 pointed out that I'd missed a couple of blunders under competition rules in both the 2 second and 32 second games, which would flatten the regression line somewhat, but it's still the case, for example, that SF's blunder rate at 37 minutes think time per move was about five times its blunder rate at 1 second think time per move.)

I think it's just invalid to take engine evaluations sans tablebase as proof of the theoretical outcome of a position. They're not designed for that.
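
For anyone who wants to check how long an engine actually holds a mate announcement, here is a minimal Python sketch. It assumes you have captured the engine's UCI `info` lines as plain text; in the UCI protocol the score field is either `score cp <n>` (centipawns) or `score mate <n>` (moves to mate). The helper names `parse_info` and `mate_held` and the sample lines are made up for illustration, not taken from any real engine log. A held announcement only tells you the engine has not changed its mind yet; it is not a proof of a forced mate.

```python
import re

# Score and depth fields of a UCI "info" line.
SCORE_RE = re.compile(r"\bscore (cp|mate) (-?\d+)")
DEPTH_RE = re.compile(r"\bdepth (\d+)")  # \b keeps this from matching "seldepth"


def parse_info(line):
    """Return (depth, kind, value) from a UCI info line, or None if it has no score."""
    score = SCORE_RE.search(line)
    depth = DEPTH_RE.search(line)
    if not score or not depth:
        return None
    return int(depth.group(1)), score.group(1), int(score.group(2))


def mate_held(lines, min_depths=3):
    """True if the last `min_depths` scored lines all announce mate for the same side."""
    parsed = [p for p in map(parse_info, lines) if p]
    tail = parsed[-min_depths:]
    if len(tail) < min_depths:
        return False
    kinds = {kind for _, kind, _ in tail}
    signs = {value > 0 for _, _, value in tail}
    return kinds == {"mate"} and len(signs) == 1


# Hypothetical engine output, for illustration only.
sample = [
    "info depth 30 seldepth 41 score cp 1126 nodes 1000 pv e2e4",
    "info depth 31 seldepth 50 score mate 48 nodes 2000 pv e2e4",
    "info depth 32 seldepth 52 score mate 48 nodes 3000 pv e2e4",
    "info depth 33 seldepth 52 score mate 47 nodes 4000 pv e2e4",
]

print(mate_held(sample))      # last three lines are all mate scores for the same side
print(mate_held(sample[:3]))  # tail still includes a centipawn score
```

The threshold `min_depths` is arbitrary. Raising it makes the check stricter, but no value turns it into a proof.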

MARattigan
drdos7 wrote:
MARattigan wrote:
drdos7 wrote:

ShashChess shows a win for white in the Otto Blathy after 27 seconds:

Interesting. I don't think I've ever had a program that could.

But we've had several other examples where it depends not only on the engine but also on the GUI and configuration.

It also depends on the allotted think time, but on a different thread several examples were shown where increasing the think time increased the blunder rate. I think that could be fairly prevalent in "difficult" positions.

I'm playing your endgame position against the syzygy right now with the latest development version of Stockfish WITH NNUE and sans tablebases. So far it is playing all of the best moves.

I just noticed my example was also SF15 with NNUE not SF15.1 without.

It marked time on move 3 but otherwise was perfectly accurate up to move 22 when it started to get flaky (even though its evaluation collapsed on move 19).

drdos7

The latest development version of Stockfish did solve the mate in 60 in the endgame position you posted without tablebases. It played the correct moves, and then around move 14 or 15 it announced a mate in 48, and as the game moved along it corrected the mate count to align with a mate in 60 from the root position:

MARattigan

Which raises an interesting point.

Your version of SF15 played the ending perfectly accurately. The Syzygy tablebase didn't.

The only accurate move 47 for Black is 47...Qd6+ instead of which the Syzygy table plays 47...Qa2 slipping a move. The Syzygy table is not designed to play accurately, only perfectly.

So how come the game still finished on move 60? That's because I got the mate depth wrong. I should have said mate in 61.

At any rate the engines obviously get better, though not uniformly. I had a version of Rybka about 20 years ago that could play the KNNvKP positions I posted, on old kit, perfectly, if not accurately, while my relatively recent versions of SF failed. Also my version of SF12 plays that particular endgame worse than SF8 (and particularly SF11).

But the fact that they do get better means that even the best is very unlikely to be perfect (hence the topic).

Which is why they can't be used as proof of a position's theoretical outcome. The position just played is, after all, a very simple position in the grand scheme.

drdos7
MARattigan wrote:

Which raises an interesting point.

Your version of SF15 played the ending perfectly accurately. The Syzygy tablebase didn't.

The only accurate move 47 for Black is 47...Qd6+ instead of which the Syzygy table plays 47...Qa2 slipping a move. The Syzygy table is not designed to play accurately, only perfectly.

So how come the game still finished on move 60? That's because I got the mate depth wrong. I should have said mate in 61.

At any rate the engines obviously get better, though not uniformly. I had a version of Rybka about 20 years ago that could play the KNNvKP positions I posted, on old kit, perfectly, if not accurately, while my relatively recent versions of SF failed. Also my version of SF12 plays that particular endgame worse than SF8 (and particularly SF11).

But the fact that they do get better means that even the best is very unlikely to be perfect (hence the topic).

Which is why they can't be used as proof of a position's theoretical outcome. The position just played is, after all, a very simple position in the grand scheme.

Actually, I plugged the position into the Syzygy tablebases after White's 47th move, and the mistake here was mine: I assumed the first move in the list of choices was the best move. Here is a screenshot:

MARattigan

Yes, you can follow the highest DTM values for Black, they're taken from Nalimov tables designed to play accurately under basic rules, but not necessarily perfectly under competition rules. The DTZ values will play perfectly under both (so long as there are no repeated positions considered the same under the repetition rules if competition rules apply) but not necessarily accurately under either.

But a lot of the DTM values are missing since the Lomonosov tables got sabotaged.
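
The difference between playing "perfectly" (keeping the theoretical result) and "accurately" (also optimising the distance to mate) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the move names are borrowed from the game above, but the distance-to-mate numbers are invented for illustration and are not real tablebase output, and the sketch ignores the 50 move rule and repetition entirely.

```python
# Hypothetical replies for the losing side: move -> (result, distance to mate).
# The numbers are illustrative only, not real tablebase values.
replies = {
    "Qd6+": ("loss", 14),
    "Qa2": ("loss", 13),
    "Qb1": ("loss", 9),
}

RANK = {"win": 2, "draw": 1, "loss": 0}


def perfect_moves(replies):
    """Moves that keep the best available theoretical result."""
    best = max(RANK[result] for result, _ in replies.values())
    return sorted(m for m, (result, _) in replies.items() if RANK[result] == best)


def accurate_moves(replies):
    """Among perfect moves, those with optimal distance to mate:
    longest when losing, shortest when winning."""
    keep = perfect_moves(replies)
    result = replies[keep[0]][0]
    if result == "draw":
        return keep
    dtms = [replies[m][1] for m in keep]
    target = max(dtms) if result == "loss" else min(dtms)
    return [m for m in keep if replies[m][1] == target]


print(perfect_moves(replies))   # every move loses, so every move is "perfect"
print(accurate_moves(replies))  # only the longest defence is "accurate"
```

In this toy position every reply is perfect because Black is lost whatever he plays, but only one reply is accurate. A table that stores only the result (or DTZ) has no way to prefer it.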

MARattigan
drdos7 wrote:

ShashChess shows a win for white in the Otto Blathy after 27 seconds:

It doesn't show a win. That would be a mate value (and, if it's like SF, not necessarily even then).

A score of +200 means it thinks there's a win.

And unless I came up with the wrong solution it should be considering 1...Ke1, not 1...Kg1.