Studies to Troll your engines with

drdos7
MARattigan wrote:
drdos7 wrote:
MARattigan wrote:
drdos7 wrote:

ShashChess shows a win for white in the Otto Blathy after 27 seconds:

Interesting. I don't think I've ever had a program that could.

But we've had several other examples where it depends not only on the engine but also on the GUI and configuration.

It also depends on the allotted think time, but on a different thread several examples were shown where increasing the think time increased the blunder rate. I think that could be fairly prevalent in "difficult" positions.

I'm playing your endgame position against the Syzygy tablebases right now with the latest development version of Stockfish WITH NNUE and sans tablebases. So far it is playing all of the best moves.

I just noticed my example was also SF15 with NNUE not SF15.1 without.

It marked time on move 3 but otherwise was perfectly accurate up to move 22 when it started to get flaky (even though its evaluation collapsed on move 19).

I used Stockfish 15 NNUE without tablebases on the Otto Blathy puzzle you posted, and Stockfish announced a mate in 50 after 8 minutes and 16 seconds and held onto it until I terminated the search at the 9 minute and 20 second mark.

MARattigan

@drdos7

I'm surprised. I thought that position was a certainty for your "Mates that are difficult for engines" thread.

My last post was wrong on two counts.

Firstly, I just tried it with my standard version of 15.1 without NNUE and it also announced mate (after a fairly long wait, oscillating at values just over 50). So I probably did have an engine that will play it, though I haven't tried it yet.

Secondly, 1...Kg1 is just as obstinate as 1...Ke1. I'd somehow assumed it wasn't because the bishop disappears immediately, but they both reach the same position(s) where the cycle that forces the pawns forward starts on move 7.

So SF15 may have solved the position depending on whether it's telling the truth or not.

Edit: Make that 3 counts. 1...Kg1 is actually better than 1...Ke1.

MARattigan

@drdos7

Tried it.

Not as accurate as yours, but mate.

Who was Unknown player?

drdos7
MARattigan wrote:

@drdos7

Tried it.

Not as accurate as yours, but mate.

Who was Unknown player?

There wasn't another player in mine; I let it run in infinite mode (some call it analysis mode) until it announced mate. My computer was running 20 cores with an 8GB hash table.
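For anyone who wants to try a run like this outside a GUI, here is a rough sketch of the text commands a UCI engine would be sent for an open-ended analysis. `Threads` and `Hash` are the option names Stockfish uses (Hash is given in MB); other engines may name them differently. The helper name `uci_analysis_commands` and the placeholder FEN are made up for illustration, and this is not necessarily how the GUI used above sets things up.

```python
from typing import List


def uci_analysis_commands(fen: str, threads: int, hash_mb: int) -> List[str]:
    """Build the UCI commands for an infinite ("analysis mode") search.

    Hypothetical helper: it only assembles the command strings. Actually
    piping them to an engine process is left to the reader.
    """
    return [
        "uci",                                      # handshake
        f"setoption name Threads value {threads}",  # number of search threads
        f"setoption name Hash value {hash_mb}",     # hash table size in MB
        "isready",                                  # wait for options to apply
        "ucinewgame",
        f"position fen {fen}",                      # position to analyse
        "go infinite",                              # search until "stop" is sent
    ]


# Placeholder position (the normal starting position), NOT the study itself.
PLACEHOLDER_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# 20 threads and an 8GB (8192 MB) hash table, as described above.
for command in uci_analysis_commands(PLACEHOLDER_FEN, threads=20, hash_mb=8192):
    print(command)
```

The search keeps running until the engine is sent `stop`, which is why a mate announcement can appear (or change) minutes into the run.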

MARattigan

Wow, is that a Lamborghini? Mine has 4 and if I run two chess engines with 4GB hash at once the system dies.

drdos7
MARattigan wrote:

Wow, is that a Lamborghini? Mine has 4 and if I run two chess engines with 4GB hash at once the system dies.

Actually by today's standards while it's not slow, it's not really that fast. I bought this workstation back in 2016 (it was made in 2015) used for about $2200 (just the tower without any hard drives). Here is the passmark for it I ran back in 2019:

Today you can get a faster computer for under $1000 U.S.

drdos7

Position #17

Here is a nice endgame study that engines seem to have a lot of trouble with:

The engine troll factor on this one is about 9.5, meaning that perhaps 1 or maybe 2 engines out of several hundred might solve this one.

Arisktotle

The last one is an excellent endgame study indeed! It is very technical which makes it harder for experienced study solvers who are accustomed to making some spectacular choices. Pleasing to see that engines struggle on it as well - though the solution is not that long! Btw, the 1. ... d3 variation would never be included in a study solution as it is too boring to be even called "technical" and has countless duals. Engines reflect that sentiment by assigning it +3 scores right from the start!

drdos7
Arisktotle wrote:

The last one is an excellent endgame study indeed! It is very technical which makes it harder for experienced study solvers who are accustomed to making some spectacular choices. Pleasing to see that engines struggle on it as well - though the solution is not that long! Btw, the 1. ... d3 variation would never be included in a study solution as it is too boring to be even called "technical" and has countless duals. Engines reflect that sentiment by assigning it +3 scores right from the start!

I realize that about the 1...d3 variation, however some of my engines like to play that move after 1.Nf5!!, so I had to include it to demonstrate what would happen if that was played because this is about "trolling engines". Almost ALL of the engines I use see more than +3 scores after 1.Nf5!! is played.

Arisktotle
drdos7 wrote:

I realize that about the 1...d3 variation, however some of my engines like to play that move after 1.Nf5!!, so I had to include it to demonstrate what would happen if that was played because this is about "trolling engines". Almost ALL of the engines I use see more than +3 scores after 1.Nf5!! is played.

If that is true then you can't say that the engine is trolled. The +3 score is typical for an ending like K+B+N vs K and the engine does an excellent job of the analysis when it comes up with that (winning) score! ;)

drdos7
Arisktotle wrote:
drdos7 wrote:

I realize that about the 1...d3 variation, however some of my engines like to play that move after 1.Nf5!!, so I had to include it to demonstrate what would happen if that was played because this is about "trolling engines". Almost ALL of the engines I use see more than +3 scores after 1.Nf5!! is played.

If that is true then you can't say that the engine is trolled. The +3 score is typical for an ending like K+B+N vs K and the engine does an excellent job of the analysis when it comes up with that (winning) score!

I think you already knew that the engine is trolled because it can't find 1.Nf5!!, I just think you are trying to "troll" me ;).

Arisktotle
drdos7 wrote:

I think you already knew that the engine is trolled because it can't find 1.Nf5!!, I just think you are trying to "troll" me.

In fact, I didn't know! The difference of 1 move does not seem that significant to me on the subject of trolling. But weirdly enough SF16 appears to have changed behaviour compared to a few days ago. Suddenly the +3 scores don't come up that quickly anymore. I guess they disabled the quantum support processor ;)

Arisktotle
DesperateKingWalk wrote:

I think you guys need to install TABLEBASES.....

Thanks for the analysis! I only have the software available offered to me by chess.com. I don't know if it switches to tablebases when the position reduces to 7 men. However, I am sure that drdos7 does not depend on chess.com. He has everything installed and configured locally!

SF and Komodo appear to miss the main candidate for the defense, which is 1. ... Ng5+, and choose the boring response 1. ... d3 instead. This shows that these general purpose engines are incapable of recognizing the presence of a composed endgame study with different priorities. Kudos to lc0 for getting it right!

EndgameEnthusiast2357

Stockfish solves this one instantly, but this site's browser-based version says it's a draw:

drdos7
DesperateKingWalk wrote:

The current Stockfish solved the position also,

Line 0.0
3kB3/5K2/7p/3p4/3pn3/4NN2/8/1b4B1 w - - 0 1

Analysis by Stockfish dev-20230729-65ece7d9:

1.Nf5 d3 2.Bb6+ Kc8 3.Ba5 Kb7 4.Ke6 h5 5.Ke5 Bc2 6.Bb4 Bd1 7.N3d4 d2 8.Bc6+ Kb6 9.Bxd5 Ba4 10.Ne3 d1Q 11.Nxd1 Bxd1 12.Kxe4 Ba4 13.Ke3 Bd7 14.Be7 Ka5
 Depth: 40/27 00:00:40 278MN, tb=4092597
 White is winning.

(08.08.2023)

Hi Mark, glad to have you back in one of my threads, and indeed you are correct. I enabled the tablebases on the latest development version of Stockfish, and it found the correct FIRST move after 4 minutes 43 seconds and the correct continuation after about an 8 minute search on my computer. Keep in mind, though, that most of the people here on chess.com don't have locally installed versions of the engines we have, nor do they have anywhere near the hardware we have (even though our hardware would hardly be considered top end by today's standards). The fact is that most people here don't know how to use the engines to get the most benefit out of them.
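On the tablebase side, a quick way to see whether a position is itself inside the tables is to count the men in the FEN. This is a minimal sketch that assumes 7-man Syzygy tables (the largest set generally available); `count_men` and `SYZYGY_MAX_MEN` are names made up for the example.

```python
SYZYGY_MAX_MEN = 7  # assumption: a full 7-man Syzygy set is installed


def count_men(fen: str) -> int:
    """Count all pieces and pawns (kings included) in a FEN string."""
    placement = fen.split()[0]  # first field is the piece placement
    # Letters are pieces; digits are runs of empty squares; '/' separates ranks.
    return sum(1 for ch in placement if ch.isalpha())


# The position from the analysis quoted above.
fen = "3kB3/5K2/7p/3p4/3pn3/4NN2/8/1b4B1 w - - 0 1"
men = count_men(fen)
print(men, men <= SYZYGY_MAX_MEN)  # prints: 11 False
```

With 11 men on the board the starting position is well outside the tables, so the engine can only probe them deep in the search, after enough captures have taken place.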

Have you tried any of the other positions in this thread?

drdos7
AmishQuilt wrote:

Get off the g d engine

HEY! You can't say that! >:)

Arisktotle
EndgameEnthusiast2357 wrote:

Stockfish solves this one instantly, but this site's browser-based version says it's a draw:

Yes, it appears to say that but something is awfully misprogrammed here - either in chess.com or in SF or in both. Just play a bit with the number of display lines, e.g. change it to 5. Alternatively, you can play 1.h8=N, which changes nothing by itself, but if you then also enter 1. ... g3 (the only legal move!!!!) it will give you the mate straight away. Other things might help as well, like changing the engine to Komodo and then back, or setting the number of lines to 1 (but not straight away).

The point here is that - as I said - some moron programmers are involved and what they did follows no logic. What effectively happened is that the engine is frozen into some initial lines - all ending in stalemate - and refuses to continue with other move options (1. h8=N is not in the initial 3), so you have to give it a 5000V jolt! Btw, I suspect SF already found the mate but refuses to include it in the display lines.

Don't believe anybody trying to sell you anything else regarding initial moves or lines or engine methods. It's all nonsense. These are immense programming errors and somebody should be shot for them! I know. I was the best tester in the house while I was still employed.

EndgameEnthusiast2357

Well it's certainly not a horizon issue, as it solves this forced mate in 17 or 19 moves after analyzing for a minute:

Arisktotle
EndgameEnthusiast2357 wrote:

Well it's certainly not a horizon issue, as it solves this forced mate in 17 or 19 moves after analyzing for a minute:

It's mad programmers, believe me! Or better, test it as I prescribed.

Btw, that applies to your diagram and probably many "similars" which appear simple. Not to the drdos7 diagrams which I haven't tested for this behaviour!

MARattigan
EndgameEnthusiast2357 wrote:

Well it's certainly not a horizon issue, as it solves this forced mate in 17 or 19 moves after analyzing for a minute:

SF will play any mate in KNNKP of depth <= 33-38 moves (depending on SF version) with the pawn on the 6th or 7th rank perfectly (if not necessarily accurately) with just a few seconds' think time. It can't do depth 39 with any reasonable think time.

It takes about 40 minutes on my system before it starts announcing mate in 38 (SF 15.1 onwards) but you don't seem to be able to rely on such announcements corresponding to a complete analysis.
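For anyone logging these announcements rather than watching a GUI, here is a small sketch of pulling the score out of a UCI `info` line. In the UCI protocol the engine reports either `score cp <centipawns>` or `score mate <moves>` (moves, not plies; negative if the engine is being mated). The function name `parse_score` and the sample lines are made up for illustration, not taken from a real engine log.

```python
import re
from typing import Optional, Tuple

# Matches the "score cp N" / "score mate N" part of a UCI info line.
_SCORE_RE = re.compile(r"\bscore (cp|mate) (-?\d+)")


def parse_score(info_line: str) -> Optional[Tuple[str, int]]:
    """Return ("cp", centipawns) or ("mate", moves), or None if no score."""
    match = _SCORE_RE.search(info_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Hypothetical sample lines in UCI format.
print(parse_score("info depth 60 seldepth 76 score mate 38 nodes 1000 pv h7h8n"))
print(parse_score("info depth 30 score cp -35 nodes 500 pv e2e4"))
print(parse_score("info string some engine message"))
```

As noted above, a `mate 38` in the output only records what the engine announced at that point in the search; it does not by itself show that the analysis behind it was complete.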