Positions engines get wrong ( please contribute )

chessdex

Here's a position. The computer says White is winning by a lot, but it's a draw.



watcha
Xilmi wrote:

Stockfish is known to be pruning a lot more aggressively in order to go much deeper on what it thinks are good continuations.

The downside of this is, as we've just seen here, that it often won't find winning sacrifices.

In an average chess position there are 35 legal moves. If you want to apply the minimax/negamax algorithm to the game tree of chess, you need the following time as a function of depth ( measured in plies ):

                                     Minimax/negamax analysis time ( secs )
Depth ( plies )              Nodes   Laptop 3500 k-Nodes/sec   Deep Blue 200 M-Nodes/sec
1                               35                         0                           0
2                            1 225                         0                           0
3                           42 875                         0                           0
4                        1 500 625                         0                           0
5                       52 521 875                        15                           0
6                    1 838 265 625                       525                           9
7                   64 339 296 875                    18 383                         322
8                2 251 875 390 625                   643 393                      11 259

 

On my laptop Houdini 3 runs at an analysis speed of 3500 kN/s. Deep Blue analyzed at 200 MN/s, which is a high speed even by today's standards: at TCEC, the site where top engines compete for the unofficial world championship title on high-end commercially available hardware, the typical analysis speed is around 50 MN/s.

It can be seen from the table that the realistic ( that is: which can be achieved in a couple of minutes ) depth that can be reached by minimax/negamax is 6 plies on a laptop and 7 plies on Deep Blue.
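The figures in the table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: like the table, it assumes a constant branching factor of 35 and counts only the 35^depth positions at the final ply. The constant and function names are made up for illustration.

```python
# Assumptions taken from the post: 35 legal moves per position,
# 3500 kN/s for the laptop and 200 MN/s for Deep Blue.
BRANCHING = 35
LAPTOP_NPS = 3_500_000
DEEP_BLUE_NPS = 200_000_000


def minimax_nodes(depth, branching=BRANCHING):
    """Positions at the final ply of a full-width search."""
    return branching ** depth


def search_seconds(depth, nodes_per_second, branching=BRANCHING):
    """Search time in whole seconds, rounded as in the table."""
    return round(minimax_nodes(depth, branching) / nodes_per_second)


if __name__ == "__main__":
    for depth in range(1, 9):
        print(
            depth,
            minimax_nodes(depth),
            search_seconds(depth, LAPTOP_NPS),
            search_seconds(depth, DEEP_BLUE_NPS),
        )
```

Each extra ply multiplies the time by 35, which is why the laptop goes from about 9 minutes at depth 6 to about 5 hours at depth 7.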

Minimax/negamax analyzes every single position which can arise in a given number of plies, so every position within the tree gets its correct value. However, this is not necessary, since you can apply a trick. If a move is refuted, minimax/negamax will still search on to find out whether it can be refuted even more. But you can stop the search of the refuted move immediately, without examining the remaining responses to it. In this way you can cut branches off the tree, and you can do the same thing at every level recursively. This algorithm is called alphabeta. With correct ordering of the moves ( using evaluations of shallow searches for ordering in deeper searches ), enough cuts can be made for alphabeta to reach twice the depth that minimax/negamax can reach. Since alphabeta does not lose relevant information, there is no risk of cutting a meaningful branch of the tree. Alphabeta will always find the same best move that minimax/negamax would have found.
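To make the cutoff idea concrete, here is a minimal sketch on a toy tree rather than on chess. The tree is a nested list with random integer scores at the leaves (seen from the side to move there); `make_tree`, `negamax` and `alphabeta` are illustrative names, and no move ordering is applied, so the savings are smaller than what a real engine gets.

```python
import random


def make_tree(depth, branching, rng):
    """Build a uniform toy tree; leaves are random integer scores."""
    if depth == 0:
        return rng.randint(-100, 100)
    return [make_tree(depth - 1, branching, rng) for _ in range(branching)]


def negamax(node, counter):
    """Full-width search: visits every node of the tree."""
    counter[0] += 1
    if isinstance(node, int):
        return node
    return max(-negamax(child, counter) for child in node)


def alphabeta(node, alpha, beta, counter):
    """Same search, but stops examining a move once it is refuted."""
    counter[0] += 1
    if isinstance(node, int):
        return node
    best = -float("inf")
    for child in node:
        score = -alphabeta(child, -beta, -alpha, counter)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # refuted: the remaining replies cannot change the result
    return best


rng = random.Random(1)
tree = make_tree(6, 5, rng)
full_count, pruned_count = [0], [0]
full_value = negamax(tree, full_count)
pruned_value = alphabeta(tree, -float("inf"), float("inf"), pruned_count)

if __name__ == "__main__":
    print("negamax  :", full_value, "nodes visited:", full_count[0])
    print("alphabeta:", pruned_value, "nodes visited:", pruned_count[0])
```

Both searches return the same value at the root; only the number of visited nodes differs, which is the sense in which alphabeta loses no relevant information.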

By applying alphabeta you can reach a search depth of 12 plies on a laptop and 14 plies on Deep Blue. This means that by today's standards a perfect search can only be performed to a depth of 12 - 14 plies, depending on the hardware. Any search deeper than this runs the risk of cutting meaningful branches off the tree. Since today's engines can reach a depth of at least 20 plies in a couple of minutes, it is evident that these searches cannot be perfect. Any move which yields visible results only beyond 12 - 14 plies can fall victim to pruning.
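The "twice the depth" figure can be checked against the known best case for alphabeta. Assuming perfect move ordering on a uniform tree (something real engines only approximate), the number of leaf positions examined at depth d with branching factor b is b^ceil(d/2) + b^floor(d/2) - 1. The sketch below plugs in the numbers from this post; the function name is illustrative.

```python
def alphabeta_best_case_leaves(depth, branching=35):
    """Leaf positions examined by alphabeta with perfect move ordering."""
    half_up = (depth + 1) // 2   # ceil(depth / 2)
    half_down = depth // 2       # floor(depth / 2)
    return branching ** half_up + branching ** half_down - 1


if __name__ == "__main__":
    # Speeds assumed in the post: 3500 kN/s (laptop), 200 MN/s (Deep Blue).
    for name, depth, nps in (("laptop", 12, 3_500_000),
                             ("Deep Blue", 14, 200_000_000)):
        leaves = alphabeta_best_case_leaves(depth)
        print(name, depth, leaves, round(leaves / nps), "secs")
```

Under these assumptions a 12-ply alphabeta search costs about twice as much as a 6-ply minimax search (roughly 1050 seconds on the laptop), and 14 plies on Deep Blue costs about twice the 7-ply figure (roughly 640 seconds), which is consistent with the depths quoted above.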

rooperi

I just found this study by Troitzky. White to play and draw. Black can never make progress; White always has a stalemate defense. Stockfish tries different queen maneuvers to avoid repetition, and evaluates the position as -5.72 right up to (almost) the end.



watcha

This position is of no particular theoretical importance. White is obviously winning here.

This is how Houdini views this position, it values it at +23 and wants to move the queen:

Now look at this position from a human point of view. It is clearly visible that the black queen is tied to defending against the mate. It is also clear that if the white queen or rook moves, the white king can be subject to checks. So a human would conclude that the safe way is to push the f-pawn slowly up the board: it is untouchable, and at some point it will be in a position to take part in the attack.

When Houdini is asked about this move it immediately switches its evaluation to a forced mate in 9:

AndrejPro
This position comes from the end of an interesting puzzle. Insert this setup into your favorite engine. The position is a dead draw, but engines will give black something around -14. White only needs to move the bishop along the a7-g1 diagonal and there is nothing black can do to win. 
wickiwacky

AndrejPro - I disagree - that position is a win for black. The Queen goes to a1 - e5 - e1 and then f2 where white is forced to accept the sac and thus loses.

chessfan999



wickiwacky
pfren wrote:
wickiwacky wrote:

AndrejPro - I disagree - that position is a win for black. The Queen goes to a1 - e5 - e1 and then f2 where white is forced to accept the sac and thus loses.

When this happens, white takes the queen, and after ...gxf2 he plays the only legal move, g2-g3+. Stalemate, regardless of whether Black captures or plays ...Kh3.

Crikey yes! Missed the stalemate at the end - the f2 pawn is just ready to queen and can't do it.

lofina_eidel_ismail
AndrejPro wrote:
 

truly interesting piece!

thx for sharing this

Toire
pfren wrote:

Here is a correspondence game I played (this means both players could use an engine). Black played 35...Rc6, which is the recommendation of all engines. An engine can calculate that white loses the pawn back if he exchanges rooks. What an engine cannot calculate within a reasonable timeframe is that the resulting king and pawn endgame (with equal pawns) is completely lost for Black (horizon effect), while an average human player will realize this after less than two minutes' thought.

 



So is there a move for Black which doesn't lose?

endomorphic

In this famous position, Stockfish recommends a3, while if d4 is played, black's much, much better, even according to Stockfish. I don't understand why it can't see that.

mkkuhner

In the famous game Short-Timman, White marches his king into Black's castled position and mates him in the middlegame.  When I try Stockfish on this game, it does not like the first two moves of the king march, and will in fact recommend moving the king *back* if you force it to play the first.   It's really interesting to see it claim to be looking 17+ moves ahead when it is missing a mate in 5.  After the second king move it suddenly sees the mate.

I haven't let it run for hours, admittedly, but I think it is looking further and further ahead, and has already ruled out the winning line so will probably never find it.

It is amazing that a human found this:  it's probably the most implausible combination I've ever seen.

endomorphic

@mkkuhner: That's a very nice one by Short. Indeed -- on move 34 Stockfish evaluates the position as equal and recommends pushing the pawn to c3, after which the evaluation doesn't change. Instead, Kf4 was played in the game, and Stockfish then evaluates the position as +8.4. A similar scenario to the Topalov vs Shirov game.

lifeonvenus
I once tried using two engines, playing one against the other, but one later blundered like this:
1. d4 e5
LeonelRitchie
Why castigate something that you know you don't stand a chance against, just because you see something it doesn't see? If it's really stupid, beat it at its strongest. Why am I saying this? Engines are not meant to compete against us, nor are they meant to think for us. They only try to help us see the consequences of our immediate thinking and what could be better, for the sole purpose of learning.
lifeonvenus

yeah, I am learning by evaluating what the 2 computers did, but who knew it made a blunder.

MayCaesar

I remember a Stockfish vs Komodo game someone posted on some other forum. There Komodo thought its position was better two moves before getting completely demolished - two moves later, its evaluation switched from ~+0.8 to -5 or something like that. The position also looked completely won for White: Black was tightly squeezed, his king was in huge danger, and there was no apparent reason for Komodo to think it was better. My guess is that Stockfish simply out-calculated Komodo, and Komodo only later saw the line it had initially dismissed or hadn't calculated far enough.

lcfb2003

Hello,

I have a question about all of these difficult positions: is there any pattern that could be identified? I am doing a doctoral course, and my thesis is about pattern recognition. As a general concept, we take a pattern to be: a subset of relationships (attacks/defenses, direct/indirect) between pieces, plus a set of initial conditions, plus a tactical move plan, plus a set of post conditions (e.g. Philidor's mate). I created a language to represent such patterns, which could be used in some way to retrieve better/worse values when machines are evaluating such positions.

LoekBergman

@lcfb2003: there is no particular pattern among those examples, but I think that there is a statistical relationship between them: they all occur very rarely.

Take for instance this pearl from Nezhmetdinov:

 

Who would ever think of this queen sacrifice at move 12? It takes a genius to find it.

 

lcfb2003

@LoekBergman Thanks. Another question: in the example you posted, in terms of the future scenario the player was aiming to reach, should it be explained in some way as a pattern or as a collection of features?