Positions engines get wrong ( please contribute ) - Chess Forums - Page 3

watcha · 2013-11-17T22:32:16-08:00

I'm interested in positions whose evaluation is clear to any human observer, yet which one or more leading chess engines (the likes of Houdini, Stockfish, Rybka) get completely wrong, being blind to the obvious. Have you ever encountered such a position? If you have, please submit it. The posi

watcha

Nov 25, 2013

#41

Xilmi wrote:

As I said before: A stable positive evaluation by an engine does not mean that the position is winning. It means "this is most likely drawn by 50 move-rule". Houdini 3 even has an option in its settings that makes the engine believe the 50-move-rule is, for example, a 10-move-rule. If you use this option to analyze the positions in question it will label all of them as 0.00 relatively quickly.

Games like the Shirov-Game where the engine dismisses the winning move, on the other hand, are a nice example of the circumstance that engines still are very far away from perfect play. In this case their pruning makes them overlook the winning move. But let's be honest: How many human players would find such a move without massive in-depth analysis?

You are making a good point: in realistic positions it is very unlikely that a side is winning without making captures and pawn moves with a frequency significantly more than [ 1 capture or pawn move ] / 50 moves. So high value positions which zero out under the '10 move rule' must be suspicious.

But consider the following position:

This position is a win in 30 which makes it a forced win under the 50 move rule.

Houdini evaluates this position at a steady ca. +6, which is a winning evaluation:

Under the '10 move rule' Houdini zeros out:

LoekBergman

Nov 26, 2013

#42

In the thread about the match between Anand and Carlsen, fabelhaft posted this combination by Magnus Carlsen:

http://www.chessgames.com/perl/chessgame?gid=1268980

He said that even Houdini did not see the final combination.

The_Cosmologist

Nov 26, 2013

#43

"When the computer sees forced moves, it plays like God"-Garry Kasparov

Xilmi

Nov 26, 2013

#44

@watcha: Reducing the 50 move-rule in the config of Houdini is of course not a good idea. Houdini should be able to find the mate in 30 when there are only 4 pieces left, as this should limit the search tree enough to look deep enough. The option doesn't add anything to its playing strength anyway. It simply makes it call a drawn position drawn quicker, at the expense of more possible "false positives" in that regard.

The forced mate in 261 is particularly funny. Only tablebases will "know" that... and engines that use tablebases. But then again it doesn't work anyways because of the 50 move-rule.
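The move-rule argument in this exchange can be sketched in a few lines of Python. This is only a simplified illustration, not how Houdini or a tablebase handles the rule: the function names are made up for this sketch, moves are counted as full moves, and the example lines assume that no capture or pawn move occurs on the way to mate.

```python
def longest_quiet_run(zeroing_flags):
    """Longest run of consecutive moves without a capture or pawn move.

    zeroing_flags holds one boolean per full move of the winning line:
    True if a capture or pawn move happened on that move (counter resets).
    """
    longest = current = 0
    for resets in zeroing_flags:
        current = 0 if resets else current + 1
        longest = max(longest, current)
    return longest


def win_survives_rule(zeroing_flags, move_limit=50):
    """True if the line never exceeds move_limit quiet moves in a row.

    Simplification: a run of exactly move_limit is treated as still fine;
    exact half-move boundary cases are ignored here.
    """
    return longest_quiet_run(zeroing_flags) <= move_limit


# Assumption: a mate in 30 with no capture or pawn move along the way.
mate_in_30 = [False] * 30
print(win_survives_rule(mate_in_30, move_limit=50))  # True
print(win_survives_rule(mate_in_30, move_limit=10))  # False

# Assumption: a hypothetical 261-move line without any resetting move.
print(win_survives_rule([False] * 261, move_limit=50))  # False
```

Under these assumptions the mate in 30 holds up under the real 50-move rule but not under a shortened 10-move rule, which matches the evaluations dropping to 0.00 when the option is reduced.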

madhacker

Nov 26, 2013

#45

Shirov's Bh3 is an excellent move, but I don't understand why Topalov took the bait. Would he not be still in the game after simply g3 in response?

ThrillerFan

Nov 26, 2013

#46

I see a whole array of openings and endgames that no computer seems to get right.

Computers think the King's Indian, Grunfeld, and Benoni are all at least three quarters of a pawn better for White by move 8 or so.

Endings are always a joke. R+N vs R, no pawns, they'll say is +3. Another good example of one that computers always think is winning is White has a Queen, Black has a Rook and Pawn. The pawn protects the Rook. The King is guarding the pawn. White's King is on the wrong side of the Rook (for example, if the Black Rook is on d5, White's King is on the 1st thru 4th ranks, not the 6th thru 8th ranks). Computers always think White's winning. Any human ought to know that KQ vs KRP with the K, R, and P all united and the WK on the wrong side of the Rook is an easy draw as the Black Rook merely toggles on the two squares that the pawn protects except when White checks with the Queen.

IM chesskingdreamer

Nov 26, 2013

#47

madhacker wrote:

Shirov's Bh3 is an excellent move, but I don't understand why Topalov took the bait. Would he not be still in the game after simply g3 in response?

The idea was to get a tempo to play Kf5....e4 so the white king must defend g2 or play g3. If Bb1, for example, then Kf2 to Ke3 defends in to

TheGreatOogieBoogie

Nov 27, 2013

#48

A computer said this was a draw. After annotating a certain game I flipped on the engine and was kind of confused when it said draw when I calculated a win for white:

The_Cosmologist

Dec 8, 2013

#49

Until now we have been looking for positions; now I have a game.

I found this game from this thread: http://www.chess.com/forum/view/general/houdini-4-released-but-the-arb-chess-system-still-crushes-it

The person claims to beat today's strongest chess engines with a specific opening (or rather gameplay).

In this game he claims to have beaten Houdini 4 (a new release which is a lot stronger than Houdini 3), but I doubt it, as this game was played 4 months ago while it has been just a few weeks since Houdini 4 was released.

I haven't analysed this game with any engine but it would be interesting to know the result if someone analyses it.

Otomun

Dec 8, 2013

#50

TheGreatOogieBoogie, what engine are you using?

Toadofsky

Dec 8, 2013

#51

watcha wrote:

freaky25 wrote:

6k1/5n2/8/8/8/5n2/1RK5/1N6 w - - 0 1

White to move, mate in 262. I doubt any engine can see that deep.

All engines find the first move 1. Kd3 but none the second 2. Kd4. They play 2. Nc3 or 2. Nd2.

Wow, that's an interesting endgame. It's refreshing to see your post with evidence that engines pick the wrong moves (not just misevaluate the position but also throw away the win).
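For anyone who wants to check the material in the FEN quoted above, here is a minimal sketch that expands the piece-placement field and counts the pieces. It uses only the standard library; the helper name `parse_fen_placement` is made up for this example, and it does no legality checking beyond the board dimensions.

```python
from collections import Counter


def parse_fen_placement(fen):
    """Expand the first FEN field into 8 ranks of 8 squares ('.' = empty)."""
    placement, side_to_move = fen.split()[:2]
    board = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # a digit means that many empty squares
            else:
                row.append(ch)             # a letter is a piece (uppercase = White)
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        board.append(row)
    if len(board) != 8:
        raise ValueError("expected 8 ranks")
    return board, side_to_move


fen = "6k1/5n2/8/8/8/5n2/1RK5/1N6 w - - 0 1"
board, side = parse_fen_placement(fen)
pieces = Counter(sq for row in board for sq in row if sq != ".")

for row in board:          # rank 8 is printed first, as in the FEN
    print(" ".join(row))
print("side to move:", side)
print("material:", dict(pieces))
```

The count comes out as king, rook and knight for White against king and two knights for Black, six pieces in total with White to move.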

watcha

Dec 9, 2013

#52

The_Cosmologist wrote:

Until now we have been looking for positions; now I have a game.

I found this game from this thread: http://www.chess.com/forum/view/general/houdini-4-released-but-the-arb-chess-system-still-crushes-it

The person claims to beat today's strongest chess engines with a specific opening (or rather gameplay).

In this game he claims to have beaten Houdini 4 (a new release which is a lot stronger than Houdini 3), but I doubt it, as this game was played 4 months ago while it has been just a few weeks since Houdini 4 was released.

I haven't analysed this game with any engine but it would be interesting to know the result if someone analyses it.

This game has been analyzed at length in the thread http://www.chess.com/forum/view/general/houdini-chess-program-crushed---using-the-arb-chess-system.

I have checked this game and it seems authentic. Given the time control ( 1 minute / 1 move ) it is very plausible that Houdini could have missed the winning (sacrifice) move on move 19. In single PV mode I have tested this move several times starting Houdini with empty hash and on many occasions finding the correct move took significantly longer than 1 minute if it was found at all. In multi PV mode the correct move was found much quicker.

Finding sacrificial moves is not easy for engines (there are many examples of this on these forums). In theory you can always raise the multi PV limit to the point where all legal moves are analyzed, in which case the merits of the sacrifice may be found quickly. However, this is impractical since it dramatically decreases the overall strength of the engine.
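The single PV versus multi PV comparison can be reproduced by sending a UCI engine the same position with different `MultiPV` settings. The sketch below only builds the command strings; it does not start an engine. The function name is made up, the start position stands in for the actual move-19 position (which is not given here), and using `ucinewgame` to get a fresh hash is an assumption about engine behaviour rather than a guarantee.

```python
def uci_analysis_commands(fen, multipv=1, movetime_ms=60_000):
    """Build the UCI commands for one fixed-time analysis run.

    multipv=1 is single PV mode; higher values ask the engine to report
    that many candidate lines. A real session would also wait for
    'uciok' / 'readyok' replies, which this sketch leaves out.
    """
    return [
        "uci",
        f"setoption name MultiPV value {multipv}",
        "ucinewgame",  # assumption: the engine resets its hash on a new game
        f"position fen {fen}",
        f"go movetime {movetime_ms}",
    ]


# Placeholder position (the standard start position), NOT the game position.
placeholder_fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

for multipv in (1, 3):
    print(f"--- MultiPV = {multipv} ---")
    for command in uci_analysis_commands(placeholder_fen, multipv=multipv):
        print(command)
```

The 60000 ms default mirrors the 1 minute per move time control mentioned above.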

TheGreatOogieBoogie

Dec 9, 2013

#53

Jeez I can never get enough of posting this position:

Maybe sacrificing some king safety for chances to get rid of the dark-squared bishop (both his most important defender and attacker) and restricting c5 wasn't worth it? If I had played h3-Bh5-g4-Bg6-Bxg6-fxg6 I'd have had a clear advantage, but I got tunnel vision about restricting c5.

watcha

Dec 9, 2013

#54

TheGreatOogieBoogie wrote:

Jeez I can never get enough of posting this position:

Maybe sacrificing some king safety for chances to get rid of the dark-squared bishop (both his most important defender and attacker) and restricting c5 wasn't worth it? If I had played h3-Bh5-g4-Bg6-Bxg6-fxg6 I'd have had a clear advantage, but I got tunnel vision about restricting c5.

Both Stockfish (+0.5) and Houdini (+0.35) see this position slightly better for white ( assuming that it is white's turn - which is not indicated in the diagram ). In what way is this position considered an engine failure?

TheGreatOogieBoogie

Dec 9, 2013

#55

watcha wrote:

TheGreatOogieBoogie wrote:

Jeez I can never get enough of posting this position:

Maybe sacrificing some king safety for chances to get rid of the dark-squared bishop (both his most important defender and attacker) and restricting c5 wasn't worth it? If I had played h3-Bh5-g4-Bg6-Bxg6-fxg6 I'd have had a clear advantage, but I got tunnel vision about restricting c5.

Both Stockfish (+0.5) and Houdini (+0.35) see this position slightly better for white ( assuming that it is white's turn - which is not indicated in the diagram ). In what way is this position considered an engine failure?

Maybe we have different processors or different versions? I set Stockfish's space value to 95 (space is overrated, since practically cramped positions contain latent resources; I say "practically" because players are looking for their best move, so they wouldn't accept a cramped position without compensation or a plan to punch out), pawn structure (middlegame) to 95, and activity and bishop pair to 105, so maybe that swayed the eval?

The_Cosmologist

Dec 14, 2013

#56

I've got a game which shows how differently programmed chess engines are.

Houdini 3 found after 37 seconds that Nc7 was the best move. After 45 secs. it found that it was actually the winning move.

Rybka 3 (not 4) gave Nc7 top priority after 18 secs. and realised after 1:44 that it was actually the winning one.

Stockfish 4 64 bit SSE4.2 was out of luck: even after 13 minutes it couldn't see that Nc7 was the best. I stopped it after that.

I hope someone can make Stockfish find the truth, even if it takes half an hour (or more). If it isn't able to find the winning move then it would be a real failure for the Stockfish programmers.

NOTE: For all the tests I used the single PV mode ONLY.

watcha

Dec 14, 2013

#57

The_Cosmologist wrote:

I've got a game which shows how differently programmed chess engines are.

Houdini 3 found after 37 seconds that Nc7 was the best move. After 45 secs. it found that it was actually the winning move.

Rybka 3 (not 4) gave Nc7 top priority after 18 secs. and realised after 1:44 that it was actually the winning one.

Stockfish 4 64 bit SSE4.2 was out of luck: even after 13 minutes it couldn't see that Nc7 was the best. I stopped it after that.

I hope someone can make Stockfish find the truth, even if it takes half an hour (or more). If it isn't able to find the winning move then it would be a real failure for the Stockfish programmers.

NOTE: For all the tests I used the single PV mode ONLY.

For me Houdini found 24. ... Nc7 in 57 secs and immediately rated it as a winning move. Rybka found it in 17 secs and rated it as winning after 1 min 6 secs. I stopped analyzing with Stockfish after 3 mins 53 secs without it finding the move. These were all in single PV mode. So basically I can confirm your results.

I'm always surprised by what positions the big Rybka 4 book contains. I once ran into a mate puzzle in one of the threads whose solution was a recommended move in the Rybka 4 book. Believe it or not, your position is also represented in the Rybka 4 book, with 24. ... Nc7 as a green (recommended) move:

The_Cosmologist

Dec 15, 2013

#58

Are you saying that the Rybka 4 opening book contains the information?

watcha

Dec 15, 2013

#59

The_Cosmologist wrote:

Are you saying that Rybka 4 opening book contains the information?

Xilmi

Dec 15, 2013

#60

Stockfish is known to prune a lot more aggressively in order to go much deeper on what it thinks are good continuations.

The downside of this is, as we've just seen here, that it often won't find winning sacrifices.
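A toy example shows how that trade-off can hide a sacrifice. To be clear, this is not Stockfish's actual pruning scheme: it is a deliberately crude forward-pruning rule on a hand-made tree, and the tree, the evaluations and the `margin` parameter are all invented for illustration.

```python
def search(node, maximizing, margin=None):
    """Plain minimax; if margin is set, skip moves whose immediate static
    evaluation is more than `margin` worse than the best-looking sibling."""
    children = node.get("children")
    if not children:
        return node["static"], None
    moves = list(children.items())
    if margin is not None:
        statics = [child["static"] for _, child in moves]
        best_static = max(statics) if maximizing else min(statics)
        moves = [(m, c) for m, c in moves
                 if abs(c["static"] - best_static) <= margin]
    best_value, best_move = None, None
    for move, child in moves:
        value, _ = search(child, not maximizing, margin)
        better = (best_value is None
                  or (value > best_value if maximizing else value < best_value))
        if better:
            best_value, best_move = value, move
    return best_value, best_move


# Invented tree: the sacrifice looks bad at once (-3.0) but wins by force.
tree = {"static": 0.0, "children": {
    "quiet move": {"static": 0.3, "children": {
        "reply a": {"static": 0.3},
        "reply b": {"static": 0.2},
    }},
    "sacrifice": {"static": -3.0, "children": {
        "accept": {"static": 10.0},
        "decline": {"static": 5.0},
    }},
}}

print(search(tree, maximizing=True))              # (5.0, 'sacrifice')
print(search(tree, maximizing=True, margin=1.0))  # (0.2, 'quiet move')
```

The full search picks the sacrifice, while the pruned search never looks at it because its immediate evaluation falls outside the margin. The pruned search does less work per level, which is what lets a real engine spend the saved effort on going deeper elsewhere.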