Positions engines get wrong ( please contribute )


  • 9 months ago · Quote · #41

    LoekBergman

    In the thread about the match between Anand and Carlsen, fabelhaft posted this combination by Magnus Carlsen:

    http://www.chessgames.com/perl/chessgame?gid=1268980

    He said that even Houdini did not see the final combination.

  • 9 months ago · Quote · #42

    Xilmi

    @watcha: Reducing the 50-move rule in Houdini's config is of course not a good idea. Houdini should be able to find the mate in 30 when there are only 4 pieces left, as this should limit the search tree enough for it to look deep enough. It doesn't give any benefit to its playing strength anyway. It simply makes it call a drawn position drawn quicker, at the expense of more possible "false positives" in that regard.

    The forced mate in 261 is particularly funny. Only tablebases will "know" that... and engines that use tablebases. But then again it doesn't work anyways because of the 50 move-rule.

  • 9 months ago · Quote · #43

    madhacker

    Shirov's Bh3 is an excellent move, but I don't understand why Topalov took the bait. Would he not still be in the game after simply g3 in response?

  • 9 months ago · Quote · #44

    ThrillerFan

    I see a whole array of openings and endgames that no computer seems to get right.

    Computers think the King's Indian, Grunfeld, and Benoni are all at least three quarters of a pawn better for White by move 8 or so.

    Endings are always a joke.  R+N vs R, no pawns, they'll say is +3.  Another good example of one that computers always think is winning is White has a Queen, Black has a Rook and Pawn.  The pawn protects the Rook.  The King is guarding the pawn.  White's King is on the wrong side of the Rook (for example, if the Black Rook is on d5, White's King is on the 1st thru 4th ranks, not the 6th thru 8th ranks).  Computers always think White's winning.  Any human ought to know that KQ vs KRP with the K, R, and P all united and the WK on the wrong side of the Rook is an easy draw as the Black Rook merely toggles on the two squares that the pawn protects except when White checks with the Queen.

  • 9 months ago · Quote · #46

    TheGreatOogieBoogie

    A computer said this was a draw.  After annotating a certain game I flipped on the engine and was kind of confused when it said draw when I calculated a win for white:



  • 9 months ago · Quote · #47

    Otomun

    TheGreatOogieBoogie, what engine are you using?

  • 9 months ago · Quote · #48

    DandyDanD

    watcha wrote:
    freaky25 wrote:

    6k1/5n2/8/8/8/5n2/1RK5/1N6 w - - 0 1

    White to move, mate in 262. I doubt any engine can see that deep.

    All engines find the first move 1. Kd3 but none the second 2. Kd4. They play 2. Nc3 or 2. Nd2.

     

     

    Wow, that's an interesting endgame.  It's refreshing to see your post with evidence that engines pick the wrong moves (not just misevaluate the position but also throw away the win).
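
    For anyone who wants to check how many men are on the board in the quoted FEN (which decides whether a tablebase of a given size covers it), here is a minimal Python sketch. It only reads the piece-placement field of the FEN string; the helper name count_pieces is my own and does not come from any chess library.

```python
from collections import Counter

# FEN quoted in the post above
FEN = "6k1/5n2/8/8/8/5n2/1RK5/1N6 w - - 0 1"

def count_pieces(fen):
    """Count pieces in the first FEN field (digits and '/' are skipped)."""
    placement = fen.split()[0]
    return Counter(ch for ch in placement if ch.isalpha())

pieces = count_pieces(FEN)
# In FEN, uppercase letters are White pieces and lowercase are Black pieces
white = sum(n for p, n in pieces.items() if p.isupper())
black = sum(n for p, n in pieces.items() if p.islower())
print(dict(pieces), "white:", white, "black:", black)
```

    This reports six men in total, three per side: K, R and N against k and two n.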

  • 9 months ago · Quote · #49

    watcha

    The_Cosmologist wrote:

    Till now we had been looking for positions, now I have a game.

    I found this game from this thread: http://www.chess.com/forum/view/general/houdini-4-released-but-the-arb-chess-system-still-crushes-it

    The person claims to beat today's strongest chess engines with a specific opening (or rather gameplay).

    In this game he claims to have beaten Houdini 4 (a new release which is a lot stronger than Houdini 3), but I doubt it, as this game was played 4 months ago while it has been just a few weeks since Houdini 4 was released.

    I haven't analysed this game with any engine but it would be interesting to know the result if someone analyses it.

    This game has been analyzed at length in the thread http://www.chess.com/forum/view/general/houdini-chess-program-crushed---using-the-arb-chess-system.

    I have checked this game and it seems authentic. Given the time control ( 1 minute / 1 move ) it is very plausible that Houdini could have missed the winning (sacrifice) move on move 19. In single PV mode I have tested this move several times starting Houdini with empty hash and on many occasions finding the correct move took significantly longer than 1 minute if it was found at all. In multi PV mode the correct move was found much quicker.

    Finding sacrificial moves is not easy for engines ( there are many examples of this on these forums ). Theoretically you can always raise the multi PV limit to the point that all legal moves get analyzed, in which case the merits of the sacrifice may be found quickly. However, this is impractical since it decreases the overall strength of the engine dramatically.

  • 9 months ago · Quote · #50

    TheGreatOogieBoogie

    Jeez I can never get enough of posting this position:


    Maybe sacrificing some king safety for chances to get rid of the darksquared bishop (both his most important defender and attacker) and restricting c5 wasn't worth it?  If I played h3-Bh5-g4-Bg6-Bxg6-fxg6 I'd have a clear advantage but tunnel visioned on restricting c5. 

  • 9 months ago · Quote · #51

    watcha

    TheGreatOogieBoogie wrote:

    Jeez I can never get enough of posting this position:

    Maybe sacrificing some king safety for chances to get rid of the darksquared bishop (both his most important defender and attacker) and restricting c5 wasn't worth it?  If I played h3-Bh5-g4-Bg6-Bxg6-fxg6 I'd have a clear advantage but tunnel visioned on restricting c5. 

    Both Stockfish (+0.5) and Houdini (+0.35) see this position slightly better for white ( assuming that it is white's turn - which is not indicated in the diagram ). In what way is this position considered an engine failure?

  • 9 months ago · Quote · #52

    TheGreatOogieBoogie

    watcha wrote:
    TheGreatOogieBoogie wrote:

    Jeez I can never get enough of posting this position:

    Maybe sacrificing some king safety for chances to get rid of the darksquared bishop (both his most important defender and attacker) and restricting c5 wasn't worth it?  If I played h3-Bh5-g4-Bg6-Bxg6-fxg6 I'd have a clear advantage but tunnel visioned on restricting c5. 

    Both Stockfish (+0.5) and Houdini (+0.35) see this position slightly better for white ( assuming that it is white's turn - which is not indicated in the diagram ). In what way is this position considered an engine failure?

    Maybe we have different processors or different versions?  I set Stockfish to 95 value for space (overrated since practically cramped positions contain latent resources.  I say "practically" because players are looking for their best move so wouldn't accept a cramped position without compensation or a punching out plan)  95 pawn structure (middlegame) and 105 activity and bishop pair so maybe that swayed the eval? 

  • 9 months ago · Quote · #53

    watcha

    The_Cosmologist wrote:

    I've got a game which shows how differently programmed chess engines are.

    Houdini 3 found out after 37 seconds that Nc7 was the best move. After 45 secs it found that it was actually the winning move.

    Rybka 3 (not 4) gave Nc7 top priority after 18 secs and realised after 1:44 that it was actually the winning one.

    Bad luck was with Stockfish 4 64 bit SSE4.2, which even after 13 minutes couldn't realise that Nc7 was the best. I stopped it after that.

    I hope someone can make Stockfish find the truth even if it takes half an hour (or more). If it isn't able to find the winning move then it would be a real failure for the Stockfish programmers.

    NOTE: For all the tests I used the single PV mode ONLY.

    For me Houdini found out 24. ... Nc7 in 57 secs and immediately as a winning move. Rybka found it in 17 secs and made it a winning move after 1 min 6 secs. I finished analyzing with Stockfish after 3 mins 53 secs without finding the move. These were all in single PV mode. So basically I can confirm your results.

    I'm always surprised by what positions the big Rybka 4 book contains. I once ran into a mate puzzle in one of the threads whose solution was a recommended move in the Rybka 4 book. Believe it or not, your position is also represented in the Rybka 4 book, with 24. ... Nc7 as a green ( recommended ) move:

  • 9 months ago · Quote · #54

    watcha

    The_Cosmologist wrote:

    Are you saying that Rybka 4 opening book contains the information?

  • 9 months ago · Quote · #55

    Xilmi

    Stockfish is known to be pruning a lot more aggressively in order to go much deeper on what it thinks are good continuations.

    The downside of this is, as we've just seen here, that it often won't find winning sacrifices.

  • 9 months ago · Quote · #56

    adarsh678910

    Houdini and Shredder are the best

  • 9 months ago · Quote · #57

    chessdex

    Here's a position. The computer says White is winning by a lot, but it's a draw.



  • 9 months ago · Quote · #58

    watcha

    Xilmi wrote:

    Stockfish is known to be pruning a lot more aggressively in order to go much deeper on what it thinks are good continuations.

    The downside of this is, as we've just seen here, that it often won't find winning sacrifices.

    In an average chess position there are 35 legal moves. If you want to apply the minimax/negamax algorithm to the tree of chess, you need the following time as a function of depth ( measured in plies ):

    Minimax/negamax analysis time ( secs )

    Depth (plies)               Nodes   Laptop (3500 kN/s)   Deep Blue (200 MN/s)
    1                              35                    0                      0
    2                           1 225                    0                      0
    3                          42 875                    0                      0
    4                       1 500 625                    0                      0
    5                      52 521 875                   15                      0
    6                   1 838 265 625                  525                      9
    7                  64 339 296 875               18 383                    322
    8               2 251 875 390 625              643 393                 11 259

     

    On my laptop Houdini 3 runs at an analysis speed of 3500 kN/s. Deep Blue analyzed at a speed of 200 MN/s, which is high even by today's standards: at TCEC, the site where top engines compete for the unofficial world championship title on high-end commercially available hardware, the typical analysis speed is around 50 MN/s.

    It can be seen from the table that the realistic ( that is: which can be achieved in a couple of minutes ) depth that can be reached by minimax/negamax is 6 plies on a laptop and 7 plies on Deep Blue.
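
    The node counts and times in the table can be reproduced with a few lines of Python. This is only a sketch under the assumptions stated in the post: a constant branching factor of 35, 35^depth positions at a given depth, and the two analysis speeds quoted. The function names are mine.

```python
BRANCHING = 35               # average legal moves per position (from the post)
LAPTOP_NPS = 3_500_000       # 3500 kN/s
DEEP_BLUE_NPS = 200_000_000  # 200 MN/s

def minimax_nodes(depth, branching=BRANCHING):
    """Positions at the given depth when every move is searched."""
    return branching ** depth

def seconds(nodes, nodes_per_sec):
    """Analysis time rounded to whole seconds."""
    return round(nodes / nodes_per_sec)

for depth in range(1, 9):
    n = minimax_nodes(depth)
    print(depth, n, seconds(n, LAPTOP_NPS), seconds(n, DEEP_BLUE_NPS))
```

    Each printed row is depth, nodes, laptop seconds, Deep Blue seconds, in the same order as the table columns.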

    Minimax/negamax analyzes every single position that can arise in a given number of plies, so every position in the tree gets its correct value. However, this is not necessary, since you can apply a trick. If a move is refuted, minimax/negamax will still search on to find out whether it can be refuted even more. Instead you can stop searching the refuted move immediately, without examining the remaining responses to it. In this way you can cut branches off the tree, and you can do the same thing at every level recursively. This algorithm is called alphabeta. With correct ordering of the moves ( using evaluations of shallow searches for ordering in deeper searches ) enough cuts can be made to reach twice the depth that minimax/negamax can achieve. Since alphabeta does not lose relevant information, there is no risk of cutting a meaningful branch of the tree: alphabeta will always find the same best move that minimax/negamax would have found.
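
    To make the cutoff idea concrete, here is a small Python sketch that runs plain negamax and alphabeta on the same toy tree and counts how many leaves each one evaluates. Everything in it is an assumption for illustration: the tree is random nested lists rather than chess positions, the branching factor is 5 and the depth is 6, and no move ordering is applied, so the saving is smaller than what a real engine gets.

```python
import random

def build_tree(depth, branching, rng):
    """Random game tree as nested lists. Leaves are integer scores
    from the point of view of the side to move at that leaf."""
    if depth == 0:
        return rng.randint(-100, 100)
    return [build_tree(depth - 1, branching, rng) for _ in range(branching)]

def negamax(node, counter):
    """Full search: visits every leaf."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    return max(-negamax(child, counter) for child in node)

def alphabeta(node, alpha, beta, counter):
    """Negamax with alpha-beta cutoffs."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    best = -float("inf")
    for child in node:
        score = -alphabeta(child, -beta, -alpha, counter)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # this line is already refuted, skip the remaining replies
    return best

rng = random.Random(0)
tree = build_tree(depth=6, branching=5, rng=rng)
full_count, pruned_count = [0], [0]
exact_value = negamax(tree, full_count)
pruned_value = alphabeta(tree, -float("inf"), float("inf"), pruned_count)
print("values:", exact_value, pruned_value)
print("leaves visited:", full_count[0], "vs", pruned_count[0])
```

    Both searches return the same root value, while alphabeta visits fewer leaves than the 5^6 = 15625 that the full search has to look at.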

    By applying alphabeta you can reach a search depth of 12 plies on a laptop and 14 plies on Deep Blue. This means that by today's standards a perfect search can only be performed to a depth of 12 - 14 plies, depending on the hardware. Any search deeper than this runs the risk of cutting meaningful branches off the tree. Since today's engines can reach a depth of at least 20 plies in a couple of minutes, it is evident that these searches cannot be perfect. Any move which yields visible results only beyond 12 - 14 plies can fall victim to pruning.
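
    The 12 and 14 ply figures can be sanity-checked against the standard best-case leaf count for alphabeta with perfect move ordering, b^ceil(d/2) + b^floor(d/2) - 1. That formula (the Knuth and Moore result) is not quoted in the post and is brought in here as an outside fact. The branching factor and speeds are the ones from the post, and the function names are illustrative.

```python
BRANCHING = 35               # from the post
LAPTOP_NPS = 3_500_000       # 3500 kN/s
DEEP_BLUE_NPS = 200_000_000  # 200 MN/s

def minimax_leaves(depth, b=BRANCHING):
    """Leaves visited by a full minimax/negamax search."""
    return b ** depth

def alphabeta_best_case_leaves(depth, b=BRANCHING):
    """Best case with perfect move ordering:
    b^ceil(d/2) + b^floor(d/2) - 1 (assumed formula, see lead-in)."""
    return b ** ((depth + 1) // 2) + b ** (depth // 2) - 1

for depth, nps, label in ((12, LAPTOP_NPS, "laptop"),
                          (14, DEEP_BLUE_NPS, "Deep Blue")):
    ab = alphabeta_best_case_leaves(depth)
    mm_half = minimax_leaves(depth // 2)
    print(label, "depth", depth,
          "alphabeta leaves:", ab,
          "minimax leaves at half depth:", mm_half,
          "seconds:", round(ab / nps))
```

    Under these assumptions an ideally ordered alphabeta search to depth 12 costs about twice as many leaves as a full search to depth 6, which is where the "twice the depth" rule of thumb comes from.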

  • 8 months ago · Quote · #59

    rooperi

    I just found this study by Troitzky. White to play and draw. Black can never make progress; White always has a stalemate defense. Stockfish tries different queen manoeuvres to avoid repetition, and evaluates the position as -5.72 right up to (almost) the end.



  • 3 months ago · Quote · #60

    watcha

    This position is of no particular theoretical importance. White is obviously winning here.

    This is how Houdini views this position, it values it at +23 and wants to move the queen:

    Now look at this position from a human point of view. It is clearly visible that the black queen is tied to defending against the mate. It is also clear that if the white queen or rook moves, the white king can be subject to checks. So a human would clearly conclude that the safe way is to push the f-pawn slowly up the board: it is untouchable, and at some point it will be in a position to take part in the attack.

    When Houdini is asked about this move it immediately switches its evaluation to a forced mate in 9:

