WHY ARE COMPUTERS BAD AT EVALUATING END GAMES?

jesterville

The general thinking is that computers are bad at analysing "end game" scenarios...I don't understand this...the permutations and combinations are far fewer...thus it should be easier.

Can anyone explain this?

Spiffe

Well, if you get far enough into an endgame that computers can start using their endgame tables, they play perfectly.

But I think you're asking about earlier in the endgame, when there is less material than in a middlegame but still a nontrivial amount.  The reason computers are relatively weaker there than in other parts of the game is that specific variations matter less than abstract strategic principles.

Silman gives a nice example of this in his endgame book -- he was in a position against a '90s-era computer in which there was a locked pawn structure in the middle, and each side had a bishop and a knight.  The computer, programmed to think that bishops are better than knights, allowed an exchange of Silman's bishop for its knight.  That produced a strategically won game for Silman, because a knight is more valuable in an endgame scenario like that.

A modern computer would not make that same mistake, but the point is that the win was based on a strategic idea, not a particular variation or position that the computer could brute-force to identify as superior.  It might take 30 or 50 moves to realize your advantage, beyond the event horizon of what a computer could do.  A human has no need to evaluate that deep, but a computer would have to, to be able to make a judgement, and can't.

jesterville

I think I understand what you've said...but follow me...whatever parameters are given are throughout...so if the computer sees bishops as more valuable than knights...this should show up throughout the game.

What I have a problem with is that the general feeling I get while reading blogs is that computers are bad at evaluating end games...but I have problems understanding this...because there are fewer pieces on the board, way fewer possible moves to calculate etc....why would a computer be good at evaluating 32 pieces and their astronomical possible movements, say after move 10...but have difficulty in evaluating, say, a total of 10 pieces with far fewer possible movements??? Maybe someone with programming experience can help?

LiquidEggProduct

Keep in mind that in an endgame, since the board is so open, there may be more legal moves to consider than in a crowded middlegame with a lot of pieces.

For example, when you start off the game, there are 20 possible moves.  But on an open board, even if you have just a Rook on a1 and a King on e5, there are 22 possible moves!
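Those two counts are easy to check with a few lines of code. This is only a rough sketch: it counts pseudo-legal moves for White (it ignores checks, castling, en passant and promotion choices), and every name in it is made up for the illustration rather than taken from any engine.

```python
# Toy pseudo-legal move counter for White. It ignores checks, castling,
# en passant and promotion choices. All names here are illustrative.

ROOK_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
BISHOP_DIRS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
KING_STEPS = ROOK_DIRS + BISHOP_DIRS
KNIGHT_STEPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
SLIDERS = {"R": ROOK_DIRS, "B": BISHOP_DIRS, "Q": ROOK_DIRS + BISHOP_DIRS}


def square(name):
    """Convert a name like 'e5' to zero-based (file, rank) coordinates."""
    return ord(name[0]) - ord("a"), int(name[1]) - 1


def on_board(x, y):
    return 0 <= x < 8 and 0 <= y < 8


def count_white_moves(board):
    """board maps (file, rank) to a piece letter; uppercase means White."""
    total = 0
    for (x, y), piece in board.items():
        if not piece.isupper():
            continue
        if piece == "P":
            # Single push, plus double push from the starting rank.
            if on_board(x, y + 1) and (x, y + 1) not in board:
                total += 1
                if y == 1 and (x, y + 2) not in board:
                    total += 1
            # Diagonal captures of black pieces.
            for dx in (-1, 1):
                target = board.get((x + dx, y + 1))
                if target is not None and target.islower():
                    total += 1
        elif piece in ("N", "K"):
            steps = KNIGHT_STEPS if piece == "N" else KING_STEPS
            for dx, dy in steps:
                tx, ty = x + dx, y + dy
                if on_board(tx, ty):
                    target = board.get((tx, ty))
                    if target is None or target.islower():
                        total += 1
        else:
            # Sliding pieces run along each direction until blocked.
            for dx, dy in SLIDERS[piece]:
                tx, ty = x + dx, y + dy
                while on_board(tx, ty):
                    target = board.get((tx, ty))
                    if target is None:
                        total += 1
                    else:
                        if target.islower():
                            total += 1  # capture, then stop
                        break
                    tx, ty = tx + dx, ty + dy
    return total


def start_position():
    board = {}
    for x, piece in enumerate("RNBQKBNR"):
        board[(x, 0)] = piece
        board[(x, 1)] = "P"
        board[(x, 6)] = "p"
        board[(x, 7)] = piece.lower()
    return board


open_board = {square("a1"): "R", square("e5"): "K"}
print(count_white_moves(start_position()))  # 20
print(count_white_moves(open_board))        # 22
```

The rook alone contributes 14 of the 22 moves, which is why a handful of pieces on an open board can match a full army that is still boxed in.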

orangehonda

Endgames are very strategic. In many endgames material count doesn't matter like it does in the middle game -- for example, humans know that in opposite-colored bishop endgames pawns are worth very little, while open lines for your bishop and king activity are what matter.  This is a classic example, but in other late middle game positions, where the endgame may be drawish with one set of minors but winnable with a different set, the computer would need either basic endgame knowledge or a brute-force calculation of 40+ moves.

This depth of search is simply impractical for any other position, and so it's not programmed to ever go this far.
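To put a rough number on "impractical", here is a back-of-the-envelope sketch. The branching factor of 20 and the speed of 10^8 positions per second are assumed round numbers, not measurements of any real engine, and 40 moves is taken as 80 plies (half-moves). The alpha-beta figure uses the well-known best-case leaf count for a perfectly ordered tree; real engines prune and reduce far more aggressively, so treat this as an illustration of scale only.

```python
# Back-of-the-envelope search-tree sizes. The branching factor and node
# rate are assumed round numbers, not measurements of any real engine.

def full_tree_leaves(branching, plies):
    """Leaves of a uniform game tree searched with plain minimax."""
    return branching ** plies


def alphabeta_best_case_leaves(branching, plies):
    """Best-case leaf count for alpha-beta with perfect move ordering."""
    return branching ** ((plies + 1) // 2) + branching ** (plies // 2) - 1


BRANCHING = 20            # assumption: about 20 moves per position
PLIES = 80                # 40 moves by each side
NODES_PER_SECOND = 10 ** 8  # assumption: hypothetical engine speed
SECONDS_PER_YEAR = 365 * 24 * 3600

full = full_tree_leaves(BRANCHING, PLIES)
best = alphabeta_best_case_leaves(BRANCHING, PLIES)
years = best // (NODES_PER_SECOND * SECONDS_PER_YEAR)

# len(str(n)) - 1 is the power of ten just below n.
print(f"full tree: about 10^{len(str(full)) - 1} leaves")
print(f"alpha-beta best case: about 10^{len(str(best)) - 1} leaves")
print(f"time at the assumed speed: about 10^{len(str(years)) - 1} years")
```

Even in the best case the exhaustive approach is hopeless at this depth, which is why the engine has to fall back on its evaluation function long before it reaches the end of the line.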

To say computers play well when they have endgame tablebases is misleading.  It's data retrieval and we all know they can do this.  The engine itself stops playing when the TBs are reached.

jesterville

But if you play any computer it will play you very well right to the point of checkmating you...won't it?

VLaurenT

Present day computers are strong in endgames as they are in other parts of the game. This weakness was maybe real 10 or 15 years ago...

However, there are some specific positions where the computer without tablebases may give a bad evaluation (for example a R+p vs. R drawn endgame evaluated as advantageous for the side with the extra pawn), but it can still play the position pretty well...
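A toy example of why that happens: if the evaluation is driven by material, the extra pawn shows up as a plus no matter whether the position is a known draw. The sketch below uses the conventional textbook piece values and a made-up `material_eval` function; real engines use much richer evaluation terms, so this only shows the basic idea.

```python
# Toy material-only evaluation, in pawns, from White's point of view.
# Piece values are the conventional textbook ones; the function names
# are made up for this illustration.

PIECE_VALUES = {"K": 0, "P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material(pieces):
    """Sum the values of a side's pieces, given as a string like 'KRP'."""
    return sum(PIECE_VALUES[p] for p in pieces)


def material_eval(white_pieces, black_pieces):
    return material(white_pieces) - material(black_pieces)


# R+P vs. R: the counter reports a one-pawn edge for White, even in
# positions that endgame theory knows to be drawn.
print(material_eval("KRP", "KR"))  # 1
```

Nothing in a count like this knows about drawing techniques, so without tablebases or deep enough search the number stays positive.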

orangehonda
hicetnunc wrote:

Present day computers are strong in endgames as they are in other parts of the game. This weakness was maybe real 10 or 15 years ago...

However, there are some specific positions where the computer without tablebases may give a bad evaluation (for example a R+p vs. R drawn endgame evaluated as advantageous for the side with the extra pawn), but it can still play the position pretty well...


While its technique is sometimes very lousy, you're right to say its play is still relatively strong.  The main complaint is that they give poor evaluations.

When practicing rook endgames against my computer, for example, it often has a winning idea but keeps letting me get into a drawn position, because it may evaluate two moves (one winning, one that gives me drawing chances) as equal.  If I help it find an accurate move here or there, and it gets past those 1 or 2 positions, it plays very, very well.

Once I tried to draw N+P vs R+P with pawns on the same file, and the computer was ruthlessly accurate.  (With pawns on adjacent files it's not as hard to find a draw.)  In other positions it bumbles around.  The main complaint is inaccurate evaluations, especially noticeable in early endgame positions.  It's a modern program btw, Rybka 3.

DarkPhobos

I saw a good example recently while using Rybka to review one of my completed games.

The starting position was Q+R+4P (afgh) versus Q+R+4P (efgh). My opponent Black had some advantage due primarily to the fact that his extra kingside pawn made his king safer than mine. My pieces are active and I should not lose but I'm not happy either. My counterplay consists primarily of perpetual check threats that occupy his pieces enough to protect my king. In short I am playing only for the draw and a careless mistake could lead to a quick loss.

A strong human would easily recognize that any winning chances Black may have come from the Q+R ending. Trading either major piece without obtaining a major concession means an easy draw.

Nevertheless Rybka approved of my opponent exchanging queens and entering a rook ending with no change in the pawn structure. Black is still better because my "outside passed pawn" cannot be mobilized. It is back at a2 and being defended from the side. It isn't practical to send my king all the way to the queenside because meanwhile Black would create an advanced passed e-pawn supported by a centralized king.

Rybka had no idea why I would play h4 at my earliest opportunity. It's a standard strategic move but its purpose is far beyond the horizon.

Rybka thought that it was terrible that I sacrificed my a-pawn by playing Rc2-c7. It changed the evaluation from about -0.2 to -0.8. But with a perfect pawn structure and ideal piece placements, the 4-versus-3 rook ending on one side is an easy theoretical draw. Meanwhile the a-pawn is not helping me at all, and defending it means remaining passive while the enemy forces advance and encroach upon my position.

As the end of the game approached and Black's progress ground to a halt, Rybka stuck to its -0.7 assessment of the position. Rybka's calculation powers are quite sufficient to see that all attempts to make progress only lead to perpetual check, simplifying pawn exchanges, drawn pawn endings, and other useless outcomes. At this point a human would realize that Black's advantage is gone.

Rybka sees matters differently. Its evaluation function likes Black's current position because he has a healthy extra pawn and more space. It also sees that by shuffling its rook around Black can create lots of similar positions with a healthy extra pawn and more space. Enough favorable positions to go all the way to the horizon without ever having to face the fact that nothing is really changing. So it declares that one of these useless rook moves is best and proclaims a big advantage.
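Here is a stripped-down sketch of that effect. It is not how Rybka works internally. It is a toy fixed-depth minimax over a made-up "shuffling" position, where Black can only move a rook among three squares, White can only move the king among three squares, and the static evaluation always says Black is 0.7 better. All squares, names and numbers are hypothetical.

```python
# Toy model of "shuffling to the horizon". Scores are from Black's point
# of view; squares, names and the 0.7 figure are hypothetical.

BLACK_ROOK_SQUARES = ("a3", "b3", "c3")
WHITE_KING_SQUARES = ("g1", "g2", "h2")


def static_eval_for_black(position):
    # Healthy extra pawn and more space: the same in every shuffled position.
    return 0.7


def moves(position):
    """Successor positions: each side may only shuffle one piece."""
    rook, king, black_to_move = position
    if black_to_move:
        return [(r, king, False) for r in BLACK_ROOK_SQUARES if r != rook]
    return [(rook, k, True) for k in WHITE_KING_SQUARES if k != king]


def search(position, depth):
    """Fixed-depth minimax score from Black's point of view."""
    if depth == 0:
        return static_eval_for_black(position)
    scores = [search(nxt, depth - 1) for nxt in moves(position)]
    # Black picks the maximum, White picks the minimum.
    return max(scores) if position[2] else min(scores)


START = ("a3", "g1", True)
for depth in (0, 2, 4, 8):
    print(depth, search(START, depth))
```

However deep this search goes, the score it reports never moves, because every leaf it reaches is judged by the same static evaluation and nothing in the search notices that no progress is being made.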

Insane_Chess

Because computers suck at strategy. They can calculate tactical lines well, but they suck at overall planning.