Engines and Fortresses

tlay80

I know engines sometimes mis-evaluate endgames, but how often do they misplay them?

For example, I've seen positions where an engine fails to recognize a fortress and thinks there's a large advantage when the position is actually drawn. But how often will a strong engine miss the opportunity to create a fortress and lose when it should be able to hold a draw? I'm assuming this must happen as a corollary of having trouble recognizing fortresses, but can someone post examples from engine games?

wollyhood

Am very glad you posted this.

I am only on 1190, but I just finished this daily game and did HEAPS of analysis, and at the point where I chose to make my attack, SF says it was a real blunder.

But I am not sure. I tried all the things they could have done to stop it, and remembering that they are about the same at chess as me, well....

the whole thing just makes me not trust SF very much. They thought I was giving the game away and luckily I didn't have that info - I thought I was winning, and I did.

Here is where it says I blundered:

it did not like me moving my Q here and gave her so much advantage.

this is the game report link https://www.chess.com/analysis/game/daily/281016258?tab=report

sorry to take your thread off topic a bit but yeh. PS I am not talking about her blunder at the very end, but I had given myself lots of ways to win.

tlay80

Hmm, in a middlegame position like this, I'd trust the engine. What's your objection to the engine's view that White's winning after 18 Bc6?

I'm thinking more about endgame positions where a human can see that, despite one side having extra material, there's no real way to make progress.  Here's one of the more absurd examples, which Stockfish gives as -12, though a human can see there's no way for black to get to mate:

 

But I'm looking not for ridiculous constructed positions like this, but rather for games that an engine actually misplayed, missing a draw or a win because of this sort of blindness.

wollyhood

Yes, sorry to go off topic. I just didn't think it was as much of a blunder as they said. It seemed safe, and my attack seemed a little impossible for her to defend, but I don't know enough; it's just conjecture on my part. I will just follow this and see what you guys discuss, I might learn something...

MARattigan

Here's Rybka losing from a theoretically drawn position:

and drawing from a theoretically won position:

and Stockfish 8 and 12 respectively drawing in short order from a won position (the variation in the first):

All played with G120+3 (mins/secs) on the clocks on a Pentium CPU J3710 @ 1.6 GHz.

tlay80

Thanks -- I'll give these a look.  If you know of any involving failure to create a fortress, I'd be especially interested.

MARattigan
tlay80 wrote:

Thanks -- I'll give these a look.  If you know of any involving failure to create a fortress, I'd be especially interested.

Here's Rybka already in a KBNKQ fortress but failing to maintain it. The final position is mate in 38 for Black according to Nalimov, but you'll have to finish it yourself because I can't really play this endgame.


 

tlay80

Thanks!

MARattigan

Don't mention it. 

They're not from games, of course, just random basic endgame positions.

Rockroyal

@tlay80 , I think that you could win as black by promoting to knight and checking on f2, so maybe SF's evaluation isn't that crazy.

MARattigan
Rockroyal wrote:

@tlay80 , I think that you could win as black by promoting to knight and checking on f2, so maybe SF's evaluation isn't that crazy.

I assume you're talking about #3. How would you ever get to promote?

If qxB anywhere on a7-g1 except g1, it's stalemate. Black bishop takes any advancing pawn. If qxB on g1 before a pawn has advanced, the king can retake and hold the pawn pair or either single pawn.

But maybe you were making the same point as my following post.

MARattigan
tlay80 wrote:

...I'm thinking more about endgame positions where a human can see that, despite one side having extra material, there's no real way to make progress.  Here's one of the more absurd examples, which Stockfish gives as -12, though a human can see there's no way for black to get to mate: ...

An evaluation of -12 doesn't mean mate. Mate evaluations are preceded by a "#". When you say there's no way for Black to get mate, you really mean there's no way for Black to force mate.

Either Black or White could mate with inaccurate play by his opponent

but Black's chances of encountering sufficiently inaccurate play are better than White's.

The chess.com analysis evaluates this one at -37 by the way.

 

(But that's not from a game either.)

And Wilhelm evaluates this as +165. It's drawn but possibly not so obviously.

 

(Again not from a game.)

tlay80

Right, but it’s still mis-evaluating the position. Since the engine assumes perfect play from its opponent, no matter how difficult that play is, the proper evaluation would be 0.00. 

So it seems like there ought to be game circumstances where this mis-evaluation in fortress scenarios causes an engine to not choose a proper fortress setup, or to allow its opponent to create a fortress in what should be a winning position (though if that opponent is an engine, I suppose it may fail to do so).  Those were the sorts of engine mistakes I was interested in. 

MARattigan

The engine doesn't assume perfect play from the opponent.

The numbers printed are used by an engine in its alpha-beta pruning algorithm (see e.g. https://www.javatpoint.com/ai-alpha-beta-pruning). The actual values of the numbers are irrelevant; any other numbers would do so long as they're in the same order.
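To illustrate the point about ordering, here is a minimal sketch of alpha-beta on a small hand-made tree. The tree, the leaf numbers, and the names alphabeta, best_move and relabel are all made up for illustration; no real engine is this simple. It shows that relabelling every leaf with an order-preserving function leaves the chosen move unchanged.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Plain alpha-beta over nested lists; a non-list node is a leaf score."""
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:      # remaining children cannot change the result
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

def best_move(tree):
    """Index of the root move with the highest score for the maximizer."""
    scores = [alphabeta(child, False) for child in tree]
    return scores.index(max(scores))

def relabel(node, f):
    """Apply f to every leaf, keeping the tree shape."""
    if isinstance(node, list):
        return [relabel(child, f) for child in node]
    return f(node)

# Hypothetical three-ply tree: root (max) -> min nodes -> max nodes -> leaves.
TREE = [[[3, 5], [6, 9]], [[1, 2], [0, -1]], [[7, 4], [8, 2]]]

original_choice = best_move(TREE)
# x**3 + 100 is strictly increasing, so the order of the leaves is unchanged.
relabelled_choice = best_move(relabel(TREE, lambda x: x ** 3 + 100))
print(original_choice, relabelled_choice)
```

Both calls pick the same root move, because max and min only ever compare the numbers with each other. This covers only the bare algorithm described in the link.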

Normally, what that page calls static evaluations are given positive numbers if the programmer thinks the position should be won by White, negative numbers if by Black, and zero for draws.

The programs have trouble searching to a depth of more than about thirty moves within a reasonable time in most endgame positions. So even if the fifty move rule is in effect, when they can't search as far as a forced stalemate or insufficient material, the program will probably not see the draw, and in many drawn positions these cannot be forced.

The KBNKQ position I gave is an example of what you are saying. White's 2.Kb2 fails to find the fortress. After 2.Ka2 he would have found the fortress.

tlay80

Thanks -- interesting.  Though I'm still finding it hard to understand the engine not assuming perfect play.

MARattigan
tlay80 wrote:

Thanks -- interesting.  Though I'm still finding it hard to understand the engine not assuming perfect play.

The engine (AlphaZero and Leela excepted) is completely dependent on the human evaluations of the leaf nodes in its algorithm, which generally won't lead to perfect play. The depth limitations imposed by time constraints will further impair play. The alpha-beta pruning algorithm itself includes no assumption of perfect play. If you were to set all the static evaluations to the same value it would just produce a random legal move generator.

If you were to evaluate just the mate positions as +1 or -1 and set all other evaluations to 0 it would produce not just perfect but perfectly accurate (DTM or DTM50) play eventually in winning positions given enough time. The problem is you'd have to move your computer before the sun went nova in most winning positions and it would take infinite time in most drawn positions.
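A rough sketch of that idea, using a toy take-away game instead of chess so that it finishes instantly: players alternately remove 1 to 3 stones, and whoever takes the last stone wins. The game and the names negamax and shortest_win are assumptions made for this example only. Terminal positions score +1 or -1, every other leaf scores 0, and the search depth is raised one ply at a time until a forced win shows up.

```python
def negamax(stones, depth, alpha=-1, beta=1):
    """Score for the side to move: +1 forced win seen, -1 forced loss seen,
    0 when nothing decisive is visible within `depth` plies."""
    if stones == 0:
        return -1          # the opponent just took the last stone
    if depth == 0:
        return 0           # non-terminal leaf: the only "static evaluation" is 0
    best = -1
    for take in (1, 2, 3):
        if take > stones:
            break
        score = -negamax(stones - take, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:  # alpha-beta cutoff
            break
    return best

def shortest_win(stones, max_depth=20):
    """Deepen one ply at a time; the first depth that returns +1 is the
    length in plies of the quickest forced win, or None if none is found."""
    for depth in range(1, max_depth + 1):
        if negamax(stones, depth) == 1:
            return depth
    return None

print(shortest_win(5))   # forced win found at depth 3
print(shortest_win(4))   # None: the side to move is lost, so +1 never appears
```

With 5 stones the search reports nothing at depths 1 and 2 and only sees the win at depth 3, which is the distance-to-win in this toy game. In chess the same scheme needs a depth far beyond what the hardware allows, which is the sun-going-nova problem.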

tlay80
MARattigan wrote:
tlay80 wrote:

Thanks -- interesting.  Though I'm still finding it hard to understand the engine not assuming perfect play.

The engine (AlphaZero and Leela excepted) is completely dependent on the human evaluations of the leaf nodes in its algorithm, which generally won't lead to perfect play. The depth limitations imposed by time constraints will further impair play. The alpha-beta pruning algorithm itself includes no assumption of perfect play. If you were to set all the static evaluations to the same value it would just produce a random legal move generator.

Okay, but for at least the first two (and maybe your third point too?), the engine still assumes what it *thinks* is perfect play, no?  For the way I was thinking about the problem, that distinction shouldn't make a difference, right?  In fact, it seems like it's implicit that we're talking about what it *thinks* is perfect when we say an engine "assumes" perfect play rather than that it guarantees perfect play.

Of course, maybe I'm completely misunderstanding engines in some fundamental ways. (And obviously, there are ways in which the word "thinks" can be misleading when we're talking about engines, but if it's misleading me here, I could probably use some guidance on how.)

MARattigan

I think maybe you are. Engines are designed to produce perfect play only if they have access to a Syzygy EGTB or if they're playing without the 50 move rule and have access to either a Syzygy or Nalimov EGTB (in the latter case producing perfectly accurate play).

They assume any play from the opponent, which would include perfect play, but the perfect moves could be pruned according to the human-set leaf evaluations in most engines. The opponent obviously may still make the moves. 

tlay80

Thanks. Okay, I'll keep trying to wrap my head around this.