Same Stockfish, same depth, different computer: same evaluation?
Daniil Dubov / Photo: FIDE/Lennart Ootes

HanSchut

If you have a slower computer but give it more time to reach the same depth, do you think the engine will provide you with the same recommendation? When using Stockfish on a website like chess.com, do you know the processor speed, or do you mostly rely on analysis depth to ensure the quality of the analysis?

A slower device can lead you to a different recommendation at the same depth, and I would like to illustrate that with the following example.

Position after 7. h4

In preparing for my opponent at the Sunway Sitges Chess Festival, I was looking at the move 7.h4. The always creative and inspiring Daniil Dubov introduced this move at the highest level in his game against Boris Gelfand in 2016. After Gelfand’s 7… Na5 8.f4 d6 9.f5, White won with a crushing kingside attack.

But what happens if Black challenges the knight immediately with 7... h6? This move was played in the game between Sasikiran and Harika at the Isle of Man Grand Swiss 2019. White continued with the strong 8.Nd5!, threatening Nxf6 followed by Qh5 and Qg6, utilizing the fact that the Bc4 is pinning the pawn on f7. Poor Harika, it must have been tough to face this preparation over the board!
After 8… Nxd5 9.Bxd5! the following position arises:

Position after 9.Bxd5

This position will be the focus of our analysis using the standard Stockfish 10 (not a development version) with contempt set to 0.
On my 2016 Surface Book, using 1 CPU (2 threads) and at depth 35, Stockfish recommends 9… hxg5 and assesses that Black is slightly better (node count of 400 million).

On my 2019 Razer Blade Studio, using 6 CPUs (12 threads) and at the same depth 35, Stockfish recommends 9… Nb4 and considers the position equal (node count of 2 billion).
Please note that the standard depth for game analysis on chess.com is 20, so 35 is a reasonable depth.
Yet, after 9... hxg5, Black is completely lost! After the more or less forced 10.hxg5 g6 (to prevent Qh5) 11.Qg4 Kg7 12.Rh7+ Kxh7 13.Qh4+ Kg8 14.Bd2, followed by 0-0-0, there is no defence against Rh1 and mate.

Analysis position after 14.Bd2 


At depth 35, the slower processor recommends a completely losing variation, while the faster machine, and Harika over the board, found the superior 9... Nb4!

Why is this the case? The answer is quite straightforward. The tree of possible moves grows exponentially with the depth of the analysis. Since version 7, Stockfish uses processing power not only to go deeper but also to go wider in the search tree. By going wider in the tree, the quality of the analysis improves.
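The node counts above give a feel for how wide each search actually was. If a search to depth d visits roughly b^d nodes, the effective branching factor b can be recovered as nodes**(1/depth). A back-of-the-envelope sketch using the two node counts reported above:

```python
# Effective branching factor b such that b**depth ≈ nodes searched.
depth = 35
runs = {
    "Surface Book, 2 threads": 400_000_000,    # node count from the slower machine
    "Razer Blade, 12 threads": 2_000_000_000,  # node count from the faster machine
}
for label, nodes in runs.items():
    b = nodes ** (1 / depth)
    print(f"{label}: effective branching factor ~ {b:.2f}")
```

The faster machine's higher effective branching factor (about 1.84 versus 1.76) is the wider search in action: at the same nominal depth, it examines noticeably more moves per node.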
This example shows that analysis depth and time-to-depth are of only relative importance: the quality of an engine recommendation also depends heavily on the speed of the processor and the available memory. Are you aware of the relative processor speed and memory of your phone, tablet, or computer, and of their impact on the quality of the engine analysis? Do you know how many resources your device allocates to your game analysis? Do you know how fast the processor is when you use a web plugin (like on chess.com)? Is the analysis running in the cloud or locally?

I also looked at which depth the device/website stops recommending the losing 9... hxg5.

Some results on popular websites:

Follow Chess / Analyze This: depth 40 (using iPhone 11 Pro Max)

Lichess (Stockfish 10+ WASMX): depth 32 (on Razer Blade Studio)

Chess.com Self Analysis (Stockfish.js 10): depth 37 (on Razer Blade Studio) 

I am very interested to learn from the readers at what depth and after how much time Stockfish 10 no longer recommends the losing 9... hxg5. Please specify which device you are using and which program/website.
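If you want to run the experiment against a local Stockfish binary, the UCI commands below are a minimal sketch (substitute the FEN of the position after 9.Bxd5, which I have not spelled out here, and set Threads to match your hardware):

```
uci
setoption name Contempt value 0
setoption name Threads value 2
position fen <FEN of the position after 9.Bxd5>
go depth 35
```

The engine will print `info` lines with the depth, node count, and principal variation as it searches, and a final `bestmove` line when it reaches depth 35.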

Chessify, which offers cloud engine services, ran a match between three computer configurations: 1 CPU, 4 CPUs, and 50 CPUs, at a forced depth of 20 for each move. The 50 CPU configuration scored 81% against the 1 CPU configuration and 73% against the 4 CPU configuration at equal depth. This performance corresponds to a rating difference of roughly 240 and 170 rating points, respectively. For more information, please read this interesting blog entry about Lazy SMP and nodes per second versus time to depth. Same engine, same depth, 200 rating point difference!
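The rating differences quoted above follow from the standard logistic Elo model, which converts an expected score into a rating gap. A minimal sketch, using the 81% and 73% scores reported above (the exact figures depend on rounding; the model gives values close to the 240 and 170 quoted):

```python
import math

def elo_diff(expected_score: float) -> float:
    """Rating gap implied by an expected score under the logistic Elo model."""
    return -400 * math.log10(1 / expected_score - 1)

print(round(elo_diff(0.81)))  # roughly 250
print(round(elo_diff(0.73)))  # roughly 170
```

Note the steep curve: each additional 10% of expected score near the top costs disproportionately many rating points.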

It does not come as a surprise that the current standard for professional analysis includes:
- Cloud analysis using multiple engines: Lc0 (and derivatives), Stockfish, Komodo, Houdini
- Databases of correspondence, engine, and GM games, with a focus on games from recent years

AlphaZero and the recent neural-network engines are leading to a reevaluation of certain positions. Games from before 2015 have become of little analytical value for top players.

Finally, let’s return to the game Sasikiran – Harika.

Position after 10.Bb3

After 9.Bxd5 Nb4, Sasikiran continued 10.Bb3 and Harika replied 10... d5.

Your Christmas puzzle: could Harika have played 10... hxg5, or is this still impossible?

Merry Christmas! Wishing you love and good health.