The playing style of StockfishNNUE

Avatar of Dyslexic_Goat

The first decisive match of this round robin. Allie 0.6, playing White with no opening book, offered its refutation of the Albin Countergambit against a copy of itself playing Black with an opening book.

 

Avatar of Dyslexic_Goat

Over at the CCC, results have become more problematic for Stockfish NNUE.

It still leads Stockfish 11, 62.5-59.5, but against LC0, Leela is leading 63.5-57.5. Stockfish 11 is still tied with LC0 at 61 points each.

This confirms what I had suspected but hoped wasn't the case: Stockfish NNUE's improvements from playing against itself have led to more self-play victories, but they may not be translating into better play against its opponents.
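For a rough sense of how small these margins are, the scores can be turned into an approximate rating gap. This is only a sketch using the standard logistic Elo formula; the function name is made up for illustration, and with matches this short the error bars (not computed here) would be wide.

```python
import math

def elo_difference(points_a: float, points_b: float) -> float:
    """Approximate Elo gap implied by a head-to-head match score.

    Uses the standard logistic Elo model. Assumes both sides scored
    at least some points (a 100% score has no finite Elo gap).
    """
    score = points_a / (points_a + points_b)  # fraction of points won by A
    return -400 * math.log10(1 / score - 1)

# Scores quoted in the post above (Stockfish NNUE listed first)
print(round(elo_difference(62.5, 59.5), 1))  # vs Stockfish 11: small positive gap
print(round(elo_difference(57.5, 63.5), 1))  # vs LC0: small negative gap
```

Both gaps come out well under 20 Elo, which fits the cautious reading of these results.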

Avatar of infinitless

30 MILLION NODES PER SECOND????????? WOAH!!!!!!!!!!! @Dyslexic_Goat - man what's your rig and what does it look like and cost!!!? I am looking to build some decent hardware but can't get more than a few thousand nodes per second using SF and a few hundred using LC0.

Avatar of Dyslexic_Goat

You're probably getting a few million nodes per second with Stockfish, actually. I know Arena and ChessBase list the speed in kilonodes once it's over a million, so if there's a k (as in kN/s) next to the number, multiply what you see by 1,000.
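A quick sketch of that conversion, in case it helps. The unit labels here are assumptions; different GUIs may print them differently.

```python
def to_nodes_per_second(value: float, unit: str) -> float:
    """Convert a GUI speed reading to plain nodes per second.

    The unit strings are illustrative; check how your GUI labels them.
    """
    factors = {"n/s": 1, "kn/s": 1_000, "mn/s": 1_000_000}
    return value * factors[unit.lower()]

# A reading of "2500 kN/s" is really 2.5 million nodes per second
print(to_nodes_per_second(2500, "kN/s"))
```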

 

As for hardware, what we've got here is a Ryzen 9 3900X for the CPU and an RTX 2060 for the GPU. In total the rig was built on a $2,000 budget before adding two monitors. The Ryzen 9 3900X is probably the best you can get on the market for that budget. The only options I saw beating it were four-digit CPUs. As for the RTX, upgrading to the 2070 and onward wouldn't hurt too much, so it really depends on how much you want to spend on graphics. I will say that my 2060 can run LC0 at max strength and GTA 5 on ultra graphics on a 4K 32-inch monitor without too much of a drop in framerate, so even with the "budget" 2060 you'll still be a decently happy customer.

Note that while LC0 and Allie both run far faster on the GPU version than on the CPU version, they're still not going to analyze very fast or deep. It's just a quirk of how they operate. They do far more efficient analysis on a per-node basis, at the expense of speed.

On that note I'll give a nod to pfren: I've doubled the time control to 2 minutes per move after Allie really started struggling to keep up as the match progressed.

Avatar of pfren
infinitless wrote:

30 MILLION NODES PER SECOND????????? WOAH!!!!!!!!!!! @Dyslexic_Goat - man what's your rig and what does it look like and cost!!!? I am looking to build some decent hardware but can't get more than a few thousand nodes per second using SF and a few hundred using LC0.

 

NPS is a figure which is of real interest only to programmers, but even on very old hardware you can get much higher numbers; e.g., I was getting some two million nodes per second on an ancient Athlon II X4 640. Are you sure you are not using just one core, and/or that pruning is not disabled?

Otherwise, I can only think you are confusing nodes with kilonodes.

The calculation speed is irrelevant to the quality of the analysis: a higher node rate just means that the same evaluations are displayed sooner. For better quality, you can either use a software manager (like Aquarium's IDEA) or your brains, playing forward the lines which are, to your knowledge, the most consistent/promising/safe. That means you let the engine handle the tactical part, where Stockfish is extremely good indeed, and very fast compared to other top engines.

So far I have been using the latter approach (I tried IDEA as well and did not like the output), and I can't complain at all: very decent results in high-level correspondence chess. And I am using an old i7-4790K with a monstrous Noctua air cooler, which averages 8 million nodes per second, but in CC speed is not a real issue; there is plenty of time to work on a position.

I guess that for serious CC analysis something like a Ryzen 3700X with a bulky cooler is fine, and should average some 30 million nodes per second. I wouldn't use a CPU with a higher TDP; there is always the risk of overheating with constant 100% CPU usage, even when using expensive cooling systems. Better to spend more time than to replace a fried processor! And, needless to say, the stock AMD air cooler should be forwarded to the wastebin.

1, 2, or 5 minutes per move is not serious analysis, especially with modern engines, which tend to prune moves a lot.

Avatar of Dyslexic_Goat
pfren wrote:

1,2, or 5 minutes per move is not serious analysis, especially with modern engines, which tend to prune moves a lot.

 

Depends on the speed of the hardware. I made the mistake of assuming 1 minute per move would be a fair time control for LC0 & Allie, and they paid the price in the first round of testing. 

As for whether a five-minute analysis can be worth anything, I'd say the equation can be relatively straightforward: a 30-minute analysis on a machine running at 1k nps is worth the same as a 1-minute analysis on a second machine running at 30k nps. But in the end it's only the quality of the output that really matters.
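To spell out the arithmetic behind that comparison, here is a minimal sketch. It assumes the only thing that matters is the raw number of nodes searched, which is itself debatable, and the function names are just for illustration.

```python
def nodes_searched(nps: float, seconds: float) -> float:
    """Total nodes visited at a constant speed over a given time."""
    return nps * seconds

def equivalent_seconds(nps_a: float, seconds_a: float, nps_b: float) -> float:
    """Time machine B needs to search as many nodes as machine A did."""
    return nodes_searched(nps_a, seconds_a) / nps_b

slow = nodes_searched(1_000, 30 * 60)   # 30 minutes at 1k nps
fast = nodes_searched(30_000, 1 * 60)   # 1 minute at 30k nps
print(slow == fast)                      # same node budget either way
print(equivalent_seconds(1_000, 30 * 60, 30_000))  # seconds needed on the fast machine
```

Equal node counts say nothing about pruning or evaluation quality, so this is a budget comparison rather than a measure of analysis quality.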

Avatar of infinitless

I just use a laptop with 8 core CPU, Intel i7, 16 GB RAM. NO GPU. Just onboard graphics. I get about 2500 kN/s for SF and only 500 N/s for LC0 - I really am looking for a cheap option to run LC0 on better hardware; but the options of using GPUs are too expensive. Anyway, congratulations on your wonderful setup!

Avatar of Dyslexic_Goat

You won't find cheap [good] things in the laptop world. [edit: though an i7 should be decent for the CPU side] But if you're building a custom desktop rig you can definitely get reasonable results with a $1000-1500 budget right now. 

Avatar of pfren
infinitless wrote:

I just use a laptop with 8 core CPU, Intel i7, 16 GB RAM. NO GPU. Just onboard graphics. I get about 2500 kN/s for SF and only 500 N/s for LC0 - I really am looking for a cheap option to run LC0 on better hardware; but the options of using GPUs are too expensive. Anyway, congratulations on your wonderful setup!

 

Something wrong with your UCI config for sure.

I was getting more than 2500 kN/s on an old Lenovo with an i3-3110M (2+2 cores, using 3 of them for the engine).

My current 6+6 core i7-9750H Lenovo (it also has an Optimus Nvidia GTX 1650, about which I don't really care) tops 10M nodes per second, but I don't use it for serious analysis (having the CPU constantly at 90 degrees Celsius is definitely a bad idea).
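When a UCI config is suspected, the two settings usually worth checking first are the thread count and the hash size. The sketch below only builds the UCI command strings a GUI would send; it assumes the engine exposes the standard Threads and Hash options (Stockfish does), and the values are illustrative, not recommendations.

```python
from typing import List

def uci_setup_commands(threads: int, hash_mb: int) -> List[str]:
    """Build the UCI commands that set thread count and hash size.

    Assumes the engine supports the common 'Threads' and 'Hash' options.
    The helper name and the values passed in are hypothetical.
    """
    return [
        "uci",                                       # start the UCI handshake
        f"setoption name Threads value {threads}",  # search threads to use
        f"setoption name Hash value {hash_mb}",     # hash table size in MB
        "isready",                                   # wait for the engine to apply them
    ]

# Example: leave one logical core free on a 2+2 core laptop
for command in uci_setup_commands(threads=3, hash_mb=1024):
    print(command)
```

Most GUIs set these through an engine options dialog, so typing the commands by hand is rarely needed; the point is only which options to look at.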

Avatar of pfren
Dyslexic_Goat wrote:

You won't find cheap [good] things in the laptop world. [edit: though an i7 should be decent for the CPU side] But if you're building a custom desktop rig you can definitely get reasonable results with a $1000-1500 budget right now. 

Mid-budget laptops with the 8+8 core Ryzen 7 4800U, which should be able to pull some 16M nodes per second, are already available. Still, a laptop is not good for serious analysis, only for occasional use.

Avatar of Dyslexic_Goat

That’s not bad honestly. My previous desktop had an i7 4600K if I remember right. On 8 cores it was only benchmarking 5-8 million nodes per second.

The issue now with laptops is the new arms race. Before, it was Stockfish against Houdini and Komodo, but at least they all ran on the same hardware. Now though, it’s far too early to say which of the top three will have the highest rate of skill growth, leaving someone with only CPU resources at a potential disadvantage.

Between the Discord channels of Stockfish NNUE and LC0, there are new neural nets being posted basically every other day, followed by a bunch of volunteer testing to figure out whether the new nets show any statistical difference in skill. The arms race is real.

Avatar of nighteyes1234
Dyslexic_Goat wrote:

The issue now with laptops is the new arms race. Before it was Stockfish with Houdini and Komodo, but at least they all ran on the same hardware.
Now though, it’s far too early to say who between the top three will have the highest rate of skill growth, leaving someone with only CPU resources at a potential disadvantage.

 

Between the Discord channels of Stockfish NNUE and LC0, there’s new neural nets being posted basically every other day, then a bunch of volunteer testing to figure out if the new net(s) show any statistical differences in skill. The arms race is real.

 

I nominate you for secret chess player on the team NNUE or whatever it's called.

I suggest reading drmrboss's posts about Leela for what to brag/post... just substitute out Leela.

Could even go back further and use Lyudmil's posts... for example, post a 1+1 game as definitive proof... just let NNUE have unlimited takebacks, time, and saves... while really giving the other engine 15 secs for the game.

Avatar of pfren

Actually there are MANY players in correspondence chess who have bought a strong rig and think exactly like Dyslexic Goat. You can easily tell: they reply within a couple of hours, and they play what the engine initially suggests as the strongest move.

I am very happy playing against them; they are the best and most profitable client base for a serious correspondence player. Their moves are always predictable, and you just have to figure out what the engine has anticipated wrongly because its proud owner was fool enough to let it run for just a dozen minutes. Usually it is not THAT tough: you just have to analyse seriously, and you will find good chances many times.

Avatar of Dyslexic_Goat

I'm not going to claim to be a serious correspondence player. My own chess ability's too much of a hindrance to help my rig. I've never truly played to win, especially in my own matches.

Are they always predictable, though?

It's moments like this, when all of a sudden White decides d5!! is a winning move, that are why I explore what our engine overlords think. For me, it's never going to be about winning or even figuring out which engine is the "best" -- that answer is changing by the hour, assuming an answer even exists -- it's more about finding ideas that hadn't been considered by mortals before. It can take exploring thousands of lines with several different engines to find something new, but that's just part of what makes it rewarding.

 

[post edited, setup on diagram was incorrect]

Avatar of Dyslexic_Goat

Allie+Stein 0.8, net 15, no opening book, vs Stockfish NNUE 20200728-0633.

The Scandinavian's never been a popular opening for engines, which have never really seen the point of it from Black's side. At the same time, however, the scores don't always reflect this pessimism -- while an engine playing Black in this opening may be lucky to find a win anywhere, there are still plenty of drawing chances. Unfortunately not so today.

 

Avatar of pfren

You seem very confused. At move 7 you write "a novelty for my opening theory" (true, there are only 152 games like that in correspondence chess alone!) and one move later, you write "mainline is h3".

But the funny part comes later:

 

"White gains at least a tempo here."

Here, you even fail to realize a trivial thing: That ...Bh5/ ...Bxf3 and Qe2/Qxf3 is EXACTLY the same as a direct ...Bxf3, as both sides have lost a tempo.

And of course, you fail to mention that voluntarily giving the bishop pair to your opponent without an apparent need to take at f3 and/or gaining some positional benefits does not make sense strategically, and it is just the result of an engine assuming nonsense because of the limited time/ply depth allowed.

"And it is here no human has gone before."

Sure, although even the freely available online Chessbase database features 14 games starting from here.

Do you expect your analysis to be taken seriously with simple misses like this one?

Avatar of aidan_mclau

What I find interesting is that NNUE embraces a neural-network-based evaluation function, but not a policy function. Remember, not only does LC0 use a neural network to judge how good the position looks at a given node, it also uses a network to choose which paths to analyze further. On the evaluation side, both LC0 and AlphaZero have demonstrated the superiority of the neural approach, so much so that Stockfish is now adopting a neural-net-based evaluation function of its own. Yet NNUE takes on the neural evaluation without a policy network. Will the neural network triumph for the policy function as well? I guess we need more testing of LC0 vs NNUE to find out!

Avatar of Dyslexic_Goat

Absolutely! 

There's also an important detail here in the NNUE project: it's so easy to set up new eval files that entirely new engines are being created by other coders experimenting with it. Several brand-new ideas have been tried already. The new engines borrow Stockfish's exe file and structure, but the eval functions are trained on different positions.
The most exciting so far appears to be one named NightNurse 0.1. I had read about it but was skeptical until it did exceptionally well in testing. Apparently, it is a neural net trained on LC0 matches, creating a playing style that the author claims "is as far from Stockfish as you can get". Debatable considering the hardware the two engines share, but on the board it does have a much different playing style. Not quite as aggressive as Stockfish NNUE, but still nasty to watch.

In terms of pure Elo differences, Stockfish NNUE's Sergio net is dominating. This is the one I've been running in my own analysis, and I will generally refer to it simply as "Stockfish NNUE".

There are a few others. Someone built a neural net that trained by fighting Toga II over and over, then named the net Toga III. Lol. There's another, LizardFish, trained by beating up Komodo. NNUE stands for "efficiently updatable neural network" (the acronym is written in reverse), and so far its design has proven so strong that it can inspire creativity from other projects.

Very fun to observe!

Avatar of Dyslexic_Goat

A coder more familiar with the behind-the-scenes world has mentioned a few other points of interest:

 

Stockfish NNUE, so far, has been an experiment. A deviation from the rest of the Stockfish project. To date, it hasn't actually been merged with Stockfish's main resources.

Why is this a big distinction?

As it turns out, as part of its development resources, Stockfish runs a volunteer network of 25,000-45,000 CPU cores to test experimental development builds and see whether minor code changes produce an Elo gain.

However, NNUE has not been part of this CPU testing network, referred to as Fishtest, because its code has a few differences from Stockfish [main] that need to be reconciled before it can work with Fishtest.

It is also said that once Stockfish NNUE has been reconciled with SF 11/dev, it will then merge into Fishtest and use that CPU network for self-training.

All heck will soon break loose. Not a single word from the developers of Houdini/Komodo, but LC0's Discord channel is starting to catch on that a new wrecking ball is gaining momentum.

All fantastic news for the end user! All a horrifying nightmare for anyone trying to keep up with Stockfish! Stay tuned!

Avatar of Dyslexic_Goat

It is finished. What amazing times we live in.

 

Stockfish NNUE has been officially integrated into Stockfish's dev cycle. And the tsunami began almost immediately: in the last 24 hours SF dev has been updated no fewer than eight times, for a total Elo gain of +127 over the July 31st version.

Today's gain will be one for the record books!