Major ELO gain achieved by SF 15.1 Dev

Thank you for posting your test results. Test results are affected by a number of factors; even the people at the CCRL list warn that every estimated ELO is subject to a "+" / "-" tolerance. I've read that one key aspect to consider is the neural network file, and that very recently (even today!) both SF and the NNUE file have been optimized for speed, which should, as they claim, bring the strength increase. The file to use is the one that today's SF reports as its default net when the UCI command is given.

Again: which NNUE file did you use? You should compare the original "15.1" with its own NNUE file (the one that version reports as its default when the UCI command is given) against the very latest SF 15.1 dev with the most recent NNUE file. You should not mix things. I suspect you did not perform the test as I'm describing it. Am I correct?
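In case it helps, here is a minimal Python sketch of how one could check which net a given binary reports as its default. It assumes the engine answers the `uci` command with a line of the form `option name EvalFile type string default <file>`, which is how Stockfish builds of this period list the net as far as I know. The function names and the engine path in the usage note are my own placeholders, not anything from the SF project.

```python
import subprocess


def default_net_from_uci(uci_output):
    """Return the default value of the EvalFile option, or None if absent.

    Assumes the engine prints a line such as
    'option name EvalFile type string default <file>'.
    """
    for line in uci_output.splitlines():
        tokens = line.split()
        if tokens[:3] == ["option", "name", "EvalFile"] and "default" in tokens:
            idx = tokens.index("default")
            # The value follows the 'default' keyword, if there is one.
            return tokens[idx + 1] if idx + 1 < len(tokens) else None
    return None


def query_engine(engine_path, timeout=10):
    """Start the engine, send 'uci' then 'quit', and parse its reply.

    engine_path is a hypothetical path to a local engine binary.
    """
    proc = subprocess.run(
        [engine_path],
        input="uci\nquit\n",
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return default_net_from_uci(proc.stdout)
```

Calling `query_engine("./stockfish")` on each of the two binaries (the path is just an example) and comparing the results would show whether both were tested with the net they ship with.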

Then I don't know why your results show a difference of 4 points, while the SF dev team's tests speak of 10 points. I don't have enough information to say which is right and why. I'm glad that this version is stronger than the previous one; I'm happy for the SF team and for us, of course. Maybe it's going to be called 15.2 instead of 16, if 16 is "too much" for the increase it's having now. Given what I use SF for, which is mostly to help me analyse my own games, I'm sure I would not notice the difference between +10 ELO points and +4. That's all I can say if you have a strong opinion that the SF dev test results are inflated or false.
I still can't understand why they would publish false information, if all it takes to prove they're lying is for someone at home to run a 100-game test.
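For reference, the "+" / "-" tolerance on a small test can be estimated roughly as follows. This is only a sketch using the usual logistic ELO formula and a normal approximation on the score; it is not necessarily how CCRL or the SF testing framework compute their margins, and the game counts in the usage note are made up for illustration.

```python
import math


def elo_from_score(score):
    """Convert a score fraction (strictly between 0 and 1) to an ELO difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Uses a normal approximation on the mean score; z=1.96 corresponds
    to roughly a 95% interval. Assumes at least one game was played.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    std_err = math.sqrt(variance / n)
    # Clamp so the logarithm stays defined for very lopsided results.
    eps = 1e-6
    low_score = min(max(score - z * std_err, eps), 1.0 - eps)
    high_score = min(max(score + z * std_err, eps), 1.0 - eps)
    score = min(max(score, eps), 1.0 - eps)
    return elo_from_score(score), elo_from_score(low_score), elo_from_score(high_score)
```

Trying it with an invented 100-game result such as `elo_with_margin(20, 65, 15)` gives an interval that is tens of ELO points wide, so a test of that size cannot tell a +4 gain from a +10 gain. The interval only narrows with many more games.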

I run a live chess engine testing channel on YouTube. When I get off work I will set up a live test and broadcast it here in real time.
Interesting! Yes please, much appreciated.

I'm very impressed by the hardware configuration. I've joined your channel; thank you for broadcasting. Question: have you ever tried reaching out to the SF dev team to talk about the test framework? This may clarify why your tests give one result and theirs another. Generally, open discussions with a positive attitude enrich both parties' knowledge, IMHO.


Something I have noticed about the recent development versions is that the file is a good bit larger (68 MB compared to ~46 MB in the past), suggesting a larger NNUE file. That doesn't necessarily mean it is better, although it likely is, as +20 ELO is really not that dramatic a difference.
Today a breakthrough gain in strength was achieved by the Stockfish team.
Nowadays we're used to a new 15.1 Dev (or 15.1+) being released pretty much every couple of days, with an average gain of a fraction of an ELO point between subsequent versions.
The latest SF, released today, sports an impressive gain of 10 ELO points by means of an approximately 10% gain in speed, depending on the CPU on which it runs.
Source code and Binaries: https://www.abrok.eu/stockfish/
Kudos to the SF developers!