Major ELO gain achieved by SF 15.1 Dev

Avatar of agatti1970

Today a breakthrough gain in strength was achieved by the Stockfish team.

Nowadays we're used to a new 15.1 Dev (or 15.1+) build being released pretty much every couple of days, with an average gain of a fraction of an ELO point between successive versions.

The latest SF released today sports an impressive gain of 10 ELO points, obtained through an approx. 10% gain in speed, depending on the CPU on which it runs.

Source code and binaries: https://www.abrok.eu/stockfish/

Kudos to the SF developers!

Avatar of yolosolo123
Wow
Avatar of agatti1970

Thank you for posting your test results. Test results are affected by a number of factors; even the people behind the CCRL list warn that every estimated ELO rating comes with a "+" / "-" tolerance. I've read that one key aspect to consider is the neural network file, and that very recently (even today!) both SF and the NNUE file have been optimized for speed, which should, as they claim, bring the strength increase. The file to use is the one that today's SF reports as its default net when the UCI command is given.
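To make the "+" / "-" tolerance concrete, here is a minimal sketch of how an ELO difference and a rough 95% margin can be estimated from a match result. It assumes the usual logistic ELO formula and a normal approximation; the function names and the win/draw/loss counts in the usage line are made up for illustration and are not results from this thread.

```python
import math

def elo_from_score(score):
    """Convert a score fraction (0 < score < 1) into an ELO difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match, using a normal approximation.

    z=1.96 corresponds to a ~95% interval. Assumes enough games that
    score +/- z*stderr stays strictly between 0 and 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr))

# Hypothetical 100-game match: 30 wins, 50 draws, 20 losses.
elo, low, high = elo_with_margin(30, 50, 20)
print(f"{elo:+.1f} ELO, 95% interval [{low:+.1f}, {high:+.1f}]")
```

With a match this short the interval is wide enough to include zero, which is why two test setups can report different numbers without either being wrong.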

Avatar of agatti1970

Again: which NNUE file did you use? You should compare the original "15.1" with its specific NNUE file (the one that version reports as its default when the UCI command is given) against the very latest SF 15.1 dev with the most recent NNUE file. You should not mix things. I suspect you did not perform the test as I'm describing it. Am I correct?
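One way to check which net a given binary expects is to read the `EvalFile` option line the engine prints in reply to the `uci` command. The sketch below only parses text that has already been captured from the engine; the helper name `default_net` and the sample output (including the net file name) are hypothetical placeholders, not real Stockfish output.

```python
def default_net(uci_output):
    """Return the default value of the EvalFile option, or None if absent."""
    for line in uci_output.splitlines():
        tokens = line.split()
        # UCI option lines look like:
        #   option name <Name> type <type> default <value> ...
        if tokens[:3] == ["option", "name", "EvalFile"] and "default" in tokens:
            idx = tokens.index("default")
            return tokens[idx + 1] if idx + 1 < len(tokens) else None
    return None

# Hypothetical captured output; the net name is a placeholder.
sample = """id name Stockfish dev (hypothetical sample)
option name Threads type spin default 1 min 1 max 1024
option name EvalFile type string default nn-0000000000a0.nnue
uciok"""

print(default_net(sample))
```

Running this against the output of both binaries makes it easy to confirm that each engine is paired with its own default net before starting a match.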

Avatar of agatti1970

Then I don't know why your results show a difference of 4 points, while the SF dev team's tests speak of 10 points. I don't have enough information to say which is right and why. I enjoy the fact that this version is stronger than the previous one; I'm happy for the SF team and for us, of course. Maybe it's going to be called 15.2 instead of 16, if 16 is "too much" for the increase it's having now. Given what I use SF for, which is mostly to support me in analysing my own games, I'm sure I would not benefit from having +10 ELO points instead of +4. That's all I can say if you have a strong opinion that the SF dev test results are inflated or false.

I still can't understand why they would make a false claim, if all it takes is for someone at home to run a 100-game test to prove they're lying.
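As a rough sanity check on how much a 100-game test can actually show, here is a back-of-the-envelope sketch. It assumes two nearly equal engines, a normal approximation, and a draw ratio of 60% that I picked purely for illustration; the function names are hypothetical too.

```python
import math

# Slope of the ELO curve at a 50% score: d(elo)/d(score) = 1600 / ln(10).
ELO_PER_SCORE_UNIT = 1600.0 / math.log(10)

def elo_margin(games, draw_ratio, z=1.96):
    """Approximate +/- margin in ELO after `games` between near-equal engines."""
    per_game_var = (1.0 - draw_ratio) / 4.0  # variance of a single game result
    return z * ELO_PER_SCORE_UNIT * math.sqrt(per_game_var / games)

def games_needed(target_margin, draw_ratio, z=1.96):
    """Approximate number of games for a +/- `target_margin` ELO interval."""
    per_game_var = (1.0 - draw_ratio) / 4.0
    return math.ceil(per_game_var * (z * ELO_PER_SCORE_UNIT / target_margin) ** 2)

# Assumed draw ratio of 0.6 (an illustration, not a measured value).
print(f"100 games: +/- {elo_margin(100, 0.6):.0f} ELO")
print(f"games for +/- 3 ELO: {games_needed(3, 0.6)}")
```

Under these assumptions a 100-game match has a margin of several tens of ELO points, so telling +4 apart from +10 would take a far longer match.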

Avatar of agatti1970
DesperateKingWalk wrote:

I run a live chess engine testing channel on YouTube. When I get off work I will set up a live test in real time and broadcast it here.

Interesting! Yes please, much appreciated.

Avatar of agatti1970

I'm very impressed by the hardware configuration. I've joined your channel; thank you for broadcasting. Question: have you ever tried reaching out to the SF dev team to talk about the test framework? That might clarify why your tests give one result and theirs another. Generally, open discussions with a positive attitude enrich both parties' knowledge, IMHO.

Avatar of agatti1970
Today, at this address, CCRL shows the latest SF dev approx. 20 ELO points above 15.1 at long time controls: http://ccrl.chessdom.com/ccrl/4040/cgi/compare_engines.cgi?family=Stockfish&print=Rating+list&print=Results+table&print=LOS+table&print=Ponder+hit+table&print=Eval+difference+table&print=Comopp+gamenum+table&print=Overlap+table&print=Score+with+common+opponents
Avatar of agatti1970
Not sure how to interpret this, or your tests or anyone else's, then.
Avatar of drdos7

Something I have noticed about the recent development versions is that the file is a good bit larger (68 MB compared to ~46 MB in the past), suggesting a larger NNUE file. That doesn't necessarily mean it is better, although it likely is, as +20 ELO is really not that dramatic a difference.