Stockfish dethroned

Elroch
Lyudmil_Tsvetkov wrote:

What graphs?

The match was played at 1 minute fixed time per move.

Stockfish approaching perfection? SF is currently at around 3200 or so, and the perfect player would be at least 5000 elo, maybe much more, this has been discussed frequently on Talkchess.

The graphs in the paper on AlphaZero by the Deep Mind team

We have discussed it here as well over the last 7 years in the thread "the rating of a perfect player".

By approaching perfection, I meant that I would expect a good chess algorithm to asymptotically approach perfect play as the computational resources tend to infinity, ignoring the practicalities of this. The graphs in the paper suggest neither program would reach a much higher rating with huge amounts of computation (this is just my interpretation of the graphs, based on experience, feel free to think otherwise).

 

hairhorn
chesstauren wrote:

Unless google release the details (time controls, CPU speeds etc...) I'm calling shenanigans.

 

Try reading the paper! There are a few objections you might make, but secrecy is not really one of them.

pfren
chesstauren wrote:

Unless google release the details (time controls, CPU speeds etc...) I'm calling shenanigans.

All details have been released.

Stockfish was an old version (8) running with only 1 GB of RAM.

There is no doubt that AI is the future of chess, but smashing a crippled engine (too little RAM, no opening book) is not proof of anything. You could rather call it another Google marketing trick.

You can add to that the silly time control used in the match.

nighteyes1234

I see the shenanigans, but I think this one probably is better. Another game where it showed it was superior was game 5 as White: a Queen's Gambit, Polugaevsky Gambit. While Stockfish would have claimed it was rigged by being forced to play 19...Qh7 instead of 19...Qe6, I got it to admit it was a draw no matter what after 20.Nc3 in the 19...Qe6 line.

Thus it looked to be another shenanigan game... however, White's attack was totally missed by Stockfish. Stockfish thought it had an advantage as Black the whole way and didn't see 3-4 of AlphaZero's middlegame moves in its top 3 calculations, including 18.h4, which started the attack. Then it thought the attack was bogus until it finally admitted White had a dead draw after 22.Qf4. Also, each of Black's moves until move 19 was Stockfish's top choice.


BarronBrowne
chiefonion wrote:

So, how do we apply this technology to real life? If we take one of those Japanese robots that seems eerily real, add AlphaZero to it, we can have a partner that learns to treat us better. We go to work, come home, and it has calculated how to be a better wife or husband. I like this idea.


Or you come home and it has calculated how much easier its life would be if it strangles you and then buries you in the cellar.

Elroch
pfren wrote:
chesstauren wrote:

Unless google release the details (time controls, CPU speeds etc...) I'm calling shenanigans.

All details have been released.

Stockfish was an old version (8) running with only 1 GB of RAM.

I am pretty sure you are wrong. It was a 1GB hash table. While this may be its major RAM use, this was not hardware limited. It had 64 threads, which means 32 cores using hyperthreading.

There is no doubt that AI is the future of chess, but smashing a crippled engine (too little RAM, no opening book) is not proof of anything. You could rather call it another Google marketing trick.

You can add to that the silly time control used in the match.

The paper on the work shows that the strength of AlphaZero increases more with time per move than Stockfish, so it would get tougher with a longer time control. With less than 0.3 seconds per move, Stockfish was stronger.

 

hairhorn

You don't need 32 cores to run 64 threads. One is enough, although more is obviously better. 

Lyudmil_Tsvetkov
Elroch wrote:
Lyudmil_Tsvetkov wrote:

What graphs?

The match was played at 1 minute fixed time per move.

Stockfish approaching perfection? SF is currently at around 3200 or so, and the perfect player would be at least 5000 elo, maybe much more, this has been discussed frequently on Talkchess.

The graphs in the paper on AlphaZero by the Deep Mind team

We have discussed it here as well over the last 7 years in the thread "the rating of a perfect player".

By approaching perfection, I meant that I would expect a good chess algorithm to asymptotically approach perfect play as the computational resources tend to infinity, ignoring the practicalities of this. The graphs in the paper suggest neither program would reach a much higher rating with huge amounts of computation (this is just my interpretation of the graphs, based on experience, feel free to think otherwise).

 

This depends on a lot of things, and indeed, infinite computational resources (which, btw, simply don't exist) won't solve chess: no matter how big your hardware and how deep your search, you still have to pick the right moves, and that is only done on the basis of good evaluation.

So, the evaluation should gradually improve too.

The fact that strength does not improve on the graph could only imply that increasing computational resources is unable to improve it, not that improvement is altogether impossible. As said, you need more refined evaluation with each new version.

But that whole paper leaves a lot of things unclear, so it is not exhaustive at all.

You cannot compare chess.com with Talkchess: on Talkchess, there are at least 3000 programmers.

Elroch
hairhorn wrote:

You don't need 32 cores to run 64 threads. One is enough, although more is obviously better. 

I think you are confusing the number of threads that can run at a specific time with the number of processes or tasks that can be swapped in and out. Modern Intel cores run two threads, an advance over earlier cores that ran one. [When running two threads, a core is only about 30% faster than when running one, due to the shared resources.]
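A rough sketch of the arithmetic (the ~30% hyperthreading gain is the figure quoted above; the function name and defaults are illustrative, not from the paper):

```python
def effective_throughput(physical_cores: int, threads_per_core: int = 2,
                         smt_gain: float = 0.3) -> float:
    """Rough "core-equivalents" when running SMT/hyperthreaded threads.

    Assumes a second thread per core adds ~30% throughput, as quoted
    above. Purely illustrative back-of-the-envelope math.
    """
    if threads_per_core <= 1:
        return float(physical_cores)
    return physical_cores * (1.0 + smt_gain)

# 64 threads on 32 physical cores ~ 41.6 core-equivalents,
# not the 64 cores' worth the thread count might suggest.
print(effective_throughput(32))  # 41.6
```

So "64 threads" on a hyperthreaded 32-core machine buys considerably less than twice the single-threaded-per-core throughput.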

pfren
Elroch wrote:
pfren wrote:

 

I am pretty sure you are wrong. It was a 1GB hash table. While this may be its major RAM use, this was not hardware limited. It had 64 threads, which means 32 cores using hyperthreading.

 

 

And I am pretty sure you are wrong thinking I was wrong.

Stockfish DID run with just 1 GB of RAM, while AlphaZero was running on a 4-TPU infrastructure.

Just find the match details, it isn't that hard!

hairhorn

The only real connection between number of threads and number of cores is that Stockfish recommends one thread per core. Other settings are possible but sub-optimal. 
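For context, "Threads" and "Hash" are standard UCI options, so the reported match settings (64 threads, a 1 GB hash table) would be set on any UCI engine such as Stockfish with commands along these lines (illustrative; Hash is in MB, so 1024 = 1 GB):

```text
uci
setoption name Threads value 64
setoption name Hash value 1024
isready
```

Note that the hash size is a per-engine setting chosen by whoever configures the match, not a property of the machine's RAM.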

Elroch

pfren, that is ridiculous: you would be hard-pressed to find a PC these days with only 1GB RAM. Provide a link. Meanwhile the paper by DeepMind refers to 64 threads and a 1GB hash table.

More is not necessarily better: here is a graph of performance against hash table size for Houdini:

[graph: Houdini performance vs hash table size]
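The diminishing returns with hash size make sense given how a transposition table works: once the table is big enough to hold the positions the search actually revisits, extra entries go unused. A minimal sketch of a fixed-size table (a hypothetical class for illustration, not Stockfish's actual implementation):

```python
class TranspositionTable:
    """Minimal fixed-size transposition table (illustrative only).

    Real engines index by a Zobrist hash of the position; here we just
    take the key modulo the table size and always replace on collision.
    """

    def __init__(self, n_entries: int):
        self.entries = [None] * n_entries
        self.n = n_entries

    def store(self, key: int, value) -> None:
        # Always-replace scheme: a colliding position evicts the old entry.
        self.entries[key % self.n] = (key, value)

    def probe(self, key: int):
        slot = self.entries[key % self.n]
        if slot is not None and slot[0] == key:
            return slot[1]
        return None  # miss, or overwritten by a collision


tt = TranspositionTable(1 << 16)  # 65,536 entries
tt.store(123456789, {"depth": 12, "score": 35})
print(tt.probe(123456789))  # {'depth': 12, 'score': 35}
```

Doubling the table helps only to the extent that it turns collisions (evicted entries) back into hits, which is why the curve flattens.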

Elroch
hairhorn wrote:

The only real connection between number of threads and number of cores is that Stockfish recommends one thread per core. Other settings are possible but sub-optimal. 

If that setting was used, Stockfish was running on 64 cores, which is a very powerful computer. 32 cores is almost as impressive.

vickalan
pfren wrote:

...smashing a crippled engine (too little RAM, no opening book) is not a proof of anything... 

 

Agree 100%. And Stockfish was on much slower hardware.😐

Lyudmil_Tsvetkov
Elroch wrote:
hairhorn wrote:

The only real connection between number of threads and number of cores is that Stockfish recommends one thread per core. Other settings are possible but sub-optimal. 

If that setting was used, Stockfish was running on 64 cores, which is a very powerful computer. 32 cores is almost as impressive.

Not impressive at all next to what Alpha had.

 

Lyudmil_Tsvetkov

It trained itself to this level of play in only 4 hours.

When you wake up next morning, it will already be 4000, chess will be solved, and this forum will shut down.

Elroch
Lyudmil_Tsvetkov wrote:
Elroch wrote:
hairhorn wrote:

The only real connection between number of threads and number of cores is that Stockfish recommends one thread per core. Other settings are possible but sub-optimal. 

If that setting was used, Stockfish was running on 64 cores, which is a very powerful computer. 32 cores is almost as impressive.

Not impressive at all next to what Alpha had.

 

I keep referring to the graphs from the DeepMind paper.

These strongly suggest that even if Stockfish were given thirty times longer per move, it would have gained surprisingly few Elo points and not done much better.

[graph from the DeepMind paper: Elo rating vs thinking time per move]

Elroch
Lyudmil_Tsvetkov wrote:

It trained itself to this level of play in only 4 hours.

When you wake up next morning, it will already be 4000, chess will be solved and this forum will shut down.

It appears not. Take a look at the graph of the standard of play versus the stage of learning. AlphaZero's standard of play seemed to have plateaued after it got about 100 points stronger than Stockfish.

[graph from the DeepMind paper: AlphaZero Elo vs training time]
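For scale, the standard Elo expected-score formula shows what a roughly 100-point edge means over the board (a small sketch; the 100-point figure is my reading of the graph):

```python
def elo_expected_score(rating_diff: float) -> float:
    """Expected score for the stronger side, given an Elo gap.

    Standard logistic Elo formula: E = 1 / (1 + 10^(-diff/400)).
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

# A ~100-point edge corresponds to scoring about 64% of the points.
print(round(elo_expected_score(100), 3))  # 0.64
```

So a 100-point plateau above Stockfish means winning matches clearly, but nowhere near the near-perfect scores a "solved chess" engine would post.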

vickalan

Keep in mind that those Elo calculations are based on games with unequal hardware. Elos should not be calculated until games under fair conditions are first conducted.😐

Elroch

Look at the first graph and you will see that both programs appear to have gotten close to a performance ceiling, where more computing power doesn't make much difference (more computing power is exactly the same as more time per move with fixed computing power). This is a surprise to me, but the graph tells a very clear story. Note that AlphaZero appears to have a little more upside than Stockfish (its slope is always steeper).