Stockfish dethroned - Chess Forums - Page 4

redreoicy · 2017-12-05T21:53:19-08:00

DeepMind has released a paper with a generalized AlphaZero that plays Go, shogi, and chess. It scored 28 wins with 0 losses in a 100-game series against Stockfish. Here is one of its wins with black. Of note is that AlphaZero uses another form of tree search than alpha-beta pruning, and uses no ta

Lyudmil_Tsvetkov

Dec 8, 2017

0

#61

Elroch wrote:

Lyudmil_Tsvetkov wrote:

Elroch wrote:

hairhorn wrote:

The only real connection between number of threads and number of cores is that Stockfish recommends one thread per core. Other settings are possible but sub-optimal.

If that setting was used, Stockfish was running on 64 cores, which is a very powerful computer. 32 cores is almost as impressive.

Not impressive at all next to what Alpha had.

I keep referring to the graphs from the DeepMind paper.

These show that if Stockfish was given thirty times longer per move it would have gained surprisingly few Elo points, and not done much better.

This is already bogus.

Doubling time is usually more important than doubling speed, and this is significantly more time.

I guess they have measured something wrong. We should not believe everything as they wrote it.

Lyudmil_Tsvetkov

Dec 8, 2017

0

#62

Elroch wrote:

Lyudmil_Tsvetkov wrote:

It trained itself to this level of play in only 4 hours.

When you wake up next morning, it will already be 4000, chess will be solved and this forum will shut down.

It appears not. Take a look at the graph of the standard of play versus the stage of learning. AlphaZero's standard of play seemed to have plateaued after it got about 100 points stronger than Stockfish.

That is what I am saying: this project is dead.

From 0 to 2000 elo it was extremely easy to tune their parameters: a 2000 player, no matter if human or machine, needs to know only about piece values, psqt, weak pawns, shelter attacks and passers.

From 2000 to around 2600 you need further positional terms, like all imaginable tactical checks and pins, connected passers, some imbalances, etc.

Beyond 2600, however, it gets really difficult, as one has to constantly refine and widen one's evaluation, and that evaluation is mostly unobvious. They reached 2800 (3200 minus 400 for hardware) on a single core, and then they plateaued. Further on, the evaluation is extremely unobvious.

And I would say, almost impossible to tune automatically.

That is why we will not see a stronger version in the next very long period of time.

Again, they reached 2800 on a single core, and that is a full 400 points below SF, so a random middle-tier engine evaluation.

That is why I am upset by their claims: it was all hardware.

The claim is they enhanced intelligence, be it artificial or not: if the evaluation (intelligence) of that engine is a middle-tier one, what kind of an intelligence breakthrough is this?

Elroch

Dec 8, 2017

0

#63

You seem to misunderstand some things. First, there was no chess knowledge being used: AlphaZero worked all this out for itself. Secondly, AlphaZero was about 100 Elo points weaker with only 1 second per move - equivalent to making its hardware slower than that used by Stockfish - so it might still have won (with a closer margin) without using the TPUs that Google designed for general purpose neural networks. (It is great that they have done this: parallel matrix processing will be such a boon for demanding mathematical computing!) But more important to the question of which is really the best program is that if Stockfish had 30 times more time per move (equivalent to speeding up its hardware) it would not have got much stronger (according to my extrapolation of the graphs).

Elroch

Dec 8, 2017

0

#64

Lyudmil_Tsvetkov wrote:

Elroch wrote:

Lyudmil_Tsvetkov wrote:

Elroch wrote:

hairhorn wrote:

The only real connection between number of threads and number of cores is that Stockfish recommends one thread per core. Other settings are possible but sub-optimal.

If that setting was used, Stockfish was running on 64 cores, which is a very powerful computer. 32 cores is almost as impressive.

Not impressive at all next to what Alpha had.

I keep referring to the graphs from the DeepMind paper.

These show that if Stockfish was given thirty times longer per move it would have gained surprisingly few Elo points, and not done much better.

This is already bogus.

Doubling time is usually more important than doubling speed, and this is significantly more time.

They are exactly the same thing!

If you have two CPUs and one runs twice as fast, you get the same results by giving the slow one twice as much time. That is the definition of speed!

I guess they have measured something wrong. We should not believe everything as they wrote it.

prusswan

Dec 8, 2017

0

#65

The only question now is whether AlphaZero's plateau is due to an inherent limitation of its methodology and available compute resources, or whether it is really playing near-perfect chess at all stages.

Elroch

Dec 8, 2017

0

#66

That is a really interesting question.

I should say that it is a bit reckless to assume that this plateau is close to its limit. Neural networks are non-linear and it is now understood that it is more common for them to get temporarily stuck in plateaus of their objective functions, and when this happens a lot more training might suddenly find rapid change again. This actually happens once on the graph, just before 300,000 steps. A plateau appears to form very near to Stockfish's rating and then there is a brief acceleration to about 100 points stronger.

This type of behaviour makes my comments suggesting the later plateau is significant very questionable.

Another interesting question is why Stockfish appears to be reaching a limit in strength lower than what AlphaZero reached. There we know it cannot be achieving perfect play! It is possible there is something in the Stockfish architecture that is fine for all normal use, but which obstructs indefinite improvement. I don't believe hash table size can actually do this, but it might slow improvement a lot. That is because a hash table is just an efficiency technique, that makes it quick to recognise positions that have already been seen in analysis, so there is no repetition of analysis.
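To make the "efficiency technique" point concrete, here is a minimal sketch of a hash (transposition) table in a toy search. Everything in it is hypothetical: the "position" is just a pair of counters, the evaluation is arbitrary, and this is not how Stockfish actually stores entries. It only shows that the table changes how much work is done, not the result.

```python
# Toy illustration only: a "position" is a pair of counters, and each move
# increments one of them, so different move orders transpose into the same
# position. This is not Stockfish's actual data structure.

def search(position, depth, table, stats):
    """Negamax over the toy game; `table` caches results by (position, depth)."""
    stats["nodes"] += 1
    a, b = position
    if depth == 0:
        return (a * 3 - b * 2) % 7 - 3  # arbitrary, hypothetical evaluation
    key = (position, depth)
    if table is not None and key in table:
        return table[key]  # already analysed: no repetition of analysis
    best = max(-search(child, depth - 1, table, stats)
               for child in ((a + 1, b), (a, b + 1)))
    if table is not None:
        table[key] = best
    return best

def run(depth, use_table):
    """Return (search value, number of nodes visited)."""
    stats = {"nodes": 0}
    value = search((0, 0), depth, {} if use_table else None, stats)
    return value, stats["nodes"]

if __name__ == "__main__":
    for use_table in (False, True):
        value, nodes = run(12, use_table)
        print("table" if use_table else "no table", value, nodes)
```

Both runs return the same value; the one with the table just visits far fewer nodes. A real engine has a fixed-size table and must overwrite old entries, which is where the size question comes in.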

DiogenesDue

Dec 8, 2017

0

#67

These results won't mean anything until the games are played publicly and with Stockfish at full strength, with opening book plus reams of hash memory, instead of reducing it to 1GB.

There's only 1 reason that this Stockfish match was played privately and under these limitations: The Alpha team had to cherry-pick settings that allowed them to win convincingly.

DiogenesDue

Dec 8, 2017

0

#68

Lyudmil_Tsvetkov wrote:

What graphs?

The match was played at 1 minute fixed time per move.

Stockfish approaching perfection? SF is currently at around 3200 or so, and the perfect player would be at least 5000 elo, maybe much more, this has been discussed frequently on Talkchess.

You/they are mistaken. Nobody can reach 5000 elo. It requires a pool of 4600 rated players to play and beat every single time to reach 5000 elo. Since current engines cannot play past 3600 performance levels, it's entirely impossible to reach 5000.

Consider the elo rating pool like the big bang followed by universal expansion à la a balloon inflating. If the balloon (i.e. rating pool) is only 3600 meters across, then nobody can be out at 5000. It will take a lot of ratings inflation to get to 5000 ...

Elroch

Dec 8, 2017

0

#69

Your logic is fallacious.

Suppose you had applied it to the first computers that reached around 2800. You would say since current engines cannot get past 2800, it is impossible to get past 3400. But now current engines have exceeded 3400. The problem is you don't have any direct way of determining how close to perfection a player is.

[You also don't quite understand the Elo rating system. A 400-point difference is not supposed to be a certain win: indeed no rating difference is a certain win. Rather the expected score gets nearer to 1 as the difference increases. If you had an engine which wiped out all current engines, its rating would continue to rise without upper bound (but slower and slower) as the computer kept slightly exceeding its expected score against existing players, even if these existing players were all limited to some level. Its rating could rise much faster if it beat higher-rated opponents, but all wins help.]
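A rough sketch of that argument, using the standard Elo expected-score formula. The K-factor of 10, the 3400 rating for the opposition, and the function names are assumptions chosen for illustration, not anything from the DeepMind paper or a particular rating list.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def ratings_after_wins(start, opponent, wins, k=10.0):
    """Ratings after each of `wins` consecutive wins over a fixed-rating opponent.

    k is an assumed K-factor; the opponent's rating is held constant to model
    a pool that is "limited to some level".
    """
    ratings = [start]
    for _ in range(wins):
        current = ratings[-1]
        ratings.append(current + k * (1.0 - expected_score(current, opponent)))
    return ratings

if __name__ == "__main__":
    history = ratings_after_wins(start=3400.0, opponent=3400.0, wins=2000)
    for n in (0, 10, 100, 1000, 2000):
        print(n, round(history[n]))
```

Because the expected score never reaches 1, every win adds something, so the rating keeps climbing; the gain per win shrinks as the gap grows, which is the "slower and slower" part.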

Elroch

Dec 8, 2017

0

#70

btickler wrote:

These results won't mean anything until the games are played publicly and with Stockfish at full strength, with opening book plus reams of hash memory, instead of reducing it to 1GB.

As I illustrated with a graph, bigger is not always better with hash tables, and the ideal size depends on the engine.

sammy_boi

Dec 8, 2017

0

#71

I'd still like to see an AZ match with a top engine that has a good opening book, and EGTB. All the bells and whistles.

This wouldn't be AI anymore, just a spectacle for chess players, so I don't expect it to happen. But it would be cool.

sammy_boi

Dec 8, 2017

0

#72

Elroch wrote:

btickler wrote:

These results won't mean anything until the games are played publicly and with Stockfish at full strength, with opening book plus reams of hash memory, instead of reducing it to 1GB.

As I illustrated with a graph, bigger is not always better with hash tables, and the ideal size depends on the engine.

That's not totally fair though, as the graph was made with SF on a 1GB hash. I don't know how fast 32 cores could fill 1GB of hash, but less than a few seconds wouldn't surprise me.

This would certainly flatten out the graph earlier than otherwise.
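A back-of-the-envelope version of that estimate. Only the 1GB hash size comes from the thread; the bytes per entry and the positions per second are assumed round numbers for illustration, and the calculation pretends every searched position writes a fresh entry, so it gives the fastest the table could possibly fill.

```python
HASH_BYTES = 1 * 1024 ** 3        # the 1GB hash discussed in the thread
BYTES_PER_ENTRY = 16              # assumption: rough size of one table entry
NODES_PER_SECOND = 70_000_000     # assumption: rough search speed on 32+ cores

def seconds_to_fill(hash_bytes, bytes_per_entry, nodes_per_second):
    """Shortest possible time to fill the table, if every node stores a new entry."""
    entries = hash_bytes // bytes_per_entry
    return entries / nodes_per_second

if __name__ == "__main__":
    t = seconds_to_fill(HASH_BYTES, BYTES_PER_ENTRY, NODES_PER_SECOND)
    print(f"about {t:.2f} s to fill the table under these assumptions")
```

In practice many nodes hit existing entries, so the real fill time would be longer, but with numbers of this order it stays small next to 1 minute per move.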

hairhorn

Dec 8, 2017

0

#73

Is there anyone making real money off chess who does anything else?

DiogenesDue

Dec 8, 2017

0

#74

Elroch wrote:

Your logic is fallacious.

Suppose you had applied it to the first computers that reached around 2800. You would say since current engines cannot get past 2800, it is impossible to get past 3400. But now current engines have exceeded 3400. The problem is you don't have any direct way of determining how close to perfection a player is.

[You also don't quite understand the Elo rating system. 400 points difference is not supposed to be a certain win: indeed no rating difference is a certain win. Rather the expected score gets nearer to 1 as the difference increases. If you had an engine which wiped out all current engines, its rating would continue to rise without upper bound (but slower and slower) as the computer kept slightly exceeding its expected score against existing players, even if these existing players were all limited to some level. It's rating could rise much faster if it beat higher rating opponents, but all wins help].

Ummm...you are aware that the elo rating system is a relative rating system and only measures how someone in a pool will fare against someone else in that pool, right? Elo ratings are not tied to actual chess skill, nor to chess at all. You can make a rating pool for almost any competitive endeavor using it. If only 5 year olds played chess, someone would still have Carlsen's rating eventually.

You are proving my point. When the first engines were 2800, it was impossible to have a 3600 rating...it took years and years to work its way slowly up a few ratings points at a time...exactly the way I said that 5000 is impossible right now while the best engines are 3400.

At 400 ratings points difference the rate of increase slows to single ratings points...so....going from 3600 to 5000 at that rate would take how long? Every draw sends you tumbling back, of course. I understand the system just fine. Move along.
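To put rough numbers on that, here is the standard Elo update at a 400-point gap. The K-factor of 10 and the 3800 vs 3400 ratings are assumptions picked for illustration; different rating lists use different values.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def rating_change(rating, opponent, score, k=10.0):
    """Rating change for one game; score is 1 for a win, 0.5 draw, 0 loss."""
    return k * (score - expected_score(rating, opponent))

if __name__ == "__main__":
    win = rating_change(3800, 3400, 1.0)
    draw = rating_change(3800, 3400, 0.5)
    print(f"win:  {win:+.2f}")
    print(f"draw: {draw:+.2f}")
    print(f"wins needed to undo one draw: {-draw / win:.1f}")
```

With these assumed values a win is worth a bit under one point and a draw costs about four, so a single draw cancels several wins.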

DiogenesDue

Dec 8, 2017

0

#75

Elroch wrote:

btickler wrote:

These results won't mean anything until the games are played publicly and with Stockfish at full strength, with opening book plus reams of hash memory, instead of reducing it to 1GB.

As I illustrated with a graph, bigger is not always better with hash tables, and the ideal size depends on the engine.

...and the main developer for Stockfish said flat out that the hash table size chosen for Stockfish was a severe handicap here. Next.

DiogenesDue

Dec 8, 2017

0

#76

CoffeeAnd420 wrote:

I think the entire situation is hysterical because every top chess player agreed that this is probably going to kill chess and they'll soon be playing another game. Everyone but Nakamura agreed. Naka, considering that there's absolutely no place in this world for him other than in the tiny microcosm of a world known as chess, is obviously on the brink of a nervous breakdown. He's done absolutely NOTHING with his life but play chess and now that chess is dead, he has nothing to show for his life lol. Nobody cares about anything he did on the board.

Just look at Naka. He's one of the most useless creatures on earth.

Actually Nakamura understands engine play better than most GMs. Apparently he also knows not to spout off on something he doesn't know much about yet, unlike the other GMs who gave soundbites. The other GMs' comments were insipid, as most titled players' comments are when it comes to anything to do with engines

As for being useless, I'd take a blind wager that he's objectively done more with his life than you have...purely based on nothing but your choice of username and displayed attitudes.

Debistro

Dec 8, 2017

0

#77

btickler wrote:

CoffeeAnd420 wrote:

I think the entire situation is hysterical because every top chess player agreed that this is probably going to kill chess and they'll soon be playing another game. Everyone but Nakamura agreed. Naka, considering that there's absolutely no place in this world for him other than in the tiny microcosm of a world known as chess, is obviously on the brink of a nervous breakdown. He's done absolutely NOTHING with his life but play chess and now that chess is dead, he has nothing to show for his life lol. Nobody cares about anything he did on the board.

Just look at Naka. He's one of the most useless creatures on earth.

Actually Nakamura understands engine play better than most GMs. Apparently he also knows not to spout off on something he doesn't know much about yet, unlike the other GMs who gave soundbites. The other GMs' comments were insipid, as most titled players' comments are when it comes to anything to do with engines

see: Kasparov.

As for being useless, I'd take a blind wager that he's objectively done more with his life than you have...purely based on nothing but your choice of username and displayed attitudes.

Naka is a lot richer than most people on this site. He does NOT need to do anything for the rest of his life. He even knows how to trade stocks.

cats-not-knights

Dec 8, 2017

0

#78

CoffeeAnd420 wrote:

I think the entire situation is hysterical because every top chess player agreed that this is probably going to kill chess and they'll soon be playing another game. Everyone but Nakamura agreed. Naka, considering that there's absolutely no place in this world for him other than in the tiny microcosm of a world known as chess, is obviously on the brink of a nervous breakdown. He's done absolutely NOTHING with his life but play chess and now that chess is dead, he has nothing to show for his life lol. Nobody cares about anything he did on the board.

Just look at Naka. He's one of the most useless creatures on earth.

I'm not sure I remember everything I heard in the last 24 hours, but... I guess I heard something about this result not changing much for human chess...

By the way you speak, I assume you know Nakamura very well and very closely. Mind getting an autograph for me?

cats-not-knights

Dec 8, 2017

0

#79

Debistro wrote:

btickler wrote:

CoffeeAnd420 wrote:

I think the entire situation is hysterical because every top chess player agreed that this is probably going to kill chess and they'll soon be playing another game. Everyone but Nakamura agreed. Naka, considering that there's absolutely no place in this world for him other than in the tiny microcosm of a world known as chess, is obviously on the brink of a nervous breakdown. He's done absolutely NOTHING with his life but play chess and now that chess is dead, he has nothing to show for his life lol. Nobody cares about anything he did on the board.

Just look at Naka. He's one of the most useless creatures on earth.

Actually Nakamura understands engine play better than most GMs. Apparently he also knows not to spout off on something he doesn't know much about yet, unlike the other GMs who gave soundbites. The other GMs' comments were insipid, as most titled players' comments are when it comes to anything to do with engines

see: Kasparov.

As for being useless, I'd take a blind wager that he's objectively done more with his life than you have...purely based on nothing but your choice of username and displayed attitudes.

Naka is a lot richer than most people on this site. He does NOT need to do anything for the rest of his life. He even knows how to trade stocks.

what about rooks?

and minors?

Lyudmil_Tsvetkov

Dec 9, 2017

0

#80

Elroch wrote:

You seem to misunderstand some things. First, there was no chess knowledge being used: AlphaZero worked all this out for itself. Secondly, AlphaZero was about 100 Elo points weaker with only 1 second per move - equivalent to making its hardware slower than that used by Stockfish - so it might still have won (with a closer margin) without using the TPUs that Google designed for general purpose neural networks. (It is great that they have done this: parallel matrix processing will be such a boon for demanding mathematical computing!) But more important to the question of which is really the best program is that if Stockfish had 30 times more time per move (equivalent to speeding up its hardware) it would not have got much stronger (according to my extrapolation of the graphs).

You are wrong on all 3.

There was chess knowledge to start from, of course, they trained the opening on countless GM winning games. Is this no knowledge?

I am certain they had also at least psqt and piece values, but they don't acknowledge it officially.

1s. per move with SF also having 1s., that is a big difference.

The 30 times more time test is equally valid for both, so not a proof for anything.

Oops, I got tired posting one and the same stuff and trying to convince mentally entrenched people.

In a year's time, you will be convinced.