Objectively Speaking, Is Magnus a Patzer Compared to StockFish and AlphaZero?

Lyudmil_Tsvetkov

And that is why I don't like the approach: it is too simplistic. The primary code seems to be very simple. What is complex is the tuning network, but that is just hardware.

Chess is much more complex than that, theoretically, and that is why Alpha will not make big progress in the future.

Elroch
Lyudmil_Tsvetkov wrote:

 The team includes at least 3 chess programmers. Matthew Lai, the author of Giraffe and Talkchess member, is one of them. It is maybe for a reason that Giraffe, following the very same approach as Alpha, is rated only around 2400 on single core.

So, what you are saying is that AlphaZero is 3600 because they have someone on their team who has created an engine that reached 2400?

Likewise, the team has no chess player at above amateur level. 

However, the reason Matthew Lai is on the team is that he had tried to produce a chess AI, just one that was 1200 points weaker. 1200 points is not a difference that is achievable by speeding up hardware, even a lot. From articles on this, he was using a much smaller neural network, which even on slower hardware was able to look at about 10% as many nodes as a conventional engine.  (see this article)

However, I would agree that using modest computational resources would have been a huge barrier to the development of AlphaZero. The most demanding phase is the self-learning, and this would have taken months without the fast hardware, rather than 4 hours.

The reason AlphaZero benefits from more computation when playing is simply that its search tree gets bigger. But this search tree had 1000 times fewer nodes than that of Stockfish on the actual hardware each used!

It is the huge hardware that made the difference and not the approach.

Self-learning, self-learning, what do you mean self-learning and AI.

You admit you know nothing about the techniques that AlphaZero used to generate its strength: model-based reinforcement learning, termed "deep reinforcement learning" because the model used is a deep neural network.

I have studied this subject (including watching David Silver's excellent lecture series), and use Sutton's book on the subject.

It is true that AlphaZero uses a lot of processing power to achieve its highest strength in head to head play. However, with 30 times less power it would remain the highest rated engine according to AlphaZero's testing. While restricting AlphaZero's computational power would make the match closer, increasing the computational resource for both AlphaZero and a conventional engine like Stockfish would greatly advantage AlphaZero.

The key reason AlphaZero increases in strength more rapidly with computational resource appears to be that the branching factor of its search tree is smaller, to an extent which compensates enormously for it looking at far, far fewer positions. With the full power of 4 TPUs, AlphaZero was still looking at 1000 times fewer nodes than Stockfish! If its time was reduced by a factor of 30, it would be looking at 30,000 times fewer, and still be stronger!

As a result, when AlphaZero gets more time, its horizon must expand significantly faster than that of Stockfish. This is why it is not only stronger now; it also indicates this technology has a permanent edge. 
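To make that concrete, here is a back-of-the-envelope sketch in Python. The node rates and effective branching factors below are illustrative guesses of roughly the right order, not figures from the paper; the point is only that depth grows like log(nodes)/log(branching factor), so a lower branching factor means the horizon grows faster with extra time.

```python
import math

def effective_depth(nodes: float, branching_factor: float) -> float:
    # depth of a tree with this many nodes and this branching factor
    return math.log(nodes) / math.log(branching_factor)

# Illustrative numbers only: a Stockfish-like searcher examining tens of
# millions of nodes per second with an effective branching factor near 2,
# versus an AlphaZero-like searcher examining tens of thousands with a
# much smaller effective branching factor.
for name, nps, b in [("Stockfish-like", 70_000_000, 2.0),
                     ("AlphaZero-like", 80_000, 1.5)]:
    for seconds in (1, 60):
        print(f"{name}: {seconds:>3}s -> ~{effective_depth(nps * seconds, b):.1f} plies")
```

With these made-up numbers, sixty times more thinking time buys the low-branching-factor searcher noticeably more extra plies than it buys the other, which is the "horizon expands faster" claim above.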

 

You made a good point that I should emphasise: the role of those with knowledge of conventional chess engines in the design of the tree search algorithm of AlphaZero, which has some commonality with all chess engines. While I am no expert on chess engines, a key strength of AlphaZero is that it is better at allocating resources to different branches, and this is achieved by the quality of the neural network's estimates of the probability that each move is best, which come entirely from self-learning.

coldgoat

stockfish does not have to wear glasses to crush you

SmyslovFan

I do not believe that AlphaZero is able to perform better than 3600 strength. I believe that because that is, in my opinion, a close approximation of perfect chess. Stockfish was severely handicapped, so AlphaZero's performance rating can't be calculated. We don't know how strong Stockfish was during the match. My guess, and it's only a guess, is that it was around 3000 strength. It was still very strong, stronger than any human, but it was beatable. And it would have lost a match to a fully fit Stockfish fairly handily too.

Elroch

I am surprised you would make such a wild claim! Stockfish was running on the most powerful computer I have ever heard of it running on - 64 cores - far more than are used in most computer competitions. Given that the hash table was a reasonable size (exactly how near to optimal it was is not clear), that the effect of a suboptimal hash table is the same as a fairly modest shift in CPU speed, and that Stockfish's rating changes very little with CPU speed at that level, it is unlikely that this cost many tens of Elo points.

As a result, we can be pretty sure the absolute rating of Stockfish's play was as high as its usual rating.

There is the issue of the opening book, but human opening theory has become less and less useful to computers that are more than 600 points stronger than us, so this too is an efficiency issue. Modern computer opening books are basically the product of computing time in previous computer games, or of self-play.

I would point out that selecting an opening book is a different game to playing chess, and it is a somewhat dull one, since it is about generating a static book and following it by rote. AlphaZero devoted no time to this. However, when playing Stockfish across the full range of best openings on an equal playing field (both being assisted or hindered equally), it was much stronger than its adversary.

admkoz
Elroch wrote:
admkoz wrote:
Elroch wrote:
admkoz wrote:
Elroch wrote:
admkoz wrote:

What I am curious about is whether it "figures out" things like "don't give up a free queen", or does it really just have to figure that out again every time such an option presents itself?  

 

From there its experience improves these networks and after a while it would learn that positions where there was a queen missing tended not to have such a good expected result. Well, actually it would get a general idea that more material is better[...]

I have put this crudely, but basically a big neural network learns to encapsulate concepts that can be very sophisticated[...]

So you're saying it DOES figure out that "more material is better" meaning that it can evaluate positions it has never seen before on that basis.  

 

You and I can glance at a board, see that there are no immediate threats, see that Black is up a rook, and figure Black has it in the bag, even if an actual mate is 30+ moves away.  We'll be right 999,999 times out of a million.  Can AlphaZero do that?  

We would not be right that often.

But yes, based on my understanding of the technology, its positional evaluation network would be so good that without any explicit analysis at all it would play quite good chess. I am not sure how good it would be in this mode, but I do know it needs to do analysis to play at better than 2900 Elo (as it achieved near this level using about 1/30 of a second per move and got better as the time increased).

So what percentage of the time DO you think being up a rook in an otherwise normal position, in a game between > 1500 players, is a win?   That is just a quibble. 

 

OK, so AZ would do pretty well even if it was not allowed to do any further analysis.  That implies that AZ can evaluate any position, and it learned to do this solely by playing (initially) random games. 

 

I guess it may be that this is the kind of question that can't be answered in a blog post, but what I am trying to figure out is the form of that evaluation method and how it gets built.

The nature of the evaluation method is quite simple. It has some sort of representation of the position as an array of numbers which are the inputs to the neural network - the neural network doesn't know what they mean; it has to work this out from the way they relate to the results of games and to their values on other moves - and a large deep neural network with thousands (not sure how many thousands) of nodes in many layers which takes the representation of the board and outputs a number, the expected score from the position. [I hope I haven't missed some published detail].
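If it helps, here is a minimal Python sketch of that idea. It is my own toy simplification, not AlphaZero's actual architecture (the real thing is a deep residual network): a position becomes an array of numbers, those numbers pass through layers of weighted sums and nonlinearities, and a single number comes out as the expected score.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_position(board_planes: np.ndarray) -> np.ndarray:
    # Flatten whatever board representation is used into one input vector;
    # the network itself has no idea what the numbers "mean".
    return board_planes.reshape(-1).astype(np.float32)

class TinyValueNet:
    def __init__(self, n_inputs: int, n_hidden: int = 64):
        self.w1 = rng.normal(0, 0.1, (n_inputs, n_hidden))
        self.w2 = rng.normal(0, 0.1, (n_hidden, 1))

    def value(self, x: np.ndarray) -> float:
        h = np.tanh(x @ self.w1)           # hidden layer
        out = np.tanh(h @ self.w2)         # expected score in [-1, 1]
        return out.item()

# Usage: e.g. 12 piece planes on an 8x8 board -> 768 inputs.
net = TinyValueNet(n_inputs=12 * 8 * 8)
fake_position = rng.integers(0, 2, size=(12, 8, 8))
print(net.value(encode_position(fake_position)))
```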

How the evaluation method gets built comprises two parts (if my understanding is correct - I am supplementing what is published with general ideas about deep reinforcement learning). The obvious one is when a game ends: the exact value of the position is available, and that can be used to adjust the neural network to improve its evaluations of earlier positions in the direction of the right result. The second one is that when it evaluates a position, if this evaluation is a surprise compared to the evaluation of previous positions, the network is tweaked to make the evaluations of previous positions a bit more in agreement with the later evaluation.

The first form of feedback is basically making the evaluation compatible with the absolute value of clear positions. The second form of feedback is basically making the evaluation compatible with the legal continuations in a position: the reason is that the perfect evaluation of a position is the same as that of a later position reached by perfect play.
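As a toy illustration of those two feedback signals (again my own simplification, with a lookup table standing in for the network): the first update pulls evaluations towards the game result, the second pulls each evaluation towards the evaluation of the position that followed it.

```python
ALPHA = 0.1  # learning rate

def nudge(values: dict, position, target: float) -> None:
    # Move the stored evaluation of `position` a little towards `target`.
    v = values.get(position, 0.0)
    values[position] = v + ALPHA * (target - v)

def learn_from_game(values: dict, positions: list, result: float) -> None:
    # 1) terminal feedback: the final result is an exact value.
    for p in positions:
        nudge(values, p, result)
    # 2) bootstrapping: make earlier evaluations agree more with later ones,
    #    since under perfect play consecutive evaluations would be equal.
    for earlier, later in zip(positions, positions[1:]):
        nudge(values, earlier, values.get(later, 0.0))

values = {}
learn_from_game(values, ["pos_a", "pos_b", "pos_c"], result=1.0)
print(values)
```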

 

So, does that mean that there exists a function F(position) = Value which does not depend on 20-deep evaluations of the possible moves from the position?  If so, it would be kind of awesome if Google was to publish what that function was at the current state of AZ.  I am sure it would not be human readable - and just as sure that humans would be able to pick key things out of it.  

SmyslovFan
Elroch wrote:

I am surprised you would make such a wild claim! Stockfish was running on the most powerful computer I have ever heard of it running on - 64 cores - far more than are used in most computer competitions. Given that the hash table was a reasonable size (exactly how near to optimal it was is not clear), that the effect of a suboptimal hash table is the same as a fairly modest shift in CPU speed, and that Stockfish's rating changes very little with CPU speed at that level, it is unlikely that this cost many tens of Elo points.

As a result, we can be pretty sure the absolute rating of Stockfish's play was as high as its usual rating.

There is the issue of the opening book, but human opening theory has become less and less useful to computers that are more than 600 points stronger than us, so this too is an efficiency issue. Modern computer opening books are basically the product of computing time in previous computer games, or of self-play.

I would point out that selecting an opening book is a different game to playing chess, and it is a somewhat dull one, since it is about generating a static book and following it by rote. AlphaZero devoted no time to this. However, when playing Stockfish across the full range of best openings on an equal playing field (both being assisted or hindered equally), it was much stronger than its adversary.

I defer to your understanding of computers, but not to your understanding of chess. An opening database is of tremendous assistance to computers. Humans could still beat engines that didn't use an opening database about a decade ago. The reports I read also suggest Stockfish didn't have access to an endgame tablebase either. 

You may know quite a bit about computers, but you are wrong to argue that opening databases and endgame tablebases don't materially help traditional engines such as Stockfish to play better chess.

Elroch

I don't disagree with you, and I am not bad at turn-based chess (although not as good as my ranking of #95 on chess.com suggests), where I understand the usefulness of a database.

As I pointed out, the advantage of an opening database is essentially a time saving: moves in an opening database are the result of previous computation (by people originally, but computers have become more important). You can use better players to play your moves in the openings, but if you are the best player in the world (like Stockfish or AlphaZero), the only advantage is that you save yourself some computation.

As such it is a bit of a cheat when comparing engines. The best opening book (so huge it is impractical) would allow a lousy engine to play well!

You need to separate the two functions of a computer: finding good moves in a general position, and reading good moves out of a database. The former is more what DeepMind were interested in.
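To put the distinction in code terms (a toy sketch, with placeholder position keys rather than real theory): the book is nothing but a precomputed lookup, and only when it runs out does the engine have to do the interesting work of finding a move itself.

```python
# Placeholder book; real books map millions of positions to moves.
OPENING_BOOK = {
    "startpos": "e2e4",
    "startpos e2e4": "c7c5",
}

def choose_move(position_key: str, search_fn):
    book_move = OPENING_BOOK.get(position_key)
    if book_move is not None:
        return book_move            # no computation spent at all
    return search_fn(position_key)  # the engine's own search takes over

print(choose_move("startpos", search_fn=lambda p: "<search result>"))
```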

Elroch
admkoz wrote:
 

So, does that mean that there exists a function F(position) = Value which does not depend on 20-deep evaluations of the possible moves from the position?  If so, it would be kind of awesome if Google was to publish what that function was at the current state of AZ.  I am sure it would not be human readable - and just as sure that humans would be able to pick key things out of it.  

There is, and it is a number that you get out of the AlphaZero neural network when you provide a position as inputs. My guess is that by just doing the evaluation for the position after each legal move it may be capable of master-level play (the data implies only that it is below 2800 without doing analysis). Unfortunately, this network consists of literally millions of parameters used in calculations at considerable depth, so it is not easy for humans to unravel. Our brains, at up to 10^15 connections, are even worse, though.
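A sketch of what that "evaluation only" mode of play would look like (hypothetical code, using the python-chess library just to enumerate legal moves; `evaluate` stands in for the value network and is assumed to score the position for the side that just moved):

```python
import chess  # python-chess

def pick_move_one_ply(board: chess.Board, evaluate) -> chess.Move:
    # No search at all: evaluate the position after each legal move
    # and play the one the evaluation likes best.
    best_move, best_score = None, float("-inf")
    for move in board.legal_moves:
        board.push(move)
        score = evaluate(board)
        board.pop()
        if score > best_score:
            best_move, best_score = move, score
    return best_move

# Usage with a deliberately crude placeholder evaluation (material only):
PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def material_eval(board: chess.Board) -> float:
    # Score from the point of view of the side that just moved.
    score = 0
    for piece in board.piece_map().values():
        sign = 1 if piece.color != board.turn else -1
        score += sign * PIECE_VALUES[piece.piece_type]
    return score

print(pick_move_one_ply(chess.Board(), material_eval))
```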

SmyslovFan

Again, you are making a critical mistake from a chess player's perspective. 

The opening database doesn't just give tactical short cuts; it provides positions that are playable. All of the main lines have been analyzed out far more than 40 ply in critical lines, and the resulting positions have been played countless times, by humans and machines. 

Opening databases don't just provide a "short cut", they provide a platform for reaching playable positions. The requirements of the opening are different from the requirements of a general middle game, and vastly different from those of an endgame. 

 

Gerberk8

I don't like Magnus Carlsen at all... He is a conceited twat who does not know anything but chess... Compared to Kasparov he is an idiot all the way...

Lyudmil_Tsvetkov
Elroch wrote:
Lyudmil_Tsvetkov wrote:

 The team includes at least 3 chess programmers. Matthew Lai, the author of Giraffe and Talkchess member, is one of them. It is maybe for a reason that Giraffe, following the very same approach as Alpha, is rated only around 2400 on single core.

So, what you are saying is that AlphaZero is 3600 because they have someone on their team who has created an engine that reached 2400?

Likewise, the team has no chess player at above amateur level. 

However, the reason Matthew Lai is on the team is that he had tried to produce a chess AI, just one that was 1200 points weaker. 1200 points is not a difference that is achievable by speeding up hardware, even a lot. From articles on this, he was using a much smaller neural network, which even on slower hardware was able to look at about 10% as many nodes as a conventional engine.  (see this article)

However, I would agree that using modest computational resources would have been a huge barrier to the development of AlphaZero. The most demanding phase is the self-learning, and this would have taken months without the fast hardware, rather than 4 hours.

The reason AlphaZero benefits from more computation when playing is simply that its search tree gets bigger. But this search tree had 1000 times fewer nodes than that of Stockfish on the actual hardware each used!

It is the huge hardware that made the difference and not the approach.

Self-learning, self-learning, what do you mean self-learning and AI.

You admit you know nothing about the techniques that AlphaZero used to generate its strength: model-based reinforcement learning, termed "deep reinforcement learning" because the model used is a deep neural network.

I have studied this subject (including watching David Silver's excellent lecture series), and use Sutton's book on the subject.

It is true that AlphaZero uses a lot of processing power to achieve its highest strength in head to head play. However, with 30 times less power it would remain the highest rated engine according to AlphaZero's testing. While restricting AlphaZero's computational power would make the match closer, increasing the computational resource for both AlphaZero and a conventional engine like Stockfish would greatly advantage AlphaZero.

The key reason AlphaZero increases in strength more rapidly with computational resource appears to be that the branching factor of its search tree is smaller, to an extent which compensates enormously for it looking at far, far fewer positions. With the full power of 4 TPUs, AlphaZero was still looking at 1000 times fewer nodes than Stockfish! If its time was reduced by a factor of 30, it would be looking at 30,000 times fewer, and still be stronger!

As a result, when AlphaZero gets more time, its horizon must expand significantly faster than that of Stockfish. This is why it is not only stronger now; it also indicates this technology has a permanent edge. 

 

You made a good point that I should emphasise: the role of those with knowledge of conventional chess engines in the design of the tree search algorithm of AlphaZero, which has some commonality with all chess engines. While I am no expert on chess engines, a key strength of AlphaZero is that it is better at allocating resources to different branches, and this is achieved by the quality of the neural network's estimates of the probability that each move is best, which come entirely from self-learning.

Those are all citations from the paper.

Actually, we don't know why its nps is smaller, what the precise reason is, or how they count nps; those are just guesses until they publish the code.

Nor do we know what Alpha's depth is, or how it relates to nps and evaluation.

I would hate to discuss this further when we simply lack sufficient insider knowledge. But two things are crystal clear:

1) Alpha is some 300-400 Elo weaker than SF on a single core, and the very same estimate would probably be reproduced if SF had access to hardware similar to that of Alpha

2) almost a week after the paper and the results were published, that is, 168 hours later, there is still no communiqué from Google that they have solved chess, or even that Alpha has improved by a meagre 10 Elo.

Why would one think hardware is relevant to Alpha but irrelevant to SF?

This is obviously not true.

So, basically, they built a 2800 Elo engine; I acknowledge that readily. The only thing I cannot understand is why they should have made such publicity out of that.

We saw a good match, that is all, a match we had never witnessed before. But we could easily witness an even better one, if we matched the latest dev SF (+40 Elo) on 32 cores vs the latest dev on 2000 cores. That would be a significantly better match, with higher quality games, but only Google can find 2000 cores for SF.

So, again, a good hardware achievement. Google is not known to be lacking in funds.

Elroch
Lyudmil_Tsvetkov wrote: 

1) Alpha is some 300-400 Elo weaker than SF on a single core, and the very same estimate would probably be reproduced if SF had access to hardware similar to that of Alpha

This is neither accurate based on the data, nor especially interesting, any more than running the 100m with one leg tied up would be a particularly interesting athletics event.

2) almost a week after the paper and the results were published, that is, 168 hours later, there is still no communiqué from Google that they have solved chess,

That really is a very foolish comment.

or even that Alpha has improved by a meagre 10 Elo.

Why would it need to? It has shown itself to be the best. Fischer did not increase his rating at all after the world championship in 1972, but that did not stop him being the champion (until 1975). Not that AlphaZero-Stockfish was a world championship - rather it was a test of the world's strongest chess software against the world champion program.

Why would one think hardware is relevant to Alpha but irrelevant to SF?

One would not think that, but one would infer that Stockfish was improving a LOT more slowly with time per move (exactly equivalent to CPU speed) over the entire range of testing. Check the paper. This is why the AlphaZero approach will tend to dominate in the end, as massively parallel hardware becomes as ubiquitous as GPUs are now. Google's TPUs are much like GPUs (which are also used in a big way in deep learning to speed up computation), but being designed especially for deep neural networks, they are better suited to the purpose than GPUs, which exist mainly to run video games fast!

So, basically, they built a 2800 Elo engine; I acknowledge that readily. The only thing I cannot understand is why they should have made such publicity out of that.

A rating at chess is how well you can play. Not how well you can play if handicapped in the way some guy demands that you should be handicapped. AlphaZero would be shown not to be the best chess player, not when it is sufficiently handicapped, but when another entity, using whatever hardware it likes, is good enough to beat it. You can postulate that such an entity exists, but it has not been demonstrated.

We saw a good match, that is all, a match we had never witnessed before. But we could easily witness an even better one, if we matched the latest dev SF (+40 Elo) on 32 cores vs the latest dev on 2000 cores. That would be a significantly better match, with higher quality games, but only Google can find 2000 cores for SF.

There was reported to be a spinoff of Stockfish running on a large cluster (thousands of nodes) several years back. But I am not sure of the status or accuracy of the company's claims. The website still exists, but seems to be inactive. http://www.chesscluster.com/magneto.html

I don't know how strong Stockfish would get with 30 times more cores (or whatever) but I am not sure it could cope with AlphaZero. But I would be interested to see!

So, again, a good hardware achievement. Google is not known to be lacking in funds.

I need to remind you that you have admitted you have no idea how the software works.

 Can I just check your personal feelings about some things relevant to your views?

  • google (the company)
  • conventional chess engines
  • Stockfish in particular
  • Some program you can never get a copy of
Elroch
SmyslovFan wrote:

Again, you are making a critical mistake from a chess player's perspective. 

The opening database doesn't just give tactical short cuts,

I certainly did NOT claim that it did. This is what is known as a strawman.

What an opening database does is provide choices of move that are probably good in an absolute sense.

it provides positions that are playable. All of the main lines have been analyzed out far more than 40 ply in critical lines, and the resulting positions have been played countless times, by humans and machines.

They have not been played "countless times". They have been played some finite number of times. Once you get a modest distance into an opening, you may find a few thousand games, a few hundred games and so on. Once it gets any lower, the statistical information becomes so noisy as to be highly questionable. The results of these games provide useful evidence as to the quality of a choice, but this is really just some precomputed spindly branches in the tree of possibilities. If it is a hundred games, it may contain quite a lot of computation - say enough for 10,000 move choices - but this computation is too narrow to be ideal. It is worth a lot as a random sample of the results of imperfect play, but has the potential to be revised with one insight (the number and quality of the games are key to how likely this is).
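To put a number on how noisy those samples are, here is a quick calculation (normal approximation, illustrative figures): with a hundred games in a line, the margin of error on the average score is already close to a tenth of a point.

```python
import math

def score_and_margin(wins: int, draws: int, losses: int):
    # Average score and a rough 95% margin of error for a line's results.
    n = wins + draws + losses
    scores = [1.0] * wins + [0.5] * draws + [0.0] * losses
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / n
    return mean, 1.96 * math.sqrt(var / n)

# e.g. 100 games in some line
print(score_and_margin(35, 40, 25))   # roughly 0.55 +/- 0.08
```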

Opening databases don't just provide a "short cut", they provide a platform for reaching playable positions. The requirements of the opening are different from the requirements of a general middle game, and vastly different from those of an endgame. 

An entity that can see 10 moves further with equal reliability (as is likely to be the case with AlphaZero at full power) can do without 10 moves of opening theory.

There is an interesting point here. There are in a sense two games in chess. One game is where you play against another player in the normal way. The other "game" is where you find, solo, what lines are good in the openings. Playing the latter game is a useful way of giving yourself a bit of an advantage in the first game.

In turn-based and correspondence chess, the two games are interlaced in real time; for OTB and live chess, the two are interlaced from one game to another. As you say, the generation of an opening book is important to chess engines, as it is to human professional chess players (and keen amateurs). However, most chess players would agree that the core skill of a strong chessplayer is finding good moves, and it is a far more interesting one. Opening knowledge is a way to manage with less of that skill.

Note that in 960 chess, this factor virtually vanishes for human players. There is only one game, with no opening theory. For computers, opening theory would be spread 960 times more thinly, which dramatically reduces its significance and usefulness (you can get an impression of how much by going to your chess database and imagining it held roughly 1000 times fewer games).

admkoz
Elroch wrote:

 

In turn-based and correspondence chess, the two games are interlaced in real time; for OTB and live chess, the two are interlaced from one game to another. As you say, the generation of an opening book is important to chess engines, as it is to human professional chess players (and keen amateurs). However, most chess players would agree that the core skill of a strong chessplayer is finding good moves, and it is a far more interesting one. Opening knowledge is a way to manage with less of that skill.

Note that in 960 chess, this factor virtually vanishes for human players. There is only one game, with no opening theory. For computers, opening theory would be spread 960 times more thinly, which dramatically reduces its significance and usefulness (you can get an impression of how much by going to your chess database and imagining it held roughly 1000 times fewer games).

So can AZ beat stockfish in 960? 

Elroch

That is a great question!

IMO, it would win as convincingly if it was retrained for 960 in exactly the same way, and the same patterns would occur (especially AlphaZero improving faster with computational resource). How well it would play if it had to play 960 having been trained entirely for conventional chess is interesting! There would be a fundamental problem with changing the rule for castling: this might be awkward to do while trying to retain the same trained neural network. But other than that, 960 positions are very like chess positions where some of the pieces are in odd places, so it should do well.

hangejj
chesster3145 wrote:

I think you’re misunderstanding. Computers can’t even be compared to humans as they don’t play in human rating pools, and they play a fundamentally different type of chess than we do. The idea that Magnus is a patzer I think is also a misunderstanding. He’s extremely strong, given that he has to sit at a board for 6 hours and is a very fallible human and computers don’t and aren’t.

This sums up my opinion.  At this point, comparing humans and computers in chess seems pointless.

Sergeant-Peppers
Somehow, they probably will not be talking about Carlsen's opening play against Adams or his Qc6? howler against Nepo at the London werewolves in the same vein as Fischer's Bxh2 against Spassky.
Elroch

Computers can certainly be compared to humans at chess: earlier computers got realistic estimated ratings by being allowed to play in human competitions.  Less formal games against masters provided good evidence of their ratings. Now, it is not a competition any more. Current computer chess ratings are not far off compatibility with the human scale, because they were started with that intention. 

Computers are no longer competitors to humans because they have too much advantage, so they are relegated to a separate category of superhuman player.

Godeka

@SmyslovFan:
The paper contains the results of 1200 games in the 12 most common human openings. Note that these openings can limit the strength, because they force positions which don't fit the engine's playing style and can be weaker than the lines the engine itself would prefer.

AlphaZero has the most losses in the B40 Sicilian Defence: as white 17 wins, 31 draws and 2 losses; as black 3 wins, 40 draws and 7 losses.

 

@Lyudmil_Tsvetkov
> Actually, we don't know why its nps is smaller [...]

Sure we know: Monte Carlo Tree Search without simulations + network calculations. Nothing special there.
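For anyone curious, the shape of that search is roughly the following (a loose sketch after the published AlphaGo Zero / AlphaZero descriptions; the names, the exploration constant and the `network(position)` interface are mine): each new leaf costs one network call instead of a random playout, which is exactly why the node count per second is so much lower.

```python
import math

class Node:
    def __init__(self, prior: float):
        self.prior = prior        # network's probability that this move is best
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}        # move -> Node

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node: Node, c_puct: float = 1.5):
    # Pick the child maximising value plus a prior-weighted exploration bonus.
    total = sum(ch.visits for ch in node.children.values())
    def score(ch: Node) -> float:
        return ch.q() + c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
    return max(node.children.items(), key=lambda kv: score(kv[1]))

def expand_and_evaluate(node: Node, position, network):
    # One network call per new leaf: it returns move priors and a value,
    # so no random playout is needed.
    priors, value = network(position)
    for move, p in priors.items():
        node.children[move] = Node(prior=p)
    return value
```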

 

> on the 32 cores vs latest dev on 2000 core

Isn't the supported maximum 128 threads?