Alpha Zero vs Alpha Zero! 10 lesser known Alpha Zero lines!

Path2GrandMaster
After exhaustively scouring the internet for all I could read about Alpha Zero (AZ), I noticed that every discussion, blog and YouTube video covered the same 10 games we have all come to know. The publication revealed so much and yet so little: not a single 1.c4 game was included, even though AZ preferred 1.c4 as its first move. The chess community has been drooling for more AZ games ever since.
 
Yet 10 other opening snippets were also published: the principal variations (PVs) of Alpha Zero in the AZ vs AZ training matches. A PV is the equivalent of Alpha Zero's opinion of the best opening moves for both players.
 
All those who want more insight into Alpha Zero, myself included, now have a place to view, analyze and discuss these 10 lesser-known opening lines, which are Alpha Zero's conclusions from the AZ vs AZ training games!
 
Please share your thoughts and insights into these 10 lines and what the players must be thinking about to play such lines.  
 
Here is a compilation of all 10 variations...
 

 

 

ESP-918

I thought the first move it preferred was d4, not c4.

Path2GrandMaster

Here is a clip from the released publication. The red line shows that AZ's favorite opening was the English (1.c4) from the get-go. Furthermore, its end preference is higher and its volume of games (the area in the graph) is by far the largest, as shown at the link and clip below.

 


 

Thanks for the input. Please look deep into these lines. I did, and I will periodically share what I have found, in the hope of a collective effort to understand what Alpha Zero understands!

 

The openings chosen over the training period reflect AZ's best play against its best self. The change in the selection of opening moves over training time shows the current algorithm's “opinion” of each opening. It is vital to note that AZ's opinion of an opening does not necessarily reflect the results of the match vs Stockfish (as in the case of the French Defense, where AZ thinks rather little of the French yet beats SF soundly in French games). The results of the AZ vs SF matches do not reveal quite enough, though...

 

However, AZ's preferences among the different openings in training do reveal quite a bit, and so do the PVs (principal variations), if we would spend time studying these lines!

  • At the beginning of AZ’s training it chose the English Opening 5% of the time.
  • The 10 most popular openings were chosen 35% of the time by the eighth hour of training.
  • Alpha Zero strongly preferred the English Opening from the first training games!

 

LouStule
This Dude sounds like a Russian bot.
Path2GrandMaster

Lou, let's focus on understanding the lines above.

CID64

OK, it is clear for White. But for Black? Did AZ prefer the Ruy Lopez and the Slav with 4...a6?

Path2GrandMaster

CID64, thanks for the interest! 

 

When I first saw the Alpha Zero games, they deeply resonated with me. Since then, I have enjoyed poring over them. This has been very revealing, and I think many people are missing out if they haven't spent much time with those 10 games. I can assuredly say I have insight into Alpha Zero that no one else has mentioned on the web. From this place of insight, I noticed a great number of misconceptions about the data released in the paper...

 

For instance, the algorithm's choice of best move changes depending upon the opponent, since it efficiently predicts the opponent's most likely move. In other words, in the unique and little-known lines above AZ plays differently, since it is playing itself (AZ vs AZ)! In the matches vs SF, however, AZ chooses different moves because its best play is based upon the opponent's most likely move.

 

What does this mean? It means the lines above are PERFECT according to AZ, whereas the lines AZ played against SF were best against SF. Remember, rather than finding the true minimax (impossible, btw), Alpha Zero used a predictiveMax. For instance, as White against the French, AZ chooses Nc3 over Nd2. This is very, very revealing, and it is not based upon the minimax principle.
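
To make the distinction concrete, here is a minimal Python sketch. It is a toy and not Alpha Zero's actual search: the move names, reply probabilities and values are all invented for illustration, and "predictiveMax" is only this thread's label for weighting replies by how likely they are.

```python
# Toy illustration only: the move names, replies, probabilities and values
# below are made up for this example and do not come from Alpha Zero.

# For each candidate move: (opponent_reply, predicted_probability, value_for_us).
tree = {
    "solid":     [("best_reply", 0.50, 0.10), ("other_reply", 0.50, 0.15)],
    "ambitious": [("best_reply", 0.05, -0.20), ("other_reply", 0.95, 0.40)],
}

def minimax_choice(tree):
    """Pick the move whose worst-case reply is least bad (classic minimax)."""
    return max(tree, key=lambda move: min(v for _, _, v in tree[move]))

def predictive_choice(tree):
    """Pick the move with the best value averaged over the predicted replies."""
    return max(tree, key=lambda move: sum(p * v for _, p, v in tree[move]))

print(minimax_choice(tree))     # solid
print(predictive_choice(tree))  # ambitious
```

With these made-up numbers the worst-case rule plays it safe, while the probability-weighted rule accepts a small risk for a larger expected payoff.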

 

There is so much to be said, and I really hope future discussions reveal more! What this seems to reveal is that minimax in itself IS drawish, whereas the predictiveMax utilized by AZ is "winnish". AZ's method exploits the weaknesses of the minimax method, of computers and of humans.

 

With that said, the PVs above are the best lines, although at present they are understood as such only by AZ and the few individuals who run across this post.

 

So, I do want to focus on the 10 new lines above (AZ vs AZ), with the understanding that they are superior even to the moves played in the AZ vs SF games.

 

Just imagine: there are 10 lines of AZ's which nobody pays attention to, yet they are the principal variations of Alpha Zero. Soon I will post each line separately for easy viewing and discussion. Until then, use the game board above. Please share your insight here, as I will!

congrandolor

Interesting. Are there any opening novelties in the AZ games (never played by human players, according to databases) which could be played at human level, or are they too complicated?

drmrboss

I'm not impressed by A0's performance in what amounts to Google's sponsored ad, against a severely restricted SF (1 GB of memory is similar to my HTC HD2 from 10 years ago). Why do I say that? SF played Qe1???? on move 8. Maybe SF was handicapped. Download any version of SF from SF 8 to SF 9 (a full-strength SF will never consider such a weak move).

pawn8888

One thing about computers that is interesting is that their games are always different, like human games. It seems that around move 7 or 8 the human, and maybe the computer too, kind of takes a guess.

CID64

@drmrboss

The result between AZ and SF isn't important to me. Neither are the games. I am interested in the games played by AZ against AZ!

Path2GrandMaster

Thanks CID64 for seeing the point of this forum. 

The topic here is sharing insight into the 10 hidden or lesser-known lines of AZ vs AZ (these are MORE significant, being free of the unknown dynamics of the AZ vs SF matches).

 

I will shed a little more light for the readers regarding these games. 

  • Each of the 10 lines is connected by a single idea. This same idea is also prevalent in the 10 known AZ games.
  • AZ's variance in PVs (best lines or ideas) throughout the training process is due to finding different local minima (great ideas for specific situations) and then finding out that the idea (the current local minimum) is not so great in other positions. The progression of AZ's local minima paints a beautiful picture of the process of learning chess. The goal of every chess player, and of AZ, is to find the true global minimum: the best idea in all cases, aka the most important factor.
  • AZ's preference for 1.c4 throughout the whole training process is a revelation in waiting, worthy of a focused discussion in itself. (This forum post is the only place online discussing actual 1.c4 variations played by AZ, since none were included in the 10 known AZ vs SF games.)
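
For readers unfamiliar with the local versus global minimum language, here is a small Python sketch. It is only an analogy under stated assumptions: the curve `f` is an arbitrary made-up function with two valleys and has nothing to do with Alpha Zero's real training objective. Plain gradient descent ends up in a different valley depending on where it starts.

```python
# Toy example: f is an arbitrary made-up curve with two valleys.
# It is NOT Alpha Zero's loss function; it only illustrates the terminology.

def f(x):
    return x**4 - 3 * x**2 + x

def grad(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    """Plain gradient descent starting from x."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

right = descend(2.0)   # settles in the shallower valley (a local minimum)
left = descend(-2.0)   # settles in the deeper valley (the global minimum)

print(f"start at  2.0 -> x = {right:.3f}, f(x) = {f(right):.3f}")
print(f"start at -2.0 -> x = {left:.3f}, f(x) = {f(left):.3f}")
```

Both runs stop where the slope is zero, but only one of them has found the lowest point of the whole curve.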

 

I cannot let the cat completely out of the bag, but I wanted to get some people looking in the right direction. If you want to improve beyond Carlsen, and even beyond Stockfish, you may want to look deeper into Alpha Zero. Would a human understanding of what Alpha Zero found not give one such an advantage? AND since AZ cannot actually understand itself, would an actual understanding of what Alpha Zero pointed to, the true global minimum, not give one an advantage even over AZ?

 

(I have been writing a book on chess whose purpose is to share such revelations with the world of chess. You can simply follow me on chess.com to get first dibs on the book!)

 

Until then, however, I have shared a little to stir discussion of a VERY important subject, for the benefit of each fortunate reader.

 

So where does a willing individual begin such a task as understanding AZ?  How can one go about deriving important information from such lines?

 

Let's begin here, focusing on just a single line...

 

 

 

The line posted above was considered by Alpha Zero to be the strongest and best continuation for both sides out of all chess openings. The ideas behind this line are understood to be the best play for both sides, and the global minimum.

 

One of many interesting features is the level of contempt shown by both players (they are both AZ) in the line above. Contempt can be seen as a kind of disregard, a way of ignoring the opponent's plans, and it is usually associated with risky play. The contempt I am referring to in the Alpha Zero line above, however, does not carry that risk, so it is not that kind of contempt. Rather, it is a follow-through of one's own ideas that allows the least influence from the opponent's "what if" scenarios. It ignores his threats unless they are long-term positional threats, as opposed to threats that rest on material-based evaluations of variations.

 

It seems AZ utilizes a hybrid of minimax, which I call the predictiveMax, though it could also be called the "Maxi-min" or something else entirely. The idea is to ignore the opponent's best response in the opening IF the preparatory move against that response would lose the real, current positional advantage. This in turn leaves a 95% random probability that the best move will not be chosen, since there is only one best move in these cases, even though it may seem (even to SF) that there are many options. A good example is the French Defense line. The critical point is wAZ's 3rd move, 3.Nc3, which seemingly entices Black into the Bb4 variation. Here SF and other engines concur at deeper depths that Nd2 is the best move, yet AZ vs AZ returned to this critical variation time and again, and in conclusion AZ discarded the idea of Nd2. This is important because to play 3.Nc3 you must ignore the threat. A move like Nd2, played in view of a merely possible Bb4, is a 100% concession, and that is more of a threat than the mere 5% chance of concession in the Nc3, Bb4 line. The point is that minimax causes one to concede even to future possibilities, when a concession should be forced, not volunteered. It's the maximum principle before the minimum: the opponent's possible threats are allowed to influence your maximum potential only minimally.
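
The 100% versus 5% comparison above can be written out as simple arithmetic. This is only a sketch of the argument as stated in this post: the two probabilities are the post's own rough figures, and the size of the concession (`CONCESSION`) is a hypothetical unit, not an engine evaluation.

```python
# The probabilities are the post's rough figures; CONCESSION is a made-up
# unit cost chosen purely for illustration, not an engine evaluation.
CONCESSION = 1.0

def expected_concession(probability, cost=CONCESSION):
    """Average cost paid if the concession happens with the given probability."""
    return probability * cost

nd2 = expected_concession(1.00)  # 3.Nd2: the concession is volunteered every time
nc3 = expected_concession(0.05)  # 3.Nc3: conceded only if the one best reply is found

print(f"3.Nd2 expected concession: {nd2:.2f}")
print(f"3.Nc3 expected concession: {nc3:.2f}")
```

Under these assumptions the volunteered concession costs twenty times more on average, which is the whole of the argument; it says nothing about whether the 5% figure is realistic.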

 

 

  1. The flexible 1.c4! reveals so much. Yet it does not reveal much of White Alpha Zero's (wAZ's) central pawn structure (i.e. White has chosen to remain flexible). Black can use this information for planning, as it reveals that White will almost assuredly castle kingside.
    • Black AZ responds with the less flexible 1...e5! It is surely the critical and most natural test of 1.c4!
  2. White Alpha Zero chooses the modern 2.g3! kingside fianchetto. Here we can see a strong style beginning to emerge for the clearly positional player wAZ. What is going on inside AZ to make it choose 2.g3! in the large majority of its 1.c4 games? It is clear AZ is stressing the value of king safety, and it does not hide its plan to castle kingside early.
    • 2...d5! Wow, Black Alpha Zero (bAZ) chooses 2...d5! This should be a revealing choice for us to analyze. Here Black continues with his contemptuous plan, ignoring the possibility of cxd5, which trades a center pawn for a semi-central pawn. Why? Interestingly enough, any other idea for Black would only delay the concession, by way of d6 or dxc4. Rather than play with a disadvantage, bAZ MUST play in this gambit-style approach, confronting the worst of possibilities early on and putting White to the test as well (since much of the point of 1.c4 is to stop d5 from being played, for fear of conceding central pawns).
  3. The most obvious, correct and best reply, 3.cxd5, is judged best by wAZ, who has already achieved his original objective by force, converting the initial tempo advantage into a long-term positional advantage. This was achieved against bAZ's best play (any other moves would have been more disadvantageous for Black).

To be continued... I ran out of time, but I will post more on this as others show interest. I really wrote the post above as an example, so that others can begin doing the same here.

 

Cheers!

 

 

LouStule
Have you won any games playing this opening? If so, please post them so we can see the continuation. Thanks.
jonathanpiano13
I wonder which strategy bAZ would use after 1.c4 e5 2.Nc3.
Path2GrandMaster

LouStule, posting a human game, or a computer game with SF, using the above lines would not help anyone, since it would be chock-full of undetectable sub-par moves. Please note that the above line does not need an addendum of games I have won from that position. According to AZ, it is the best idea. The point of this forum is to understand what values AZ holds in order to play such moves.

 

@jonathanpiano13, thanks for the interest, but perhaps your question should be why 2.g3 is better than 2.Nc3. I could give a lengthy explanation, but first I ask the readers to give their reasons why 2.g3 is better...

 


Macheeide

All this is a revelation and an inspiration. Thanks to all. Will all the AlphaZero vs Stockfish games ever be published? Why are they a secret?

Oh, BTW, Rybka has a Monte Carlo analysis feature which I have found useful in the past. I don't know whether any other engines have/had this feature. I tend to think not.

Path2GrandMaster

Thanks, Macheeide. Very few people know about this thread, and even fewer understand the revelation which is under the surface. I was hoping others would share deep insights into these unknown lines, which are more relevant than the AZ vs SF matches... Please do note that these lines are NOT AZ vs SF but even more solid, as they are AZ vs AZ!!! Cheers, and have fun analyzing (please do share)!

 

pawn8888

I don't know if it makes much sense to have a computer play itself, since it would know which moves it would make in the same situation. How could a computer, playing white, lose to itself? That would be an interesting game. Like the computer was playing a joke on itself.

Macheeide

Makes me think of Kramnik. The 14th World Champion habitually plays both the Catalan and the Berlin Defence if I'm not mistaken. Note to self: I must revisit Kramnik's games.

stuartnaylor

The book by Matthew Sadler and Natasha Regan looks really interesting.
Game Changer: AlphaZero's Groundbreaking Chess Strategies and the Promise of AI
https://www.newinchess.com/game-changer

There are some vids on YouTube that are extremely interesting and show how AlphaZero values position above material.
The neural-network design, as opposed to traditional CPU brute-force methods, perhaps unsurprisingly makes for an extremely exciting 'human' game.