Leela Zero (a neural network engine similar to AlphaZero)

drmrboss
MitSud wrote:
This is scary: an AI learning online by playing against humans. Imagine when it hits 2800; that can't be more than a few months away if it can increase its strength by 250 rating points within a week.

The play.lczero.org site doesn't learn from human games. It is impossible to learn from random opponents. Leela had already trained on 1.5 million self-play games. She learns by playing against herself (an opponent of exactly equal strength). How does she learn? In the first 1000 games she plays f4, and in another 1000 games she tries e4, then she compares the winning percentages. If e4 scores better, she will prefer e4 to f4. Against random opponents this breaks down. Say she plays against a 3500-rated Stockfish: whatever move she makes, she loses, so she can't tell the stick from the carrot. It is the same against a 500-rated opponent: whatever move she makes, she wins, so she can't tell which move made the difference between winning and losing. Again she can't tell the stick from the carrot.
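The carrot-and-stick argument above can be put into numbers with a toy model. This is only a sketch of the argument, not how Leela's training works: it assumes the standard Elo expected-score formula and a made-up rating bonus for each first move (the `MOVE_BONUS` values and the 2000 rating are hypothetical).

```python
def expected_score(own_rating, opp_rating):
    """Standard Elo expected score for the player rated own_rating."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - own_rating) / 400))

# Hypothetical rating bonus for each first move (made-up numbers).
MOVE_BONUS = {"e4": 30, "f4": -30}

def learning_signal(own_rating, opp_rating):
    """Gap in expected score between playing e4 and playing f4."""
    scores = {move: expected_score(own_rating + bonus, opp_rating)
              for move, bonus in MOVE_BONUS.items()}
    return scores["e4"] - scores["f4"]

# Compare an equal opponent with a far stronger and a far weaker one.
for label, opp in [("equal opponent", 2000),
                   ("3500 engine", 3500),
                   ("500 player", 500)]:
    gap = learning_signal(2000, opp)
    print(f"{label:>15}: e4 minus f4 = {gap:.5f}")
```

Under these assumptions the gap between the two moves is roughly 0.09 against an equal opponent and below 0.0001 in both lopsided cases, so far more games would be needed to tell the moves apart.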

solskytz

This program is definitely not anywhere near 1500-1600, or even 1800-1900. 

It's so difficult to draw it already!

Here's my most recent effort, after some 6-7 very humiliating losses (including the one above)

Of course I play fast and stuff, but still - no 1900 would deal me a 7.5 - 0.5 score, no matter how fast I would play. 

 

drmrboss

Oh man, it took me about 30 minutes to carefully check all the lines, while Leela played instantly or at 200 ms/move (play.lczero.org, network ID 82). As usual, the strategy against computers was "Block and Squeeze".

 

drmrboss

I was a pawn and a minor piece up (a +5 material advantage), but I was really struggling against her positional play, piece placement and mobility.

solskytz

Amazing. 

I wonder if it's thanks to the 30 minutes thinking time and painstaking care, or thanks to your "anti-computer" strategy - because we know that these strategies are generally recommended against "traditional" rather than self-trained engines. 

A very good effort, of course!!

Godeka
drmrboss wrote:

It is impossible to learn from random opponents.

 

It is exactly what was done before: train a network with a large collection of games. The LZ project has a NN that was trained with human games and is used as a reference. Even AlphaGo was trained with human expert games, and additionally it did reinforcement learning from self-play games. Learning from human games also makes it possible to control the playing style, or to create NNs of different strengths with a very human-like style.

 

And usually you want to have a wide tree with variations, therefore some randomness is added to the opening moves. I don't know if LCZ does it too, but I would be surprised if it does not.

vonderlasa
I am 1854 USCF. Leela currently beating me 8-3, so estimated rating would be 2036 for Leela.
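The 2036 estimate is consistent with the simple linear performance-rating rule (opponent rating plus 400 times wins minus losses, divided by games). A minimal sketch follows, assuming the 8-3 score means 8 wins and 3 losses for Leela with no draws; the function names are illustrative, and the logistic Elo formula is shown only for comparison.

```python
import math

def linear_performance(opp_rating, wins, losses, draws=0):
    """Linear 'rule of 400' performance rating against one opponent."""
    games = wins + losses + draws
    return opp_rating + 400 * (wins - losses) / games

def logistic_performance(opp_rating, points, games):
    """Performance rating from the logistic Elo curve.

    Undefined for a 0% or 100% score (division by zero or log of zero).
    """
    p = points / games
    return opp_rating + 400 * math.log10(p / (1 - p))

# Leela's result against an 1854-rated player: 8 wins, 3 losses.
print(round(linear_performance(1854, wins=8, losses=3)))      # 2036
print(round(logistic_performance(1854, points=8, games=11)))  # 2024
```

With only 11 games, either number carries a large margin of error.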
DiogenesDue

Nice to see projects like this.

solskytz
vonderlasa wrote:
I am 1854 USCF. Leela currently beating me 8-3, so estimated rating would be 2036 for Leela.

 

I suppose that Leela would have a different rating for 1-min, 3-min, 5-min, 15-min, 30-min, 90/40 + 30, etc.

 

When playing Leela without a set time control (you think as long as you want), Leela's perceived "rating" (or level) will probably be a function of the time that you are willing to invest when playing it...

drmrboss

Leela vs NM Jerry aka Chess Network.

https://www.youtube.com/watch?v=6CLXNZ_QFHI

drmrboss

How A0 and Leela Zero train themselves.

https://medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0

Elroch

Excellent link!

Godeka
petrip wrote:
Godeka wrote:
drmrboss wrote:

It is impossible to learn from random opponents.

 

It is exactly what was done before: train a network with a large collection of games. The LZ project has a NN that was trained with human games and is used as a reference. Even AlphaGo was trained with human expert games, and additionally it did reinforcement learning from self-play games

Emphasis on expert games. AlphaGo was trained on pro and top amateur games. It was good, but still over 1000 Elo points weaker than the self-play version. Learning from games against average players would give horrible results.

Also, in backgammon the NN trained on pro games was "weak"; the one that became as good as one can be at backgammon was trained fully by self-play.

 

It depends on what your goals and possibilities are. For example, it is fairly common to use multiple NNs of different strengths for a human-like style from beginner to pro. Or, if you want to train a network on your own PC, the only issue is how to get a large number of games; getting collections from game servers was almost the only way to get them. Or, if you want to create a strong engine with limited resources, it can be faster to start self-play training from an already trained network.

 

AlphaGo Master, by the way, was only two or three hundred Elo weaker than AlphaGo Zero, and this difference was measured in engine-vs-engine games and is not very accurate. Maybe the more impressive part of AlphaGo Zero was that it trained itself in a much shorter time.

drmrboss

Omg, 0.2 million training games were generated yesterday by 300 volunteers! (We will get 44 million games in 200 days.)

congrandolor

Exciting experiment. When is Leela expected to beat a strong engine?

benpaulthurston

I'm not even a 1500 on chess.com, but I can still beat it about 1 in 3... just on normal, though. It's at build 82 right now, for posterity...

Yenny-Leon

Can it improve by playing copies of itself?  (maybe copies with different "styles" derived from different training samples?)

 

If this monster keeps growing, watch out!  

drmrboss

Yes. After several thousand games, the original neural network (network A) gets feedback from the outcomes of its self-play games. The NN takes feedback from win, draw and loss patterns and adjusts its weights. These adjustments produce a newer network (network B). Then there is a match between A and B: if B beats A, B becomes the new master network. But sometimes B fails against A, and then the newer network is rejected.

This way, only stronger and stronger networks are carried over into future training.
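The A-versus-B match described above can be sketched roughly as follows. This is a toy model and not Leela's actual code: each network is stood in for by a single hypothetical strength number, games are decided by the Elo expected score with no draws, and the 400-game match length and 55% promotion threshold are assumptions chosen for illustration.

```python
import random

def play_match(candidate, champion, games, rng):
    """Simulate a match and return the candidate's score fraction.

    Networks are represented only by a hypothetical Elo-like strength.
    """
    p_win = 1.0 / (1.0 + 10 ** ((champion - candidate) / 400))
    wins = sum(rng.random() < p_win for _ in range(games))
    return wins / games

def gate(champion, candidate, games=400, threshold=0.55, seed=0):
    """Promote the candidate only if it scores at least `threshold`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    score = play_match(candidate, champion, games, rng)
    return candidate if score >= threshold else champion

master = 2000                  # network A
master = gate(master, 2200)    # a stronger network B is promoted
master = gate(master, 1900)    # a weaker network B is rejected
print(master)
```

The threshold above 50% is there so that a candidate is not promoted on the strength of a lucky run in a short match.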

drmrboss

But NNs have some limitations: they are very poor at calculation, like a human brain compared with a calculator. A guy from TalkChess tested a position on SF and on Leela.

SF 9 needs 0.01 sec versus 60 sec for Leela, so SF 9 solves this problem about 6000 times faster than Leela.

Conclusion: to show her real value, Leela will need a massive amount of time or massive hardware like A0's.
