Today Leela becomes generation 14 (the game against me was played by generation 12). Leela is fearless and aggressive, without caring about king safety.
Nice. g4 was a surprise that turned out well!
But what was it up to at the end? I believe this could be fixed by designing the reward calculation to encourage quick wins. This is very straightforward if it is only told the final result of the game (a win, a draw or a loss): score a win on move N as gamma^N, a loss on move N as -gamma^N, and a draw as zero, where gamma is (of course) less than 1.
I know little about the details of the design of the system, but it certainly looks as though it gives no value to winning more quickly. Any discounting would nudge it towards being efficient, which might also provide a better return on computational resources for learning and playing.
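To make the gamma^N idea concrete, here is a minimal Python sketch of that scoring rule. It is not taken from Leela's code: the function name `discounted_outcome`, the result labels and the default gamma of 0.99 are all made up for illustration.

```python
def discounted_outcome(result: str, move_number: int, gamma: float = 0.99) -> float:
    """Score a finished game as described above (hypothetical helper).

    win on move N  ->  gamma**N
    loss on move N -> -gamma**N
    draw           ->  0
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be strictly between 0 and 1")
    if move_number < 0:
        raise ValueError("move_number must be non-negative")

    # Sign of the final result; the labels are an assumption of this sketch.
    sign = {"win": 1.0, "draw": 0.0, "loss": -1.0}[result]

    # The magnitude shrinks the longer the game lasts.
    return sign * gamma ** move_number


if __name__ == "__main__":
    for n in (20, 60, 120):
        print(n, discounted_outcome("win", n), discounted_outcome("loss", n))
```

With this rule a win on move 20 scores higher than a win on move 60, so quicker wins are preferred. The same discount also makes a late loss less negative than an early one, so it would push the engine to drag out lost games as well.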
Btw, if you would like to help test Leela, I can help you set up your machine. We need a lot of volunteers (same as the Stockfish project).