Ok, so make a firm assertion I can argue against. They were private games? Ok, I agree.
Google manipulated it in such a way that the weaker engine won? I disagree.
Exactly.
I haven't seen any of the games played, so I don't know what happened. How did AlphaZero win? (mate, time, etc.)
It really doesn't matter if you disagree with #2 (manipulation) if #1 (private games) is true. If the games are private, we can argue till kingdom come...that's the whole point. There will never be proof until A0's team does it right, à la Leela.
While the critics of the match logistics have a legitimate complaint, the larger point for me was the beautiful way that AlphaZero won its games. They were incredible.
Could the Google DeepMind team have AlphaZero win the same way against a fully optimized Stockfish, with the hardware, gigantic hash table, opening book, and everything else the Stockfish development team wants?
Who knows? But it would be fun to see!!
The second match provided a 32 GB hash table, opening books, and the same fast hardware as TCEC (44 cores). It was only slightly different from the first match, and any further changes would have even less effect.
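To make those conditions concrete, here is a minimal sketch of setting the equivalent UCI options through the python-chess library. The engine path and the per-move time limit are illustrative assumptions, not details from the match, and the opening book would be supplied externally rather than via an engine option:

```python
import chess
import chess.engine

# Hypothetical path to a Stockfish binary; adjust for your system.
engine = chess.engine.SimpleEngine.popen_uci("./stockfish")

# UCI options approximating the second match's reported settings.
engine.configure({
    "Hash": 32768,  # hash table size in MB (32 GB)
    "Threads": 44,  # one thread per core on 44-core TCEC-style hardware
})

board = chess.Board()
result = engine.play(board, chess.engine.Limit(time=60.0))  # illustrative limit
print(result.move)
engine.quit()
```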
Stockfish's best shot might be to use intensive computing power to develop a drawing book and hope it could hang on well enough against the superior player.
Sure, the wins they released were beautiful, but out of (IIRC) 1300 games (there were opening-specific matches totaling 1200 games), they only released those 10 wins. Some of them were amazing, but we saw <1% of what was played.
We should remember A0 only scored ~65%, which is only good enough for about a 100 Elo gap. Sure, they showed us some spectacular wins, but most games were drawn.
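To put that ~65% in numbers: under the standard Elo model, the rating gap implied by an expected score s is 400·log10(s/(1−s)). A quick sketch in Python (the formula is the standard one; the 0.65 score is from the post above):

```python
import math

def elo_gap(score: float) -> float:
    """Rating gap implied by an expected score under the standard Elo model."""
    return 400 * math.log10(score / (1 - score))

print(round(elo_gap(0.65)))  # -> 108, i.e. roughly a 100 Elo gap
```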
The larger point for me was that neural networks are interesting and a little scary. Remember, A0 is not just a chess engine; the same system was also the best at Go and shogi.
It's amazing that these neural networks (more information is available about Leela) are known to be not very strong tactically, yet they hardly suffer for it, because they don't tend to get into positions where they are at risk of a tactical coup, or where they need one in a timely fashion. That's exactly the opposite of the early engines (only way stronger). Any weaknesses left in the top engines are all positional in nature.
This might shed some light on AlphaZero's rating (Skip to 2017).
https://www.youtube.com/watch?v=dgwPK3HKTgI
This confirms what others had to convince me of: AlphaZero, brilliant as it was, has been clearly surpassed by the recent best (including LC0, an open-source project inspired by AlphaZero). Stockfish 12 was a huge leap due to the NNUE enhancements, and Stockfish 13 is a little better still.
Hopefully the next version will be a bigger advance than 13 was.
Now I infer that the video was showing blitz ratings, like those given at https://www.sp-cc.de/
As such, they are not as impressive as I thought. The standard-time-control ratings would be significantly lower, due to the (empirically) narrower scale. A practical advantage of blitz ratings is that you can get a lot more games played. The lower percentage of draws might be viewed as a plus, but it's the net wins that really matter: the score fraction is one half plus half the net-win rate, so the implied Elo gap depends only on net wins per game.
It's a knockout tournament made up of matches, rather like the later stages of the FIDE WCC: somewhat better at determining the true best player, but with some luck involved when the players are of similar standard.
While it was tenable to argue that the first match's conditions handicapped Stockfish enough to make it unclear that it had truly been beaten (I did not believe so; I thought the result was genuinely indicative of relative strength), the later match answered all those questions and confirmed AlphaZero is significantly superior (remember, it won games even with 1/10 as much time as Stockfish).