Google DeepMind Rating

drmrboss
Elroch wrote:
drmrboss wrote:

SF strength is heavily dependent on hash size.

Show us the data on the dependence of Stockfish's Elo performance on hash table size. Guessing does not suffice.

I saw Kai Lasko's post on Talkchess last year; he had already tested this. AFAIK, according to his tests, SF's performance was significantly reduced under the DeepMind settings vs. the optimal settings (maybe 50-100 Elo, I don't remember exactly).

Maybe you can ask him directly.

P.S. People with decent knowledge of computer programming know that restricting hash size significantly reduces an engine's performance. Exactly how much Elo is always difficult to determine, because Elo varies a lot between testing conditions and there is always a statistical error bar.
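For anyone who would rather measure this than argue about it: hash is just a UCI option, so a rough A/B comparison is easy to script. Here is a minimal sketch using python-chess, assuming a local `stockfish` binary on your PATH; the hash sizes, thread count and time limit are illustrative, not the DeepMind match settings.

```python
import chess
import chess.engine

def depth_reached(hash_mb, seconds=30):
    """Search the starting position for a fixed time and report the depth reached."""
    engine = chess.engine.SimpleEngine.popen_uci("stockfish")  # binary on PATH is an assumption
    engine.configure({"Hash": hash_mb, "Threads": 4})
    info = engine.analyse(chess.Board(), chess.engine.Limit(time=seconds))
    engine.quit()
    return info.get("depth"), info["score"]

# Same position, same time budget, only the hash size changes.
print("1 GB :", depth_reached(1024))
print("32 GB:", depth_reached(32768))
```

A single probe like this only hints at the effect; a proper test means full game matches at each setting, which is where figures like "50-100 Elo" would have to come from.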

Elroch

The later match gave Stockfish opening books, free control over its allocation of time to moves, and a 32 GB hash. The Elo difference was only slightly less than in the first match.

Some of you guys may need to wait until next year when LeelaZero defeats Stockfish 11/12 in TCEC.

drmrboss

The good thing about AI is that Leela has surpassed AlphaZero's level. If Leela and A0 were running on equal hardware, Leela would still be +10 or +20 Elo stronger than AlphaZero now (based on a simulation of Leela's 100-game match results vs Stockfish 8 in TCEC, and other test results).

There is no doubt that the current Leela running on chess.com's 4-GPU machine (about 100 knps) is much stronger than the 2017 AlphaZero on 4 TPUs (about 80 knps). Also, the current Leela in TCEC is leading against the latest SF on 44 cores, which is about +140 Elo stronger than regular SF 8, or +200 Elo stronger than DeepMind's restricted SF 8.
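For context on where figures like "+140 Elo" come from: under the standard logistic Elo model, a match score fraction p corresponds to a rating difference of -400 * log10(1/p - 1). A small sketch is below; the example score is made up for illustration and is not an actual TCEC result.

```python
import math

def elo_diff(wins, draws, losses):
    """Rating difference implied by a match score under the standard logistic Elo model."""
    games = wins + draws + losses
    p = (wins + 0.5 * draws) / games          # score fraction
    if p <= 0.0 or p >= 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / p - 1.0)

# Made-up example: +20 =70 -10 over 100 games is a 55% score, roughly +35 Elo.
# With only 100 games, the statistical error bar on that is itself tens of Elo.
print(round(elo_diff(20, 70, 10), 1))
```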

DiogenesDue
Elroch wrote:

The later match gave Stockfish opening books, free control over its allocation of time to moves, and a 32 GB hash. The Elo difference was only slightly less than in the first match.

Some of you guys may need to wait until next year when LeelaZero defeats Stockfish 11/12 in TCEC.

You seem to be missing the point.  If Leela wins, awesome...I want AI engines to get better and eliminate the dinosaurs.  If Leela *wins*, though, it will be because Leela *competed* on a *level playing field* controlled by a *3rd party*.  AlphaZero cannot *win* via private testing.  I'm all for AlphaZero...it's the way everything external to AlphaZero is being manipulated to ensure that AlphaZero cannot lose that I object to...it's like a toddler whose parents protect them from every possible harm, and stunt their growth thereby.  They have already proved they will spend way more money promoting AlphaZero than it would take to let it compete fairly.

P.S. Yes, I know that AlphaZero cannot compete on TCEC hardware, but they could still set up an official match with verifiable results, with all engines running on the best hardware available and with all settings "optimal" for *each engine*.  I'm sure that Chess.com would *love* to host such an event...and I'm sure Google would be fine with it...once AlphaZero reaches a 99%+ win rate against Stockfish straight up.

DiogenesDue
drmrboss wrote:

The good thing about AI is that Leela has surpassed AlphaZero's level.

Equally unprovable, since AlphaZero lives its life in Google's closet...

DiogenesDue
petrip wrote:

No, you are missing the point. Google tested their algorithm; they did not arrange a match. It was done to gain information, mostly for themselves. No one gives a rat's ass about the match, or about whether A0 was a dash stronger or a bit weaker. The conditions were good enough to say that the approach is viable. The papers presented were also detailed enough that the effort could be duplicated, as it was. And the results have now been verified by a third party: Leela Chess Zero is essentially a slightly weaker clone of A0 due to lack of training. But it will get there.

And no, 99% was not reached before and will not be reached. From the published paper it was obvious that learning had stalled and further gains would be very small.

Your point would be valid if Google had in fact just used the information for testing and had not made press releases about it.  Since they did, you are obviously incorrect about them not caring about beating Stockfish for publicity.

As for Leela being a tad weaker clone...Stockfish is still beating Leela quite handily, so...draw your own conclusions if that is your mindset.  However, Leela being built as a machine learning engine and AlphaZero also being a machine learning engine makes them no more alike than Stockfish, Houdini, and Komodo are to each other...Leela's matches in no way show "third-party" support for AlphaZero.

Elroch

This is a misleading assessment. The aim with Leela was mainly to emulate AlphaZero in open source. Every piece of information from the 2017 publication was used. After the 2018 publication, Leela was revised to reflect the revelations of certain more minor features of the design of AlphaZero.

Both AIs are deep convolutional neural networks of a similar depth, width and architecture. It's worth remembering that with both, there is no chess intelligence included: both AIs learn everything except the rules from experience, which makes it very different to comparing two hand-engineered engines.
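To make "deep convolutional network with a policy head and a value head" concrete, here is a toy PyTorch sketch of the general shape the two nets share. It is an assumption-laden miniature only: the real networks use many more residual blocks (on the order of 20-40) with far more filters, and the input-plane count and policy size below are placeholders rather than either engine's exact encoding.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection, the basic AlphaZero-style unit."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return F.relu(x + y)

class TinyPolicyValueNet(nn.Module):
    """Toy AlphaZero/Leela-shaped net: conv stem, residual tower, policy head, value head."""
    def __init__(self, in_planes=112, channels=64, blocks=4, policy_size=1858):
        # in_planes and policy_size are placeholder sizes, not an exact encoding.
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_planes, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU())
        self.tower = nn.Sequential(*[ResidualBlock(channels) for _ in range(blocks)])
        self.policy_head = nn.Sequential(nn.Flatten(), nn.Linear(channels * 64, policy_size))
        self.value_head = nn.Sequential(nn.Flatten(), nn.Linear(channels * 64, 256),
                                        nn.ReLU(), nn.Linear(256, 1), nn.Tanh())

    def forward(self, board_planes):
        x = self.tower(self.stem(board_planes))
        # Policy: one logit per encoded move; value: estimated game outcome in [-1, 1].
        return self.policy_head(x), self.value_head(x)

# One dummy position: batch of 1, `in_planes` feature planes over the 8x8 board.
logits, value = TinyPolicyValueNet()(torch.zeros(1, 112, 8, 8))
print(logits.shape, value.shape)  # torch.Size([1, 1858]) torch.Size([1, 1])
```

The point of the sketch is the shape, not the strength: everything chess-specific is learned into those weights from self-play, which is what makes a comparison with hand-engineered evaluation functions so different.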

Leela won the February 2019 TCEC CUP-2! It achieved this by beating Houdini in the final, after Houdini had achieved an upset against Stockfish in the semifinal.

Ghost_Horse0
btickler wrote:
Elroch wrote:

No, one of the Stockfish developers asserting the hash allocation had a big effect was not enough. Such an assertion needed to be demonstrated.

Since all that early discussion, there has been a second match between AlphaZero and Stockfish, with all of the possible disadvantages of Stockfish in the earlier match removed (time management, opening book, hash table, etc.). The result was that AlphaZero won almost as convincingly as in the first match, indicating that the total effect of the issues was not huge.

Preprint of 32 page December 2018 article on the later match

Another round of private testing, not a match at all.  You ask for proof in your first line that you don't have for your "side".  If a scientific organization came out and said they had proven climate change is *not* occurring, but only proved it in private tests, and then other people supported them saying "scientists that support climate change need to show some more substantial proof to refute these new results", you'd laugh at those people.

Yeah, probably computers don't even exist at all. It's all just a big conspiracy like vaccines and a round earth.

DiogenesDue
Elroch wrote:

This is a misleading assessment. The aim with Leela was mainly to emulate AlphaZero in open source. Every piece of information from the 2017 publication was used. After the 2018 publication, Leela was revised to reflect the revelations of certain more minor features of the design of AlphaZero.

Both AIs are deep convolutional neural networks of a similar depth, width and architecture. It's worth remembering that with both, there is no chess intelligence included: both AIs learn everything except the rules from experience, which makes it very different to comparing two hand-engineered engines.

Leela won the February 2019 TCEC CUP-2! It achieved this by beating Houdini in the final, after Houdini had achieved an upset against Stockfish in the semifinal.

How is that misleading?  You don't think that Houdini and Komodo race to incorporate any useful changes revealed in Stockfish's open source?  Same thing.  

TCEC Cup-2...now *that* is misleading.  Stick to the real tournament with a 100 game final, not some single elimination sideshow.  Gull could have won that on a really good day.  Ergo the reason the WCC is a 12 game match and not single elimination, and the reason why Caruana can win a tournament that Carlsen is in but lose to him in a match.

It won't be that long til Leela wins, just don't count any chickens yet.

DiogenesDue
Ghost_Horse0 wrote:

Yeah, probably computers don't even exist at all. It's all just a big conspiracy like vaccines and a round earth.

I have a 20+ year career in computer software & hardware.  You'll have to come up with a real argument I'm afraid.  Take some time out from creating new trolling accounts and maybe you'll come up with something.

Ghost_Horse0
btickler wrote:
Ghost_Horse0 wrote:

Yeah, probably computers don't even exist at all. It's all just a big conspiracy like vaccines and a round earth.

I have a 20+ year career in computer software & hardware.  You'll have to come up with a real argument I'm afraid.

Ok, so make a firm assertion I can argue against. They were private games? Ok, I agree.

Google manipulated it in such a way that the weaker engine won? I disagree.

Elroch
btickler wrote:
Elroch wrote:

This is a misleading assessment. The aim with Leela was mainly to emulate AlphaZero in open source. Every piece of information from the 2017 publication was used. After the 2018 publication, Leela was revised to reflect the revelations of certain more minor features of the design of AlphaZero.

Both AIs are deep convolutional neural networks of a similar depth, width and architecture. It's worth remembering that with both, there is no chess intelligence included: both AIs learn everything except the rules from experience, which makes it very different to comparing two hand-engineered engines.

Leela won the February 2019 TCEC CUP-2! It achieved this by beating Houdini in the final, after Houdini had achieved an upset against Stockfish in the semifinal.

How is that misleading?  You don't think that Houdini and Komodo race to incorporate any useful changes revealed in Stockfish's open source?  Same thing.  

TCEC Cup-2...now *that* is misleading.  Stick to the real tournament with a 100 game final, not some single elimination sideshow.  Gull could have won that on a really good day.  Ergo the reason the WCC is a 12 game match and not single elimination, and the reason why Caruana can win a tournament that Carlsen is in but lose to him in a match.

What? The TCEC CUP-2 matches are a minimum of 8 games. Unlike the WCC, the tiebreaks (which are played in pairs) keep the same time control right up to 32 games (almost always enough).

  1. If a match is tied after its scheduled regular 8 games, pairs using the same book exit for both sides will be played until a decisive pair occurs. The book exits will be from the randomized book used in that phase of the CUP (A or B), up to a maximum of 4 pairs of games.
  2. If after playing in this way, no winner ensues, more pairs of games will be played, after each of which a match winner may ensue. From this point on, so from game 17 onwards, BOOK C will be used (the Superfinal book) with each playing both sides of the openings, for a maximum of 8 pairs of games, so a maximum of a further 16 games to decide a winner, with a new book.
  3. If even after these 32 games, a match is still drawn, further pairs of games will be played with BOOK C to determine a winner, but the time control (TC) will be shorter with each pair of games, according to the following steps (always indicated as minutes base time + seconds increment per move completed, so e.g. 30+5 means 30 minutes base time per game plus an increment of 5 seconds per move completed): 16+4, 8+3, 4+2, 2+1, 1+1. If even after this sequence of pairs of games with shorter TC the match is tied, the increment will remain at 1 second, but the base time will then become even shorter than one minute, in the following manner: 32s+1, 16s+1, 8s+1, 4s+1, 2s+1 and finally 1s+1 will be played until a decisive pair is reached.

 

DiogenesDue
Ghost_Horse0 wrote:
btickler wrote:
Ghost_Horse0 wrote:

Yeah, probably computers don't even exist at all. It's all just a big conspiracy like vaccines and a round earth.

I have a 20+ year career in computer software & hardware.  You'll have to come up with a real argument I'm afraid.

Ok, so make a firm assertion I can argue against. They were private games? Ok, I agree.

Google manipulated it in such a way that the weaker engine won? I disagree.

You are agreeing with verified reality and disagreeing with something you have no real information about... and you want me to have a firm assertion?  

You're also not reading carefully.  I am not asserting that A0 is weaker than Stockfish, merely that it's not strong enough to play Stockfish straight up publicly without a risk of losing the match to the latest asmFish build or something.

So, your assertions for arguing against would actually be:

1. they were private games, done only for testing, supposedly

2. nobody knows which engine is actually better until Google comes out of the closet

3. Google is making press releases claiming crushing victories in an implied match environment anyway

 

Ghost_Horse0
btickler wrote:

You are agreeing with verified reality and disagreeing with something you have no real information about... and you want me to have a firm assertion?  

Well if it's "verified reality" then I should agree with it shouldn't I?

Heh, but ok, you're saying I agree with google without having all the data so the onus is on me to back up my claims.

Ok, fair enough, we can't test A0

But Leela seems to be vindication for A0.

FWIW I have issues with the A0 - Stockfish matches too, and argued with Elroch about it in the past... but yes, I do think NN engines are currently better than the best alpha-beta engines, and if not now then they will be soon.

DiogenesDue
Elroch wrote:

What? The TCEC CUP-2 matches are a minimum of 8 games. Unlike the WCC, the tiebreaks (which are played in pairs) keep the same time control right up to 32 games (almost always enough).

  1. If a match is tied after its scheduled regular 8 games, pairs using the same book exit for both sides will be played until a decisive pair occurs. The book exits will be from the randomized book used in that phase of the CUP (A or B), up to a maximum of 4 pairs of games.
  2. If after playing in this way, no winner ensues, more pairs of games will be played, after each of which a match winner may ensue. From this point on, so from game 17 onwards, BOOK C will be used (the Superfinal book) with each playing both sides of the openings, for a maximum of 8 pairs of games, so a maximum of a further 16 games to decide a winner, with a new book.
  3. If even after these 32 games, a match is still drawn, further pairs of games will be played with BOOK C to determine a winner, but the time control (TC) will be shorter with each pair of games, according to the following steps (always indicated as minutes base time + seconds increment per move completed, so e.g. 30+5 means 30 minutes base time per game plus an increment of 5 seconds per move completed): 16+4, 8+3, 4+2, 2+1, 1+1. If even after this sequence of pairs of games with shorter TC the match is tied, the increment will remain at 1 second, but the base time will then become even shorter than one minute, in the following manner: 32s+1, 16s+1, 8s+1, 4s+1, 2s+1 and finally 1s+1 will be played until a decisive pair is reached.

 

Oh gee, sorry, I only used the single-elimination match descriptor from the article you linked...better send them an email or something?  Misleading irony is misleading.

DiogenesDue
Ghost_Horse0 wrote:
btickler wrote:

You are agreeing with verified reality and disagreeing with something you have no real information about... and you want me to have a firm assertion?  

Well if it's "verified reality" then I should agree with it shouldn't I?

Heh, but ok, you're saying I agree with google without having all the data so the onus is on me to back up my claims.

Ok, fair enough, we can't test A0

But Leela seems to be vindication for A0.

FWIW I have issues with the A0 - Stockfish matches too, and argued with Elroch about it in the past... but yes, I do think NN engines are currently better than the best alpha-beta engines, and if not now then they will be soon.

Then we are in agreement.  I posted back in 2014 or so that traditional engines would eventually fall to engines that bootstrap their own knowledge of chess, dropping the opening databases and eliminating the evolved biases and errant valuations that human beings have built up over the centuries, by the sheer crushing force of billions and billions of games played.  (Even today, traditional engines communicate their evaluations through the fundamentally flawed centipawn, while machine learning engines have no human-imposed value framework.)
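On the centipawn point: NN engines internally evaluate something closer to a win expectation and only translate it into centipawns for display. The sketch below shows one generic logistic mapping between the two; the functional form and the scale constant are illustrative assumptions, not the calibrated conversion that Lc0 or any other engine actually uses.

```python
import math

def win_prob_to_centipawns(p, scale=400.0):
    """Map a win probability in (0, 1) to a centipawn-like score (illustrative mapping)."""
    p = min(max(p, 1e-6), 1 - 1e-6)  # clamp to avoid log of 0 or division by 0
    return scale * math.log10(p / (1 - p))

def centipawns_to_win_prob(cp, scale=400.0):
    """Inverse of the mapping above: centipawns back to a win probability."""
    return 1.0 / (1.0 + 10.0 ** (-cp / scale))

print(round(win_prob_to_centipawns(0.60)))    # ~ +70 cp
print(round(centipawns_to_win_prob(100), 2))  # ~ 0.64
```

The direction of the conversion is the point: the learned value is the native quantity, and the centipawn number is just a familiar skin put on top of it for humans.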

I have no problem cheering A0 and Leela on.  It's Google's methodology, which is in conflict with their corporate core values btw, that I have a problem with.

Ghost_Horse0

Nice observation. In 2014 I certainly hadn't thought of that.

And yeah, seems we agree.

Elroch
btickler wrote:
Elroch wrote:

What? The TCEC CUP-2 matches are a minimum of 8 games. Unlike the WCC, the tiebreaks (which are played in pairs) keep the same time control right up to 32 games (almost always enough).

  1. If a match is tied after its scheduled regular 8 games, pairs using the same book exit for both sides will be played until a decisive pair occurs. The book exits will be from the randomized book used in that phase of the CUP (A or B), up to a maximum of 4 pairs of games.
  2. If after playing in this way, no winner ensues, more pairs of games will be played, after each of which a match winner may ensue. From this point on, so from game 17 onwards, BOOK C will be used (the Superfinal book) with each playing both sides of the openings, for a maximum of 8 pairs of games, so a maximum of a further 16 games to decide a winner, with a new book.
  3. If even after these 32 games, a match is still drawn, further pairs of games will be played with BOOK C to determine a winner, but the time control (TC) will be shorter with each pair of games, according to the following steps (always indicated as minutes base time + seconds increment per move completed, so e.g. 30+5 means 30 minutes base time per game plus an increment of 5 seconds per move completed): 16+4, 8+3, 4+2, 2+1, 1+1. If even after this sequence of pairs of games with shorter TC the match is tied, the increment will remain at 1 second, but the base time will then become even shorter than one minute, in the following manner: 32s+1, 16s+1, 8s+1, 4s+1, 2s+1 and finally 1s+1 will be played until a decisive pair is reached.

 

Oh gee, sorry, I only used the single-elimination match descriptor from the article you linked...better send them an email or something?  Misleading irony is misleading.

They are surely at fault for not sending you an e-mail to explain that a match is not a game.

DiogenesDue
Elroch wrote:

They are surely at fault for not sending you an e-mail to explain that a match is not a game.

This article you linked does the exact same thing Google did: it hypes TCEC Cup-2 as "single elimination" in a misleading way to make the format seem more exciting, by lying about it.  Oh, sorry, I guess I should just call it "marketing" instead of misleading, even though the terms are pretty much interchangeable at a functional level.  AlphaZero is to "official match" as TCEC Cup-2 is to "single elimination".

Western society has reached a dangerous milestone where outright lying in advertising is expected and accepted as the norm.  At its most basic level, fraud is defined by motivation, not by degree, i.e. not by how many people are misled or whether they care that they were misled.