A 3000 could easily beat a 2000, but could a 4000 easily beat a 3000?

llama

I'm not sure what you mean, but we can probably agree that if you want to test your engine vs another person's, then it would be better to have each side choose the hardware and settings they believe are optimal, and then if both sides agree they can have a match.

Not some silly match where you choose the hardware and settings for the opponent, play all the games in secret, and only release a few of them (as the AZ team did).

AFAIK NN engines like Leela calculate slower (in terms of positions per second)... but that's not a defect, it's just how it operates.

drmrboss
MARattigan wrote:
llama wrote:
Elroch wrote:

Stockfish looked pretty imperfect against Alphazero.

Sure, but imperfection is a low bar.

More to the point, Stockfish looked pathetic after they released an extremely limited set of data which put their product in the best light. People forget that AZ's "crushing" victory was a measly (IIRC) 62% even though SF had questionable hardware -- good enough to be rated only 100 points higher. Overwhelmingly the games were drawn.

I think the hardware has everything to do with it. 

LC0 uses the same approach as AZ and is usually regarded as strong, but it also usually runs with a GPU that gives it hundreds or thousands of extra processors compared with probably four used by SF.

I recently tried downloading an LC0 version that runs without video card to see if it played basic endgames any better than SF on the same hardware. The answer was that LC0 is totally useless with the same hardware.

There is no point in judging a software architecture built to run on certain hardware by running it on inappropriate hardware.

Lc0 relies heavily on a neural network, which requires parallel matrix calculations (a GPU's strength).

Stockfish is built on AB search, which requires sequential computing (a CPU's main strength).

 

If you are running Lc0 on a CPU, you are asking one big 10-ton truck to deliver 1,000 envelopes to 1,000 customers, whereas a GPU is like 1,000 bikers riding in parallel to deliver 1,000 parcels simultaneously. That is why Lc0 is far slower on a CPU than on a GPU.

 

If you don't have a GPU, don't use Lc0, period. It will be 200-400 Elo weaker than Stockfish.

MARattigan

The fact is that the video card adds a lot of grunt so comparing a program that uses the GPU with one that doesn't is a non comparison. 

drmrboss
MARattigan wrote:

The fact is that the video card adds a lot of grunt so comparing a program that uses the GPU with one that doesn't is a non comparison. 

AMD already builds combined GPU/CPU chips. Look at technology websites.

Within 5 years, a combined GPU and CPU will be built into most basic hardware, because of increasing graphics demands.

If you don't have a graphics card, that is your problem, but it is not a fair basis for comparison. Don't judge the speed of a whale struggling on land; the whale will be much faster in water.

MARattigan

So can LC0 play KNNKP if you do have a video card? Otherwise it's not worth the investment because SF already beats me on full games anyway. I'm looking for something to practise basic endgames against that's nearly but not quite perfect.

drmrboss
MARattigan wrote:

So can LC0 play KNNKP if you do have a video card? Otherwise it's not worth the investment because SF already beats me on full games anyway. I'm looking for something to practise basic endgames against that's nearly but not quite perfect.

It is your choice and your money. Invest in whatever you like.

 

But it is already known that Stockfish and Lc0 are the two best engines. People who can afford a GPU use both, as each has its own strengths and weaknesses.

Mako_Cat
Brixed wrote:
EndgameStudier wrote:

A 3000 could easily beat a 2000, but could a 4000 easily beat a 3000?

It'd be the same result.

Statistically speaking, in a long match between a 3000 and a 4000, the 4000 would win hundreds of games in a row before the 3000 won a single game (1 win for every 315 losses, to be precise).

The 3000 would certainly put up a stronger fight (and the games would last longer) than a 2000 vs. 3000, but the 4000 would still dominate all the same.

Yes but how many thousands of draws would there be. The result at the end of the match would be ridiculous. Like 315-1-18385
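For anyone who wants to check the "1 for every 315" figure, here is a minimal sketch assuming the standard Elo logistic formula with its usual 400-point scale (the function name is just illustrative). It computes the expected score of a 4000 against a 3000 and the matching odds.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for A against B (win = 1, draw = 0.5, loss = 0).

    Assumes the standard Elo logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


e = expected_score(4000, 3000)
odds = e / (1.0 - e)  # points scored by the 4000 per point scored by the 3000

print(f"expected score for the 4000: {e:.4%}")
print(f"odds: about {odds:.0f} to 1")
```

Note that the formula only predicts the total score. It says nothing about how that score splits into wins and draws, which is where the rest of this discussion comes in.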

llama
Mako_Cat wrote:
Brixed wrote:
EndgameStudier wrote:

A 3000 could easily beat a 2000, but could a 4000 easily beat a 3000?

It'd be the same result.

Statistically speaking, in a long match between a 3000 and a 4000, the 4000 would win hundreds of games in a row before the 3000 won a single game (1 win for every 315 losses, to be precise).

The 3000 would certainly put up a stronger fight (and the games would last longer) than a 2000 vs. 3000, but the 4000 would still dominate all the same.

Yes but how many thousands of draws would there be. The result at the end of the match would be ridiculous. Like 315-1-18385

You're getting confused because the numbers are big.

Let's say we play 50 games and I win only one of them (and we draw the other 49).

Would you feel like I'm a lot better than you? Of course not. We would both agree we're basically the same strength.

But let's say we did this every day for a whole year!

Now instead of 1 win and 49 draws I have 365 wins and almost 18,000 draws.

Has anything changed? No. Don't be fooled by larger numbers, we're still almost exactly the same strength. In fact I'd be rated only about 7 rating points higher than you.
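A rough way to check this, assuming the standard Elo logistic formula inverted to turn a score fraction into a rating gap (the helper name is made up for illustration). The gap is the same single-digit number for 50 games as for a whole year of them, because only the score fraction matters.

```python
import math


def rating_gap(score_fraction: float) -> float:
    """Rating difference implied by a score fraction (inverse of the Elo curve)."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# One day: 1 win, 49 draws, no losses.
wins, draws = 1, 49
gap_50 = rating_gap((wins + 0.5 * draws) / (wins + draws))

# The same thing every day for a year: 365 wins, 17885 draws.
wins_year, draws_year = 365 * wins, 365 * draws
gap_year = rating_gap((wins_year + 0.5 * draws_year) / (wins_year + draws_year))

print(f"gap after 50 games:  {gap_50:.2f} points")
print(f"gap after one year:  {gap_year:.2f} points")
```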

---

If a 3000 and 4000 played 18,000 games (and if their ratings were accurate, and if their results reflected their skill as predicted by Elo) then the minimum number of wins out of 18,000 would be 17,886 (meaning the other 114 were drawn).

Anything less than 17886 wins would mean the 4000 would lose rating points.

Mako_Cat

Yes you are confusing me a bit. I was just saying that in @Brixed’s comment he said that the 4000 would win 315 games compared to the 3000’s 1 win. I am agreeing with you and saying that that doesn’t mean the 4000 is really much better. Because there would be so many thousands of draws.

llama

I'm saying that the maximum share of draws in an 18000-game match is less than 1%. Otherwise the 4000 would lose rating points.

The maximum value is a little larger than half a percent.

Mako_Cat

We’ll suppose they are playing unrated games. We only care about their rating at the beginning of the match, not the end.

llama

So no, there would not be many thousands of draws. In an 18000-game match, the maximum number of draws is 114.

In other words, if there is even one more draw, i.e. if there are 115 draws out of 18000 games, then the 4000 rated player's rating will go down. Even if the 4000 player wins all the other games, drawing 115 is too many.

MARattigan
drmrboss wrote:
MARattigan wrote:

So can LC0 play KNNKP if you do have a video card? Otherwise it's not worth the investment because SF already beats me on full games anyway. I'm looking for something to practise basic endgames against that's nearly but not quite perfect.

It is your choice and your money. Invest in whatever you like.

 

But it is already known that Stockfish and Lc0 are the two best engines. People who can afford a GPU use both, as each has its own strengths and weaknesses.

One of SF8's weaknesses is it can't play KNNKP. SF11 in the guise of Lichess level 8 can't play it either, but it can't play it a lot better than SF8 can't play it, so maybe given a reasonable time control it could.

Does anybody know whether either SF11 or LC0 with GPU can play the endgame given reasonable time controls?

sabertooth1410

guys how do u do so much math? i suck

at it

 

llama
Mako_Cat wrote:

We’ll suppose they are playing unrated games. We only care about their rating at the beginning of the match, not the end.

Rating predicts results. Rating is only adjusted when the results don't match the prediction.

If the 4000 and 3000 players are 100% accurately rated, then this will be their result.

Now... in the real world that's not quite how it works. Ratings formulas weren't meant to predict players with such a large skill gap. Probably the 3000 rated player would score a little better than what the pure math predicts... but as I said before, if you're drawing many thousands of games and only winning a few hundred, then the two players are equal in rating and skill.

---

Now, if we want to assume the ratings are inaccurate for one reason or another, then of course anything could happen. For example 18000 unrated games, and the 3000 player slowly learns and improves during the match but the 4000 does not. Or the 4000 gets sick. Etc.

But if the ratings are accurate, and no funny business, then there will be thousands of wins, and only a few draws. E.g. only a little more than 100 draws in a 18000 game match.
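To make "rating is only adjusted when the results don't match the prediction" concrete, here is a small sketch. It assumes the standard Elo logistic formula and treats the whole match as played at the starting ratings; the draw-heavy result is a hypothetical example, and real rating systems multiply this difference by a K-factor that is not modelled here.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected per-game score for A against B under the Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def surplus_per_game(rating, opponent, wins, draws, losses):
    """Actual minus expected score, averaged per game.

    An Elo-style update is proportional to this number: positive means
    the rating goes up, negative means it goes down.
    """
    games = wins + draws + losses
    actual = (wins + 0.5 * draws) / games
    return actual - expected_score(rating, opponent)


# Result in line with the prediction: a little more than 100 draws.
matching = surplus_per_game(4000, 3000, wins=17887, draws=113, losses=0)

# Hypothetical draw-heavy result: a few hundred wins, thousands of draws.
draw_heavy = surplus_per_game(4000, 3000, wins=315, draws=17684, losses=1)

print(f"matching result:   {matching:+.6f} per game")
print(f"draw-heavy result: {draw_heavy:+.6f} per game")
```

The first result leaves the 4000's rating essentially untouched, while the second one falls far short of the prediction, so the 4000 would shed rating points game after game.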

Mako_Cat
llama wrote:

So no, there would not be many thousands of draws. In an 18000-game match, the maximum number of draws is 114.

In other words, if there is even one more draw, i.e. if there are 115 draws out of 18000 games, then the 4000 rated player's rating will go down. Even if the 4000 player wins all the other games, drawing 115 is too many.

So you are saying that Mr. 4000 can draw 114 games at most to maintain his rating. I am just saying that in (say, an unrated match) Mr. 3000 would draw a lot more than 114/18,000 games. Mr. 4000 might win 300 more games than his opponent, but that doesn’t matter. His win percentage would still be only about 1.7%, meaning that the 4000 player isn’t really much better than the 3000 player. (And yes, at the end of the match the two players' ratings would be about the same if they were playing a rated match.)

MARattigan

"For example 18000 unrated games, and the 3000 player slowly learns and improves during the match but the 4000 does not. Or the 4000 gets sick. Etc."

After 18000 games he probably would get sick of it.

Mako_Cat
llama wrote:
Mako_Cat wrote:

We’ll suppose they are playing unrated games. We only care about their rating at the beginning of the match, not the end.

Rating predicts results. Rating is only adjusted when the results don't match the prediction.

If the 4000 and 3000 players are 100% accurately rated, then this will be their result.

Now... in the real world that's not quite how it works. Ratings formulas weren't meant to predict players with such a large skill gap. Probably the 3000 rated player would score a little better than what the pure math predicts... but as I said before, if you're drawing many thousands of games and only winning a few hundred, then the two players are equal in rating and skill.

---

Now, if we want to assume the ratings are inaccurate for one reason or another, then of course anything could happen. For example 18000 unrated games, and the 3000 player slowly learns and improves during the match but the 4000 does not. Or the 4000 gets sick. Etc.

But if the ratings are accurate, and no funny business, then there will be thousands of wins, and only a few draws. E.g. only a little more than 100 draws in a 18000 game match.

But I think that at this level the two players are playing near-perfect moves. Even though the math says a player rated 1000 points higher should score about 99.7%, that doesn’t apply if both players play perfectly.

llama
Mako_Cat wrote:
llama wrote:
Mako_Cat wrote:

We’ll suppose they are playing unrated games. We only care about their rating at the beginning of the match, not the end.

Rating predicts results. Rating is only adjusted when the results don't match the prediction.

If the 4000 and 3000 players are 100% accurately rated, then this will be their result.

Now... in the real world that's not quite how it works. Ratings formulas weren't meant to predict players with such a large skill gap. Probably the 3000 rated player would score a little better than what the pure math predicts... but as I said before, if you're drawing many thousands of games and only winning a few hundred, then the two players are equal in rating and skill.

---

Now, if we want to assume the ratings are inaccurate for one reason or another, then of course anything could happen. For example 18000 unrated games, and the 3000 player slowly learns and improves during the match but the 4000 does not. Or the 4000 gets sick. Etc.

But if the ratings are accurate, and no funny business, then there will be thousands of wins, and only a few draws. E.g. only a little more than 100 draws in a 18000 game match.

But I think that at this level the two players are playing near-perfect moves. Even though the math says a player rated 1000 points higher should score about 99.7%, that doesn’t apply if both players play perfectly.

This is another way of saying their ratings are inaccurate.

There really may be a ceiling for a perfect player that is lower than 4000. This has been talked about before a lot in different forums (not on chess.com).

For example let's say a 32 man EGTB (which is a database that will tell you "mate in #" or "draw" for every legal move in every possible position) has a rating of 3500. This would mean the 4000 rated player's rating is inaccurate, so of course there will be many more draws than predicted.

But assuming the ratings are inaccurate is silly. The question was about a 3000 vs a 4000, and in that case if the ratings are accurate then the overwhelming majority of games will be won by the 4000 rated player.

llama
sabertooth1410 wrote:

guys how do u do so much math? i suck

at it

It helps a lot that I'm in my 30s and I paid attention in math class as a kid.

If I were 13 years old and reading this forum topic I'd have no freaking idea how someone could calculate something so amazing.

But it's actually not that hard.

I'm even using Elo instead of Glicko to make it even easier on myself heh. It takes like... 1 or 2 minutes.

So yeah, maybe one day you'll be doing this (or something a lot more complicated)

 

And since I rounded it's actually 17887 wins and 113 draws (no losses) which... doesn't really matter but started to annoy me so I'm mentioning it now lol.
I mean, either way, the difference is so small in terms of percentage that the rating update would be zero... but if we're counting millionths of a rating point...
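For anyone who wants to redo the 1 or 2 minutes of math, here is a minimal sketch. It assumes the standard Elo logistic formula, ratings held fixed at 4000 and 3000 for the whole match, and no losses for the 4000; the variable names are just illustrative.

```python
import math

GAMES = 18000

# Expected per-game score of the 4000 against the 3000 (Elo logistic curve).
expected = 1.0 / (1.0 + 10.0 ** ((3000 - 4000) / 400.0))
expected_points = GAMES * expected

# With no losses, every draw costs half a point off a perfect score:
#   points = GAMES - draws / 2  >=  expected_points
# so the number of draws can be at most 2 * (GAMES - expected_points).
max_draws = math.floor(2.0 * (GAMES - expected_points))
min_wins = GAMES - max_draws

print(f"expected points: {expected_points:.2f} out of {GAMES}")
print(f"at most {max_draws} draws, so at least {min_wins} wins")
print(f"draw share: {max_draws / GAMES:.2%}")
```

One more draw than `max_draws` drops the score just below the expected points, which is the point at which the 4000 would start losing rating.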