Can the top players beat the best computers anymore?

justaknight_jak
bondocel wrote:

Yelena Dembo might know such a strategy.


You have to congratulate Yelena for her great performance at the Olympiad. Her win against the US's Irina Krush should convince anyone of her abilities! As for the topic, I believe that no, top players cannot beat computers anymore, but they can always program them! :)

Niven42

Charles Galofre's opinion on the matter pretty much settled it for me.

chessroboto
Elroch wrote:
chessroboto wrote:
Elroch wrote:

Is there a limit to the standard of play in terms of ratings, or will they continue to increase to 4000+ in the coming decades? Other evidence would be if games between computers of a similar level produce an increasing fraction of draws as the computers get stronger (as is the case for humans), indicating that they are approaching a level adequate to avoid losing to each other.

If there is some sort of threshold, it may be possible with enough (human) effort to reach a level which is adequate to draw against engines much of the time, even as they increase in speed.

We know that a rating increases with each additional win against equally rated opponents. As long as the engine keeps winning, its rating will continue to rise. Even when the second-best opponent does not increase in rating, the winner's rating will merely rise more slowly; it will not stop rising as long as the engine keeps playing.

This is only true if the stronger computer wins all of the time. Even the best chess computers at present lose a few games, and draw a lot more, against their lower-rated opposition. Given an adequate number of games, a computer finds its true rating level and maintains it. For ratings to rise, engines need to get stronger, not merely to play more.


Unfortunately, we have not seen enough competitions in which the same configurations of chess engines are allowed to play for extended periods of time (e.g. continuous matches for at least a year). Without that data, I cannot back my theory that continuous play amongst the same chess engines would yield continuously increasing ratings.

The only thing that we know for sure is that chess engine algorithms and opening books have been improving every year, which is why we see a steady increase in their ratings.
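
For illustration, here is a toy sketch of the standard Elo update with two engines of fixed true strength playing only each other. The 170-point true gap, the starting ratings and the K-factor are all assumed figures, and draws are ignored for simplicity. Under those assumptions the rating gap settles near the true gap and then stops growing, which is Elroch's point: more play alone does not raise a rating.

```python
import random

def expected_score(r_a, r_b):
    """Standard Elo expectancy of player A scoring against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def simulate(games=5000, k=10, true_gap=170):
    """Both engines start at 2800; A is 'true_gap' points stronger in truth."""
    rating_a, rating_b = 2800.0, 2800.0
    p_win = expected_score(true_gap, 0)   # A's true winning chance, fixed
    for _ in range(games):
        score_a = 1.0 if random.random() < p_win else 0.0
        exp_a = expected_score(rating_a, rating_b)
        rating_a += k * (score_a - exp_a)
        rating_b -= k * (score_a - exp_a)  # Elo updates are zero-sum
    return rating_a, rating_b

random.seed(1)
print(simulate())  # the gap settles near true_gap and then stops growing
```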

SteveCollyer
notlesu wrote:
Elroch wrote:

notlesu, I am reminded of the multiple times someone who has just been crushed in a blitz game by me cries "computer". It was explicitly a match between a human team and a computer, with no assistance. There is no need to make (joking) accusations to slight the achievement of those who put in a lot more effort than I did (see the discussion).


Elroch, you say---"It was explicitly a match between a human team and a computer, with no assistance."

Elroch, looks like you and the team have been busted---

{ Game Summary }

{ White: The Chess.com Alliance }
{ Top 1 Match: 23/29 ( 79.3% ) }
{ Top 2 Match: 25/29 ( 86.2% ) }
{ Top 3 Match: 26/29 ( 89.7% ) }
{ Top 4 Match: 27/29 ( 93.1% ) }

Elroch, don't you think you should apologize for your little fib? After all, you've hung a cloud of suspicion over all your teammates. A simple, sincere "I'm sorry" will do.


As I say, the 11 times the team elected to play the top engine choice when there were 3 other candidate moves all within a few centipawns of it, and the fact that there were only 2 very minor -0.20 inaccuracies, are perhaps more interesting factors than match-up percentages for this single game.

Elroch

SteveCollyer, if you were a chess.com player, your views might be worth reading. Your unrated standing in all classes of play may give you the confidence to shout at the tv from your couch, but it gives your opinions no weight.

orangehonda
Elroch wrote:

Take a look at the discussion and check the lines presented against computer lines. You will find disagreements, and (non-critical) errors in evaluation that do not support your claim.

It is amusing that in the period when the agreement was best (the 10-move sequence), the computer showed its imperfection, as the assessment at the end was very different from the one at the start.

I know I would not even consider looking at a computer evaluation during a game like this, and the same is true for any chess player I respect - what would the point be?


Maybe sloughterchess could offer an explanation.

slvnfernando

I want to play chess with another human being, whether OTB or online! We have enough machines coming into all areas of life. Not into chess too, please!

SteveCollyer
Elroch wrote:

SteveCollyer, if you were a chess.com player, your views might be worth reading. Your unrated standing in all classes of play may give you the confidence to shout at the tv from your couch, but it gives your opinions no weight.


Deep Rybka 3 on a quad-core does though, matey.

Elroch
notlesu wrote:


{ White: The Chess.com Alliance }
{ Top 1 Match: 23/29 ( 79.3% ) }
{ Top 2 Match: 25/29 ( 86.2% ) }
{ Top 3 Match: 26/29 ( 89.7% ) }
{ Top 4 Match: 27/29 ( 93.1% ) }

{ Game Summary }

Elroch, don't you think you should apologize for your little fib? After all, you've hung a cloud of suspicion over all your teammates. A simple, sincere "I'm sorry" will do.


@notlesu, you seem to be a fairly competent blitz player, but perhaps you don't realise that much higher standards can be achieved through thorough analysis. And those standards improve further when several players have time to present and discuss their differing views.

It is you who should be humbly begging for forgiveness for having cried "computer!" on the basis of statistics that show considerable disagreement with computer moves. How do you explain the large number of disagreements, including 2 even against the top 4 computer moves? Read the discussion and see how the team argued about many of the moves, and the differing assessments that were put forward before some sort of consensus was reached. I have no reason to believe that any particular person was using assistance, but even if one person had been, there were many different people putting forward their differing views, and different views prevailed on different moves.

And again, I simply can't see why anyone would bother to waste their time presenting computer-aided analysis in a game between a human team and a computer. It totally misses the point.

Elroch
chessroboto wrote:

Unfortunately, we have not seen enough competitions in which the same configurations of chess engines are allowed to play for extended periods of time (e.g. continuous matches for at least a year). Without that data, I cannot back my theory that continuous play amongst the same chess engines would yield continuously increasing ratings.

The only thing that we know for sure is that chess engine algorithms and opening books have been improving every year, which is why we see a steady increase in their ratings.


Yes, we would expect to, but the largest factor is probably the increase in processing power, followed by improvements in the algorithms.

Elroch
SteveCollyer wrote:
Elroch wrote:

SteveCollyer, if you were a chess.com player, your views might be worth reading. Your unrated standing in all classes of play may give you the confidence to shout at the tv from your couch, but it gives your opinions no weight.


Deep Rybka 3 on a quad-core does though, matey.


No, Steve, even assisted by Deep Rybka, your opinions are those of a non-playing spectator.

SteveCollyer

I'm certainly not saying that the chess.com alliance cheated in that game; as you say, Elroch, what would be the point?

I just say that Deep Rybka 3 on my system likes the way your team tackled the engine opposition in terms of its own choices of move, especially the 11 examples where your team chose the top move when there was a negligible difference in evaluation between all top 4 moves.

The analysis I posted is just about as transparent as possible, giving the overall analysis conditions, the score at each ply for each of the top 4 moves, the depth reached, and so on.

Making patronising comments about me or anyone else you disagree with seems a bit petulant, tbh.

Why not post some evidence that contradicts my analysis instead? That might be a bit more constructive.

Elroch

@SteveCollyer, sorry, I incorrectly thought you were implying that cheating had taken place.

In any case, any such claim has to be examined in a logical manner, not in a knee-jerk way.

Firstly, there was certainly no conspiracy to use computer assistance, so any accusation can only be against one or more of the individual players. The game was played in a fairly well-organised way, in which players first put forward their views as analysis or positional ideas without voting. These views often disagreed, which led to further discussion. It had been agreed that no voting would take place until there had been a discussion and some sort of consensus, as this approach clearly achieves the best standard of play possible.

If any accusation is made, it has to be against one or more of the individual players, based on their suggestions, not on the consensus that was finally reached after the discussion. But it is best to state hard facts rather than to draw inferences.

The entire discussion is (I believe) publicly accessible, so please feel free to analyse the contributions of any of the individual people who contributed.

I would have expected a lower degree of agreement with the computer's choices, especially when there are moves with very similar computer assessments, but the fluky-looking level of agreement is very unlikely to be statistically significant.

Elroch

@notlesu, I believe you are making the elementary error of comparing statistics from many games with statistics from one game. This is analogous to tossing a coin a few times and pronouncing it biased. Worse, I strongly believe you fabricated the statistical claim you made about the three top players. As an interesting example of far better agreement with computer choices, Kramnik achieved 87% agreement with the 1st choice of Fritz in one over-the-board game, and still managed to make significant errors. Only an idiot would claim he was cheating. Actually, one did.

If a similar level of agreement were achieved over many games, it might be possible to reach a different conclusion, but it would be a mistake to assume the same level of agreement would persist. Chess.com rightly requires many games to identify whether computer assistance has been used.
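
To put rough numbers on the coin analogy, here is a quick sketch using an exact binomial tail probability. The 65% baseline rate of agreement with the engine's first choice is purely an assumed figure for illustration (remember Kramnik managed 87% in one game):

```python
from math import comb

def tail_probability(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k matches by luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

BASE_RATE = 0.65  # assumed honest top-1 agreement rate for a strong team

# One game: 23 first-choice matches in 29 moves.
print(tail_probability(29, 23, BASE_RATE))    # ~0.07: unremarkable for one game
# The same rate sustained over ten such games: 230 matches in 290 moves.
print(tail_probability(290, 230, BASE_RATE))  # ~2e-7: that would be real evidence
```

A match rate that is unremarkable in one game becomes overwhelming evidence when sustained over ten, which is exactly why many games are needed before drawing any conclusion.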

bondocel
justaknight_jak wrote:
bondocel wrote:

Yelena Dembo might know such a strategy.


You have to congratulate Yelena for her great performance at the Olympiad. Her win against the US's Irina Krush should convince anyone of her abilities!


The recent Olympiad was simply a disaster for Yelena compared with the level she showed here. Her games there were full of tactical mistakes, while here she made zero tactical errors in dozens of games.

bondocel
notlesu wrote:

Let's look at the facts---Anand, Kramnik, and Carlsen (TOGETHER) couldn't come up with a 93% match to the top 4 moves of Deep Rybka. And you and your Chess.com teammates did?


That's the beauty of internet chess, isn't it :)

Elroch

What, people fabricating statistics off the tops of their heads and posting them in forums? Take a look at my reply to that post. Analysis of Kramnik's games shows he often gets better agreement with computers. Even without help from Anand and Carlsen. :D

SteveCollyer

The headline match-up percentages mean very little in a single game. What is more interesting is how often, and by how much, the played moves deviate from the top engine choice in terms of score, along with the number of inaccuracies in positions where there are several reasonable candidate moves. Humans tend to make inaccuracies fairly regularly, whereas an engine won't.
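
For anyone who wants to run that kind of check themselves, here is a rough sketch. The evaluation pairs are made-up placeholders; real ones would come from the engine's analysis output, and the 20-centipawn inaccuracy threshold is just an assumed cutoff:

```python
# Each pair: (eval of engine's top choice, eval of the move actually played),
# both in centipawns from the mover's point of view. Placeholder data only.
moves = [(35, 35), (40, 31), (12, 12), (55, 35), (20, 18), (30, 30)]

INACCURACY = 20  # assumed threshold in centipawns

losses = [best - played for best, played in moves]
average_loss = sum(losses) / len(losses)
inaccuracies = sum(1 for loss in losses if loss >= INACCURACY)

print(f"average centipawn loss: {average_loss:.1f}")
print(f"moves at least {INACCURACY}cp below the top choice: {inaccuracies}")
```
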
Elroch
notlesu wrote:
Elroch wrote:

What, people fabricating statistics off the tops of their heads and posting them in forums? Take a look at my reply to that post. Analysis of Kramnik's games shows he often gets better agreement with computers. Even without help from Anand and Carlsen.


 Of course, Kramnik was accused of cheating.

But---facts are facts---

{ Top 1 Match: 23/29 ( 79.3% ) }
{ Top 2 Match: 25/29 ( 86.2% ) }
{ Top 3 Match: 26/29 ( 89.7% ) }
{ Top 4 Match: 27/29 ( 93.1% ) }

23 of 29 of your team's moves coincided with Deep Rybka 3's top choice. I am no expert like Mr. Collyer---but it seems to me that either your group was composed of super-elite GMs or---you had computer assistance. I think the evidence is overwhelming!

Just pay the two dollars.


@notlesu, I don't donate to beggars, however menacing they get. If you can play the banjo, your chances might improve.

Your conclusions are all incorrect, as the sample is too small. If similar results were achieved over 10 games, it might be a different matter, but that is a purely hypothetical situation.

If you had as good an understanding of statistics as those who write the chess.com routines to catch cheaters, you'd realise this. But presumably your gut tells you otherwise.

Here's a summary of how the decisions were reached on every move where Shredder thought there was an alternative no more than 0.1 away. It is interesting that in several cases the move played seemed clearly better to human players, even though Shredder considered the alternatives close. In several other cases we had no real agreement, and the fact that Shredder's first choice was chosen was really a fluke.

15.Bxf3   6:5. Indecisive discussion, with more argument for the move not played; the more natural move was chosen by more people.

16.Rb1    3 moves discussed, a lot of disagreement, but in the end we all went with the move any blitz player would probably have played :-)

18.e4     A 3:3:2:1 tied vote happened to go with the move played.

19.Bh5    We all agreed with the most aggressive and forcing move once it had been suggested (after I had suggested the timid-looking and probably inferior Bg2). A good example of how team discussion avoided individual errors.

20.Be2    By far the most natural move, a tempo move, agreed by (almost) all after two suggestions.

21.Bd3    Planned on the previous move, using the tempo to reroute the bishop. A natural move, lining up pieces at the black king in a typical way.

23.Qg4    Three moves routing the queen to the attack were discussed, and the one played seemed "by far the strongest", since it was the only one that maintained the initiative. It turns out Shredder thought it was only 0.09 better, but it may have been wrong, as we got a lot of our disadvantage back over the next 10 moves.

24.Qh4    Compelling move forcing a weakening of the black kingside. No disagreement.

25.Bh6    Our kingside attacking plan was on autopilot here (but we didn't see that black could escape a few moves later by getting behind us on the queenside).

26.Qg4    Pretty obvious continuation of the attack, once we (eventually) realised the other candidate move, letting our h-pawn advance, had a tactical flaw.

27.Rfe1   My inferior suggestion got snubbed (again) for a better move from Ivan, but we missed black's counterplay at this point, and Ivan and some others thought we were probably winning. (It turns out Shredder still considered us to be slightly worse.)

31.Qd1    After black's counterplay, we were looking for the draw, and discussed two moves to go for it without reaching a consensus. The move that ended up being played after a close vote was the more natural tempo move, now that our kingside attack had been disrupted. The discussion was more positional than analytical.

32.Bf4    The move we played seemed urgent to everyone: Bf4 obstructed the black queen, allowing us f3 to kick the black bishop and free our rook. Computers make such simple decisions complicated :-)

37.Kg2    The game was heading inevitably for a draw, and the disagreement about which square to put the king on was not important. (It turns out Shredder basically agreed.)

chessroboto

Talk about blowing things out of context!

This forum topic has been locked