Can the top players beat the best computers anymore?

ozzie_c_cobblepot
Elroch wrote:
SteveCollyer wrote:
Elroch wrote:

SteveCollyer, if you were a chess.com player, your views might be worth reading. Your unrated standing in all classes of play may give you the confidence to shout at the tv from your couch, but it gives your opinions no weight.


Deep Rybka 3 on a quad-core does though, matey.


No, Steve, even assisted by Deep Rybka, your opinions are those of a non-playing spectator.


So what would give these opinions a greater level of weight? Would you need one or several titled players to concur?

bondocel
Elroch wrote:
Here's a summary of how the decisions were reached in every move where Shredder thought there was an alternative no more than 0.1 different. 

Things look quite clear: two computers shuffled some pieces around and in the end they agreed to a draw. Monkeys watched their moves and said "yes! that is the move I would have chosen myself". At no point did White have a serious attack. One has to take risks and push pawns to open files, not simply play Qg4 and say "I have a compelling attack".

Elroch

@bondocel, it may protect your ego to make an accusation which is utterly inconsistent with the way the decisions were made (as is entirely on record), but this makes me feel sorry for you. Presumably the huge disparity between your bullet rating and your turn-based rating has led you to believe that every chess.com player with a turn-based rating way higher than their quick ratings is dishonest. This is false. You might do better to realise that you have room for improvement in your turn-based chess (I don't blame my recent results on my opponents using computers).

In truth, in the Alliance-Shredder game several different players' choices ended up being chosen after discussions that were sometimes lengthy and well-contended. No individual player had a dominant role, and there were often disagreements decided based on a combination of positional ideas and analysis.

I presume that your pronouncement that white never had a serious attack had the purpose of demeaning white's play, you having forgotten that you had just claimed that white was a computer as well. Our initiative saved the game, showing that the computer's assessment was slightly in error (it turns out we went from a dangerous looking negative evaluation to a draw), but was clearly not enough to win the game, although different members of the team believed it was at various stages (entirely inconsistent with them using computer assistance).  Presumably you think you could have done better?

ozzie_c_cobblepot

Occam's razor suggests that the white side had at least one competent advanced player.

Elroch

That would not explain the agreement, since the credit for the moves that were played has to be shared among several people who were active in the discussions, as can be easily seen if you scan through the discussions of the key moves. None of these people were perfect in their suggestions, assessments or analysis, but in no case was everyone convinced of a move that was much less than best. The discussions worked as well as they could have to overcome the inconsistency of individual players.

Several discussions and votes were indecisive, and in others there was a sudden realisation near the end of the discussion that a different move was better. My conclusion (based on considerably more information than you have, unless you have looked through the discussion) is that the agreement is a result of several factors.

Firstly, a lot of the moves that the computer identified as being marginally the best were much more natural, and seemed clearly the best to strong human players - tempo moves and the most active looking moves comprise a large fraction of these. I infer from this that a very strong computer is able to reduce the disadvantage from playing a somewhat worse move to very little in a position that was never winning for either side. Of course a perfect computer would simply have a list of moves with identical evaluations of 0 in a position where neither side is winning.

Secondly, the quite numerous moves where there was an indecisive discussion and an indecisive vote mostly ended up with the vote happening to fall for the move that happened to be the computer choice (in some cases even a tied vote). Clearly this is a fluke. If four or five of these moves had fallen another way, the statistics would seem very different.

I'm sure there was a third reason as well, but can't quite recall what it was at the moment. :)

ozzie_c_cobblepot

That does not address my point.

Let me paint a picture. Let's say that me, Reb, and tonydal are on one side of a vote chess game. Let's further say that two of us are using help. I contend that the third would not figure out that the other two were using help. It's easy to fashion a positional argument in favor of some recommended move. Or to do blunderchecking before making suggestions. I think it's a classic [modified] Turing Test. How would you know if your opponent is using help?

You're right, I haven't looked through all the comments yet, I may though. I did play through the game this morning though, while looking at your comment responding to each "close" move in turn.

Elroch

It does address your point, since there is no one player who was the main proponent of the choice of move in more than a substantial minority of the moves. And even if there had been such an "advanced" player, they would have had to consciously make errors of assessment and argue for moves that turned out not to be best on some moves. To call this implausible is an understatement.

Unless of course you are claiming that the team was engaged in an elaborate and carefully staged conspiracy. Anyone who believes that probably believes no-one has ever set foot on the moon. [Hint: Capricorn One isn't a documentary]

Atos

A strong human player can often recognize the computer's choice as being the best move, once they are presented with it. (Even though they wouldn't naturally play it.) That seems consistent with people having differing opinions at the start, and changing their mind after the discussion.

In those cases where the opinions remained differing after the discussion, the computer choice was still the one that was voted out. That seems consistent with perhaps a half of the group using assistance.

ozzie_c_cobblepot

Hey, anything's possible. After all, it's just one game.

I still contend that you wouldn't be able to know if one/any of the contributors used assistance.

Hey, it could even be that they only used it _sometimes_.

chessroboto

Judge engine-assistance based on a single game? Might as well flip a coin!

Elroch

I agree with ozzie's first statement. Small samples of data give unreliable inferences.

On the second point, some of the players I know well enough to be absolutely sure about, but that's my personal judgement.

Of course, I can't logically eliminate the third possibility, but don't see any reason to believe that if most of the game was honest, some random part of it was not.

Atos
chessroboto wrote:

Judge engine-assistance based on a single game? Might as well flip a coin!


I don't think that we are "judging," just pointing out that, as evidence that a team of human players can have good results against an engine due to the ability to discuss moves between themselves, this example is highly dubious.

Kacparov

Funny thread & 17,500 points :D

Elroch

I agree with Atos. Using a popular threshold for statistical significance, a draw in one game is significant evidence only that our chance of drawing is at least about 1 in 20 (since if our chance of drawing were 1 in 20, drawing one game would be a p=0.05 event).
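
To make the arithmetic behind that threshold argument concrete, here is a minimal Python sketch. The function names are made up for illustration, and it assumes each game is an independent trial with a fixed draw chance, which is a simplification and not something established in this thread.

```python
def p_value_of_one_draw(assumed_draw_chance: float) -> float:
    """Chance of drawing a single game if the true per-game draw chance
    equals assumed_draw_chance (hypothetical helper, one-game sample)."""
    return assumed_draw_chance


def chance_of_at_least_one_draw(draw_chance: float, games: int) -> float:
    """Chance of one or more draws in `games` independent games."""
    return 1.0 - (1.0 - draw_chance) ** games


if __name__ == "__main__":
    threshold = 0.05  # the "popular" significance threshold mentioned above
    # With a 1-in-20 draw chance, a single draw sits exactly on the threshold.
    print(p_value_of_one_draw(1 / 20) <= threshold)
    # Even at that rate, a draw somewhere in 20 games is far from guaranteed.
    print(round(chance_of_at_least_one_draw(1 / 20, 20), 4))
```

Under these assumptions, one draw says very little: it only rules out draw chances below roughly 1 in 20 at the 5% level.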

bondocel
chessroboto wrote:

Judge engine-assistance based on a single game? Might as well flip a coin!


You might flip as many coins as you want, but always selecting the optimal move, when many other candidate moves are within centipawns, is not what humans usually do.

I once heard Svidler say on ICC that he doesn't care when an engine evaluates a move in the range -0.50 to +0.50. He added that this is just an arbitrary number, the way in which a machine with no chess knowledge understands that the position is equal.

Did you have Yelena Dembo in your team? That would explain some things :P

ozzie_c_cobblepot

It's a red herring to talk about a single-game fallacy. Hey, a game can have 100 moves! If one side shows repeatedly that it chooses the top engine choice from among several competing choices all very close in evaluation, it's suspicious.
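
One way to put a number on "suspicious" is a binomial tail: if a human picked the engine's top choice on each close move with some fixed chance, how likely is it to match at least k times out of n? The sketch below is only an illustration. The per-move match chance and the move counts are hypothetical inputs, not figures from the game under discussion, and treating moves as independent is an assumption.

```python
from math import comb


def chance_of_at_least_k_matches(k: int, n: int, match_chance: float) -> float:
    """Binomial tail: probability of k or more matches in n independent
    close moves, each matching the engine with probability match_chance."""
    return sum(
        comb(n, i) * match_chance**i * (1.0 - match_chance) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical inputs: 20 close moves, 50% chance per move of
    # picking the engine's first choice unaided.
    print(chance_of_at_least_k_matches(20, 20, 0.5))  # all 20 match
    print(chance_of_at_least_k_matches(15, 20, 0.5))  # 15 or more match
```

The answer depends heavily on the assumed per-move match chance, which is exactly what the thread disagrees about: natural tempo moves push it up, genuinely unclear positions push it down.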

A small sample size problem does occur in the following game though, where I demonstrated 100% matchup with Rybka first choice for the entire game, once we went out of book.

ozzie_c_cobblepot

wait, krill57 is on the team in question and has a turn-based rating here of 2810?

Inneresting.

orangehonda
Elroch wrote:

I agree with the first statement. Small samples of data give unreliable inferences.

Some of the players I know well enough to be absolutely sure about, but that's my personal judgement.

Of course, I can't logically eliminate the last possibility, but don't see any reason to believe that if most of the game was honest, some random part of it was not.


Except that by definition, if most of the game was honest, then the remainder is dishonest :P

Elroch
orangehonda wrote:

Except that by definition, if most of the game was honest, then the remainder is dishonest


Not so. If you have a population, the fact that most of the population has a property does not imply that not all the population has the property. For example, if you had 10 horses and identified that 8 of them were black, it would not imply that one of the others was not black.

Elroch
notlesu wrote:

Well, he lists his rating as 2810 and his residence as Russia---so it's not USCF!

I wonder what the odds are of someone having a turn based rating of 2810 and not using computer assistance---10%---5%----2%!!!


As a mathematician, I have a very low opinion of fabricated statistics, which wise people never use (especially with multiple self-congratulatory exclamation marks). And, very importantly, I would point out that a particular individual is either honest or not. Probabilities apply only to populations, not to a specific individual. Just face it, notlesu, there are players here who are very much stronger than you, regardless of the fact that there are others who cheat.

This forum topic has been locked