Databases - what is allowed?

costelus

I have a big question: it is known that the usage of databases is allowed. But, any database? If I put Fritz to battle Rybka on two computers, starting from the position after 10 moves in major openings, and make a database consisting of these games, would it be allowed? Or, if the database consists of the games played by the recent correspondence players (who are all cyborgs), is that allowed? I think that the two examples above would depart quite a lot from the usual term "database", which I understand as "collection of games played between two humans".

I think a common standard would be best for everyone: the only database allowed should be the chess.com database. Now there is an easy way to search it.

costelus

I think the question is very tricky: if two players follow a game from a database played between two cyborgs, then it will appear as cheating, even if they DO NOT use an engine to get evaluations of the position. 

DeepGreene

Many databases already include games by engines - Deep Blue, at the very least, in many cases. 

I'm not sure, but you seem to imply that consulting a database of positions in computer vs. computer games is akin to consulting an engine, and because engine-use equals cheating, using such a database would also therefore be cheating.  Or close to it.

The problem with this is that, for instance, you're also not allowed to get a grandmaster's opinion on what you should do in a given game either, yet you are entirely within your rights to check a database to see what dozens of GMs may have done in your position.

I think an important point to remember is that, for a database to be of interest, it must actually contain information on the position in your actual game - so hypothetically, the only way your Fritz-Rybka DB is going to shed light is if you and your human opponent never deviate from the choices made by the engines.  This will only work up to a point (if at all). 

And of course it's 'buyer beware' for the person who believes that slavishly following the database percentages will yield the better result once they run out of moves to copy.
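The point about a database only helping while the game stays inside it can be sketched in a few lines of Python. This is a minimal illustration, not how any real chess database works: `build_book` and `lookup` are hypothetical helper names, the three games are made up, and positions are keyed by move sequence, so transpositions are ignored.

```python
from collections import defaultdict


def build_book(games):
    """Map each move-sequence prefix to a count of the moves that followed it."""
    book = defaultdict(lambda: defaultdict(int))
    for moves in games:
        for i, move in enumerate(moves):
            book[tuple(moves[:i])][move] += 1
    return book


def lookup(book, played):
    """Return recorded continuations for the moves played so far ({} if out of book)."""
    return dict(book.get(tuple(played), {}))


# Made-up sample games (illustrative only).
games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4", "Bc5"],
    ["e4", "c5", "Nf3", "d6"],
]
book = build_book(games)

in_book = lookup(book, ["e4", "e5", "Nf3", "Nc6"])      # still following stored games
out_of_book = lookup(book, ["e4", "e5", "Nf3", "Nf6"])  # one deviation, nothing stored
print(in_book)
print(out_of_book)
```

As soon as either player makes a move that no stored game contains, the lookup comes back empty and the database has nothing more to say about that game.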

Eastendboy

I agree that it's a tricky question.  I know of at least one top player here who readily admits to using opening books and databases built from Playchess engine room games and ICCF games from the last year or two.  He asserts that it's the only "legal" way to avoid getting destroyed by engine users.  He has a point.

costelus

First of all I do not think that this is a discussion about cheating. I do think it is a very interesting question which might concern many people here. I just saw a modern correspondence chess game: cyborg vs cyborg :) Uhh...

Here is a question: If X and Y play a game here, on chess.com and they both follow completely a game played by two centaurs, I will certainly think that they cheat. Even if they DO NOT get computer evaluations of the position.

I also do not think that it is impossible to build a database consisting of, let's say, 20 million games (Mega, I think, has 3 million). With such a database, it is likely that you would encounter (and know the perfect moves for) many positions, very deep into the middlegame. Such a database would offer you a significant advantage over a player who uses a book or the Game Explorer here.

Eastendboy

The reason this strategy of using engine database games against engine users is effective is the largely deterministic nature of engines in comparison to human players.  This kind of db would be mostly worthless against human players because they'll deviate early and the game will take its own course.  But using a db composed of engine games would be extremely effective against an engine user who was blindly following the engine advice.  Brilliant, actually.

I can see how this might greatly complicate the search for cheaters.  I suppose one thing to look for would be how they play against users who are obviously not engine users.  Statistical anomalies might also show up e.g. average rating of opponents for games that end in draws is a lot HIGHER than the average rating of opponents in games that were lost etc.

I don't think it would take anywhere near 20 million games in order to be able to extensively cover the major opening systems likely to be encountered when playing engine users: Closed Ruy/Spanish, Sicilians and various QGD/Slav openings are where most engine games are played.
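The draw-versus-loss comparison suggested above could be computed roughly like this. It is only a sketch: the function name and the `(result, opponent_rating)` record layout are assumptions, the sample data is invented, and a large gap would be a reason to look closer, not proof of anything.

```python
from statistics import mean


def avg_opponent_rating_by_result(games):
    """Average opponent rating for each result, from (result, opponent_rating) pairs."""
    by_result = {}
    for result, opponent_rating in games:
        by_result.setdefault(result, []).append(opponent_rating)
    return {result: mean(ratings) for result, ratings in by_result.items()}


# Invented game records for one player (illustrative only).
games = [
    ("win", 1500), ("win", 1700),
    ("draw", 2400), ("draw", 2500),
    ("loss", 1900), ("loss", 2100),
]

averages = avg_opponent_rating_by_result(games)
# Positive gap: opponents in drawn games were stronger on average than in lost games.
gap = averages["draw"] - averages["loss"]
print(averages, gap)
```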

Saccadic
DeepGreene wrote:

Many databases already include games by engines - Deep Blue, at the very least, in many cases.


That's right. I've come across many engine games in Chess.com's opening database.

atomichicken
costelus wrote:

First of all I do not think that this is a discussion about cheating. I do think it is a very interesting question which might concern many people here. I just saw a modern correspondence chess game: cyborg vs cyborg :) Uhh...

Here is a question: If X and Y play a game here, on chess.com and they both follow completely a game played by two centaurs, I will certainly think that they cheat. Even if they DO NOT get computer evaluations of the position.

I also do not think that it is impossible to build a database consisting of, let's say, 20 million games (Mega, I think, has 3 million). With such a database, it is likely that you would encounter (and know the perfect moves for) many positions, very deep into the middlegame. Such a database would offer you a significant advantage over a player who uses a book or the Game Explorer here.


How is this not a discussion about cheating?

costelus
Manchero wrote:

With regards to computer use and subsequent database searching - no problem. It is 100% permitted, and encouraged, to use engines in both preparation and analysis of your games. Running alternative lines through your engine and storing the games in your database is wise and will save analysis time in the future.

What if I take a position after the first 15-20 moves in a very well-known opening (let's say the Marshall Attack, where the theory goes even in OTB games usually up to move 15-20), and I put my engine to analyze it in multiline mode (let's say 10 lines) for the next 10 moves? I store the analysis in a database, and then I come and play.

I think that in this case the point that "your opponent will likely deviate from the line you analyzed" might not hold: there are not too many options to deviate in the Marshall, and, if I analyzed each of the best possible 10 moves for all the moves from 18 to 28, it is extremely likely that, if my opponent deviates, then he made a serious mistake. In which case I am able to win the game myself, without any computer.

I think that this topic is far from the usual discussion about cheating you can find in the dedicated topic.


costelus

That could also be an explanation why most of the top players here deviate very early from the known theoretical lines. I was surprised to see such early deviations in correspondence chess, much earlier than in the OTB games of a low rated player like myself. But now it makes sense: they might be following the games played by cyborgs.

DenverChess

Actually, technically speaking, a database is a collection of data composed of tables. These tables are in turn composed of rows and columns. Every move we make here in every game is a new row of data that adheres to the column guidelines in the table. In fact, this comment I'm posting right now is a row of data in a table most likely named "Databases - what is allowed?" with columns like username, date, time, message, etc. SOME of the databases here on chess.com are collections of games, yes. Let's not forget about the ones that hold the data about chess.com's users and their posts etc....

If someone here is using a database in play or for assistance in games in any way shape or form, what they are relying on is the accuracy of the database instead of the logic they possess. Computers have NO logic, databases included. Using databases IS cheating and I know of a way to easily defeat them.

Thanks for letting me post.

-DenverChess
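The rows-and-columns picture of a database described above can be tried out with Python's built-in sqlite3 module. This is a sketch only: the `moves` table and its columns are made up for illustration and say nothing about how chess.com actually stores games.

```python
import sqlite3

# In-memory database with one hypothetical table: one row per move played.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE moves (
        game_id INTEGER,
        ply     INTEGER,
        san     TEXT,
        PRIMARY KEY (game_id, ply)
    )
    """
)

# Two invented games, three plies each.
rows = [
    (1, 1, "e4"), (1, 2, "e5"), (1, 3, "Nf3"),
    (2, 1, "e4"), (2, 2, "c5"), (2, 3, "Nf3"),
]
conn.executemany("INSERT INTO moves VALUES (?, ?, ?)", rows)

# How often was each reply played on ply 2? (Both sample games open 1.e4.)
replies = conn.execute(
    "SELECT san, COUNT(*) FROM moves WHERE ply = 2 GROUP BY san ORDER BY san"
).fetchall()
print(replies)
```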

costelus

Yes, computers have no logic, only a huge calculation power. Unfortunately, this calculation power is enough to destroy easily any human player (in the last games, Rybka played with a 2600+ GM giving him knight odds and, of course, winning). 

I cannot imagine what a database of 20 million perfect games played by ICCF cyborgs would look like. I guess I would have no chance against a player using it and he/she would win the game only by looking up the moves. Until the point at which I make a mistake.

There is no easy way to avoid databases. I play correspondence games here only with the hope that it will improve my OTB play. I don't want to play crazy openings like hypo or 1.f3 2. Kf2.

OK, so I guess the solution is not to play correspondence at all; it seems that this type of chess has been completely ruined by computers.

Twarter369

I love my DB's! My main has almost 4 million games in it (ok, they are not all complete games; some are training lines or trap lines etc.), which I am constantly adding to (thanks to the great people here at chess.com, you guys are the greatest). I have run engine tournaments between my two main engines and saved every game. The thing is that, as DenverChess stated above, engines don't think, they analyze. The problem there is that you don't get to see the trends in play. There are high stress moments in games where a lot of people lose focus and make a bad move; having too many "perfect" or EvE games in your main database will hide these trends and work against you.

EDIT: I forgot a point I wanted to make. Someone mentioned that studying DB's is only one step away from using an engine DURING a game. That's really not true. Before DB's the greats kept notebooks of handwritten games and studied every possible variation they thought held merit, which compelled them to find an easier way to do that. I think the people that DON'T keep DB's are missing out on a great training tool. If you can't keep from cheating (i.e. using the engine part during a game) then you are missing the point of the game anyway.

costelus

Ok, here is a question I think that Silman asked one of his students who was proud of his database of 4 million games: "What percentage of them did you study?"

A clear distinction must be made between small databases and books and huge bases containing millions of games. The comparison between such a large database and handwritten notes is simply impossible.

I think that, if you want to push the limits of a database, you could come extremely close to using computers. Let's say I want to participate in a thematic tournament. I know the opening, I download millions of games in that opening played by cyborgs, and I also run my own multiline analysis for all the reasonable moves. I save all the results in a database and I use it. In this way, I am sure that the only moves for which I don't have a calculated line in my database are mistakes (and then I just win the game myself).
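For a rough sense of how big such a prepared tree gets, here is a back-of-the-envelope count in Python. It assumes the same number of candidate moves at every branching point, which real analysis would not have; both function names are made up for this sketch.

```python
def count_lines(candidates_per_ply, plies):
    """Number of distinct lines if every branching point has the same number of candidates."""
    return candidates_per_ply ** plies


def count_stored_positions(candidates_per_ply, plies):
    """Positions to store: one per node at every depth from 1 to `plies`."""
    return sum(candidates_per_ply ** depth for depth in range(1, plies + 1))


# Hypothetical reading of the scheme above: my own move is fixed in each position,
# and I prepare for 10 candidate replies by the opponent on each of 10 moves.
print(count_lines(10, 10))

# A much smaller tree for comparison: 3 candidates over 4 branching plies.
print(count_stored_positions(3, 4))
```

The count grows exponentially with depth, so in practice the number of candidates kept per position has to shrink quickly as the analysis goes deeper.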

Eastendboy

It seems like you are dreaming up new things to worry about. 

A database filled with engine games is nothing for you to worry about because you'll make a move that's not in the database relatively quickly.  It's simply not possible to have a db made up of engine games that contains moves that "humans are likely to make".  They don't make moves like that.  They make engine moves.  People make human moves and as soon as you make a move that an engine didn't play in one of those games, guess what?  The database game is no longer a valid reference point.  With the exception of small discrepancies in the way that multi-core engines work, they are deterministic, which means that they can only play the best move they find.  You know as well as anyone how difficult it is for a human to match an engine's best move consistently.  The *only* use for the kind of db that you describe is against other users with similar databases or engine users.  Think of it as an anti-engine weapon -- a weapon designed to level the playing field if and only if the person using the db encounters an engine user.

Eastendboy

One other thing - when you see top players deviating from theory I'd be willing to bet that it's not because they're using a db filled with engine games -- it's because they trust the engine more than the theory!  It's a fatal mistake that many unskilled cheaters make.  It's also the kind of mistake that the best players here capitalize on ruthlessly.  Maybe they're cheaters too, I don't know, but clearly there are some that are better than others if that's the case.  There are a couple of top players here that regularly mow down other top players in brutal fashion with up-to-the-minute opening theory....

costelus

Well, there are not only games played between engines. There are millions of games played between cyborgs on ICCF. And cyborgs can customize their engines, exploring not only the best move, but other moves as well. Thus, what might look to me like engine-assisted play may in fact be legitimate play using such a gigantic database.

That's why I think it would be a great idea if the game explorer here would be the only allowed database. Whoever uses anything else, does this at his own risk.

Twarter369

Actually it is very easy to search for pertinent information amongst the ~4 million lines and games, because most of the games won't apply at any given time. You don't need to read every annotation from every game that is in there either, only the ones you find interesting. So to answer your rhetorical question, his student should have answered: why study one game at a time when I can study them all simultaneously and get a truer perception of the outcome?

DenverChess
Eastendboy wrote:

A database filled with engine games is nothing for you to worry about because you'll make a move that's not in the database relatively quickly.  It's simply not possible to have a db made up of engine games that contains moves that "humans are likely to make".  They don't make moves like that.  They make engine moves.  People make human moves and as soon as you make a move that an engine didn't play in one of those games, guess what?  The database game is no longer a valid reference point.


Exactly. I couldn't have put it into words any better.

The ability to compute massive amounts of information quickly without being able to apply logic will almost always lose to the entity that can apply logic and calculate more slowly, except when that entity produces errors. Therefore, the person HAS the ability to destroy the computer almost always and not the other way around.

Modern databases, relational or not, are capable of hosting BILLIONS of rows of information, not just millions. Also, if you know what you're doing with them you can construct a usable database with billions of rows of information on chess games and variations from multiple sources in a matter of a day or two. This includes games played by novice players, grandmasters, and engines/computers. To take it a step further you could construct advanced engines that can actually identify the kind of opponent they are playing, be it a human or computer, and respond with moves accordingly. I am willing to bet that such an idea has yet to be perfected but attempts have been made and that in the near future such an idea will be in implementation ..... which could mean horrible news for most chess players and will present a new challenge to the best players of the game.

After all of this evolves an unbeatable chess entity will emerge. The technology of today and especially of the future will be easily able to manage EVERY move and will be able to execute known tactics in given situations. People and engines alike will only be able to tie such an invention when they play games perfectly!

ozzie_c_cobblepot

Regarding GM Hort's statement, is this not what people said about calculators? Is this perhaps an aversion to technology?

Admittedly, this is exactly how I feel about the GPS. I enjoy remembering how to get somewhere; I may stop to ask at a gas station, I may look at a map. But having the GPS would just turn my brain to slush.