Databases: the transposition problem

chesster3145

I understand that it's the positions that matter, not the move order, but some are, frankly, ridiculous.

One example:

Entering the moves 1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nb8 on Chess Tempo.

Any thoughts on this?

wayne_thomas

I believe this is known as the Hyperaccelerated Breyer variation.

Magnus Carlsen played 1.e4 Nf6 2.e5 Ng8 against Laurent Fressinet fairly recently, and went on to win.

JSLigon

So after 4... Nb8 white can choose between Bb5 and Ng1. These don't look like good moves to me, but then I'm not a grandmaster. Bb5 is apparently the best move since it scores 75% in high level games, and then for very subtle reasons I don't understand, GMs know better than to capture the bishop, preferring 5. Bb5 Nc6.

To be serious: Yes, this is definitely a problem. It just seems like the database wasn't implemented correctly. I get why 5... Nc6 comes up since it leads back to the standard Ruy Lopez position, but 5. Bb5 is just a blunder and the resulting position shouldn't be in the database. I have no idea how that would have happened.

JSLigon

And chess.com's own Opening Explorer also features 5. Bb5, showing it as white's only move, played four times with white winning all of them. I don't know what's going on here but somebody, please fix this!

tmkroll

This is normal behavior for databases. They're looking at positions and not move orders in order to take account of transpositions. If you want to find out if a piece was en prise just look at the games themselves. Those move orders don't actually occur.
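To make the position-versus-move-order point concrete, here is a minimal Python sketch of how an explorer keyed on positions might work. It is not Chess Tempo's or chess.com's actual implementation. The helper names (`start_position`, `play`, `key`, `build_index`, `explore`), the coordinate move format and the two sample games are all made up for illustration, and the toy board does no legality checking.

```python
from collections import defaultdict

def start_position():
    """Standard starting position as a {square: piece} dict (uppercase = White)."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, f in enumerate("abcdefgh"):
        board[f + "1"] = back_rank[i]
        board[f + "2"] = "P"
        board[f + "7"] = "p"
        board[f + "8"] = back_rank[i].lower()
    return board

def play(board, move):
    """Apply a coordinate move such as 'e2e4'.
    Toy model: no legality checks, castling, en passant or promotion."""
    new = dict(board)
    new[move[2:4]] = new.pop(move[:2])
    return new

def key(board, white_to_move):
    """Hashable position key: piece placement plus side to move."""
    return (tuple(sorted(board.items())), white_to_move)

def build_index(games):
    """Map each position key to {move: set of game ids that played it there}."""
    index = defaultdict(lambda: defaultdict(set))
    for game_id, moves in games.items():
        board, white = start_position(), True
        for mv in moves:
            index[key(board, white)][mv].add(game_id)
            board, white = play(board, mv), not white
    return index

def explore(index, moves):
    """Replay the entered moves, then list what was played from the resulting position."""
    board, white = start_position(), True
    for mv in moves:
        board, white = play(board, mv), not white
    found = index.get(key(board, white), {})
    return {mv: sorted(ids) for mv, ids in found.items()}

# Two hypothetical games that reach the same position by different move orders.
games = {
    "A": ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5"],  # 1. e4 e5 2. Nf3 Nc6 3. Bb5
    "B": ["g1f3", "b8c6", "e2e4", "e7e5", "f1c4"],  # 1. Nf3 Nc6 2. e4 e5 3. Bc4
}
index = build_index(games)

print(explore(index, ["e2e4", "e7e5", "g1f3", "b8c6"]))
print(explore(index, ["g1f3", "e7e5", "e2e4", "b8c6"]))
```

The second call enters 1. Nf3 e5 2. e4 Nc6, a move order that neither sample game used, and still gets both continuations back, because only the final position is looked up. The game ids stored per move are what you would open to see the real move orders.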

JSLigon

I'll post the problematic move sequence below, for reference. There is no justification for the position after 5. Bb5 being included in the database. No game between competent players would reach that position. Once that position gets into the database, I can easily understand how 5... Nc6 shows up as by far the most common reply, since it leads to a standard position in the Ruy Lopez. But regardless of move sequence, there is no way two competent players would reach the position after 5. Bb5, which can only be reached if white makes an obvious blunder. Either there was a transcription error when entering games into the database, or some kind of flaw in the design of the database itself.

tmkroll

Well I don't have the opening explorer unlocked that far. Why don't you look at the games that reach that position if you're interested in diagnosing the problem?

JSLigon

Good suggestion! For some reason I didn't think to do that. Four games came up in the results:

Geller - Korobov 2003

Fedorchuk - Delorme 2004

Fedorchuk - Maiorov 2004

Swiercz - Yandemirov 2014

All four games feature white absurdly hanging the bishop on b5. Somebody must have entered in the game scores incorrectly.

 
 
 
Martin_Stahl
JSLigon wrote:

I'll post the problematic move sequence below, for reference. There is no justification for the position after 5. Bb5 being included in the database. No game between competent players would reach that position. Once that position gets into the database, I can easily understand how 5... Nc6 shows up as by far the most common reply, since it leads to a standard position in the Ruy Lopez. But regardless of move sequence, there is no way two competent players would reach the position after 5. Bb5, which can only be reached if white makes an obvious blunder. Either there was a transcription error when entering games into the database, or some kind of flaw in the design of the database itself.

 

In this particular line, it does look like the database may have some consistency issues.  Someone would need to look up the 4 games that have the Bb5 move to see if the games are actually incorrect or what.

 

https://www.chess.com/games/search?f=181341037

 

It is unfortunate, but databases with millions of games will have errors and the Explorer functions are just displaying the games as they are in the DB.

 

I would imagine at least 1% of the games in the database have some kind of move/notation error and often those come from the site that originally posted the scores. I run into similar issues adding games to my databases. Some I can fix easily, some I can't.
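For what it's worth, some notation errors can be caught mechanically on import by replaying each game and flagging the first move that makes no sense on the board. Below is a rough Python sketch of that idea, not what any particular database does. The function name `first_suspect_move`, the coordinate move format and the sample move lists are assumptions for illustration, and the checks are deliberately shallow: they ignore piece movement rules, en passant and promotion.

```python
def start_position():
    """Standard starting position as a {square: piece} dict (uppercase = White)."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, f in enumerate("abcdefgh"):
        board[f + "1"] = back_rank[i]
        board[f + "2"] = "P"
        board[f + "7"] = "p"
        board[f + "8"] = back_rank[i].lower()
    return board

def first_suspect_move(moves):
    """Replay coordinate moves ('e2e4') and return (ply, move, reason)
    for the first implausible one, or None if nothing is flagged."""
    board, white = start_position(), True
    for ply, mv in enumerate(moves, start=1):
        src, dst = mv[:2], mv[2:4]
        piece = board.get(src)
        if piece is None:
            return ply, mv, "no piece on source square"
        if piece.isupper() != white:
            return ply, mv, "piece belongs to the side not on move"
        target = board.get(dst)
        if target is not None and target.isupper() == white:
            return ply, mv, "captures own piece"
        board[dst] = board.pop(src)
        # Castling: a king going from the e-file to the g- or c-file drags its rook along.
        if piece in "Kk" and src[0] == "e" and dst[0] in "gc":
            rank = src[1]
            rook_from, rook_to = ("h", "f") if dst[0] == "g" else ("a", "d")
            if rook_from + rank in board:
                board[rook_to + rank] = board.pop(rook_from + rank)
        white = not white
    return None

# Hypothetical score with a transcription slip on White's 4th move (b4a4 instead of b5a4).
garbled = ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5", "a7a6", "b4a4"]
print(first_suspect_move(garbled))

# The line from this thread: every move is playable, so nothing is flagged.
hanging = ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5", "a7a6", "b5a4", "c6b8", "a4b5"]
print(first_suspect_move(hanging))
```

Note the limitation shown by the second example: a score that is wrong but still playable, like the bishop returning to b5 with the pawn on a6, passes this kind of check and has to be compared against another source by hand.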

chesster3145

I got bumped! Lol.

chesster3145

I wonder if there are any other inconsistencies.

macer75
chesster3145 wrote:

I got bumped! Lol.

Did the person who bumped you apologize?

chesster3145

Interestingly, in the Canal-Sokolsky with 3... Bd7, someone searched up 4. Ba6?? and 4. Bc4 Bh3?? on database.chessbase.com. Is there a Ruy Lopez expert on the loose?

JSLigon

Here's the search:

https://www.chess.com/games/search?f=68496317

 

Links to the games:

https://www.chess.com/games/view/13350089

https://www.chess.com/games/view/1266353

https://www.chess.com/games/view/4120720

https://www.chess.com/games/view/1203642

 

Read the comments on the second and third of the linked games. A few people had a look at these games and noticed the problem.

fieldsofforce

Martin_Stahl wrote:   I run into similar issues adding games to my databases. Some I can fix easily, some I can't.

The best check is the computer's automatic opening-tree check.

chesster3145

And 4. Bxd7+ Kxd7!?!? was played once. Maybe a Kholmov impersonator?

Martin_Stahl

Here is Fedorchuk vs Maiorov from a different DB:

 

xman720

My copy of stockfish relies too much on its opening book:

dpnorman

ChessBase solves this problem by telling you how many games arrive at a position via the move order you played, and also how many reach it by transposition.
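A rough Python sketch of that kind of split count is below. I don't know how ChessBase actually computes it, so treat this as an illustration only: the function name `arrival_counts`, the coordinate move format and the sample games are invented, and the toy board ignores legality, castling rights and en passant when comparing positions.

```python
def start_position():
    """Standard starting position as a {square: piece} dict (uppercase = White)."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, f in enumerate("abcdefgh"):
        board[f + "1"] = back_rank[i]
        board[f + "2"] = "P"
        board[f + "7"] = "p"
        board[f + "8"] = back_rank[i].lower()
    return board

def play(board, move):
    """Apply a coordinate move such as 'e2e4' (toy model, no legality checks)."""
    new = dict(board)
    new[move[2:4]] = new.pop(move[:2])
    return new

def key(board, white_to_move):
    """Hashable position key: piece placement plus side to move."""
    return (tuple(sorted(board.items())), white_to_move)

def arrival_counts(games, line):
    """Return (exact, transposed): games that reach the final position of `line`
    by exactly this move order, and games that reach it some other way."""
    board, white = start_position(), True
    for mv in line:
        board, white = play(board, mv), not white
    target = key(board, white)

    exact = transposed = 0
    for moves in games:
        if moves[:len(line)] == line:
            exact += 1
            continue
        board, white = start_position(), True
        for mv in moves:
            board, white = play(board, mv), not white
            if key(board, white) == target:
                transposed += 1
                break
    return exact, transposed

# Hypothetical mini-database.
games = [
    ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5"],  # same move order as the line
    ["g1f3", "b8c6", "e2e4", "e7e5", "f1c4"],  # transposes into it
    ["e2e4", "e7e5", "g1f3", "b8c6", "f1c4"],  # same move order as the line
    ["d2d4", "d7d5"],                          # never reaches it
]
line = ["e2e4", "e7e5", "g1f3", "b8c6"]
print(arrival_counts(games, line))  # (2, 1)
```

With a display like this, a move order that shows zero exact arrivals and only transpositions is an immediate hint that nobody actually played the moves you entered.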

ModestAndPolite

You can waste your whole life lamenting about things that are not as they "ought" to be.