High quality chess databases?

fncll

There are scores of projects around the web-- most dead-- that seem to want to create the largest possible database of games. Same with the latest mega- and huge- and ultra-super-duper-jumbo databases that various entities sell (of which I own a few, and all are littered with errors). 

Are there any HIGH QUALITY databases? By which I mean databases that are not only fully de-duped, but also have normalized player names, ratings for all players, only complete tournaments, etc.?

I'm willing to pay for such a thing if I could do so and avoid laboriously creating one myself!
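
For anyone who does end up rolling their own, here is a rough sketch of the de-duping step in Python. It assumes the games are already split into individual PGN strings, and it treats two games as duplicates when their movetext matches after stripping headers, comments, and move numbers. The function names are made up for illustration, and a key built from moves alone will also merge genuinely different games that happen to share every move (short draws, for instance), so a real cleanup would probably add the date or player names to the key.

```python
import re

# A PGN header line such as [White "Tal, Mihail"].
TAG_LINE = re.compile(r'^\[.*\][ \t]*$', re.MULTILINE)

def game_key(game):
    """Reduce one PGN game to a whitespace-normalized string of its moves."""
    moves = TAG_LINE.sub("", game)                # drop header lines
    moves = re.sub(r'\{[^}]*\}', " ", moves)      # drop {brace comments}
    moves = re.sub(r'\d+\.(\.\.)?', " ", moves)   # drop move numbers like "12." or "12..."
    return " ".join(moves.split())

def dedupe(games):
    """Keep the first game seen for each distinct movetext key."""
    seen = set()
    unique = []
    for game in games:
        key = game_key(game)
        if key not in seen:
            seen.add(key)
            unique.append(game)
    return unique
```

Keeping the first copy is an arbitrary choice; one could just as well keep whichever copy has the most header fields filled in.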

jtt96

I'll track this thread so I can learn of a good database.

Cystem_Phailure

The massive databases can be good and bad, that's for sure. It's irritating to finally find a game that's actually got the position you're looking for, only to learn that it was Joe Schmuck (1109) v. Jean-Pierre d'Tard (1083) (1/2-1/2). I guess one can use filters, but still, some initial quality control on what's included in the database from the start would be nice.
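
As a rough idea of what such a filter could look like outside a database program, here is a minimal Python sketch that works on plain PGN text. It assumes every game starts with an `[Event ...]` tag and carries the usual `WhiteElo` and `BlackElo` tags; the 2200 cutoff and the function names are arbitrary choices for illustration. The `keep_unrated` switch is there because a strict rating filter would also throw out every game from before ratings existed.

```python
import re

# Captures (tag name, value) from header lines such as [WhiteElo "2600"].
TAG_RE = re.compile(r'^\[(\w+)\s+"([^"]*)"\]', re.MULTILINE)

def split_games(pgn_text):
    """Split PGN text into games, assuming each starts with an [Event ...] tag."""
    starts = [m.start() for m in re.finditer(r'^\[Event\s', pgn_text, re.MULTILINE)]
    ends = starts[1:] + [len(pgn_text)]
    return [pgn_text[s:e] for s, e in zip(starts, ends)]

def elo(headers, key):
    """Return the rating as an int, or None if the tag is missing or not a number."""
    try:
        return int(headers.get(key, ""))
    except ValueError:
        return None

def keep_game(game, min_elo=2200, keep_unrated=False):
    """True if both players meet min_elo; unrated games follow keep_unrated."""
    headers = dict(TAG_RE.findall(game))
    white, black = elo(headers, "WhiteElo"), elo(headers, "BlackElo")
    if white is None or black is None:
        return keep_unrated
    return min(white, black) >= min_elo
```

Usage would be something like `strong = [g for g in split_games(text) if keep_game(g)]`.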

WilsonRS

I use OpeningMaster to get my big database and update it weekly with TWICs in Chessbase Light 2009. It's a membership thing for access to download the databases, but it has the largest databases on the web, with 6 million games for the big one, or you can have specialized databases as well, like top 100 FIDE, 2500+, correspondence, and some online games.

ChessMarkstheSpot

I have almost a 4 million game database that I have collected through many sites on the net, from all of the world championships, 60+ complete career PGNs of the greatest chess players ever, and a whole bunch of other stuff. I may not look at all of that in one lifetime but you just never know.

   -Mark

rigamagician

Chessbase's Mega Database 2011 and Big Database 2011 have standardized player names and no duplicates.  ChessOK's Hugebase is also probably pretty good although I've used it less.  Opening Master hadn't standardized all their player names the last time I saw it, but it covers many of the same games as Chessbase and ChessOK.

blake78613

I started to seriously question the value of databases when I found one of my CC games on a collection of Alapin French games.  I shudder to think that someone would play one of my moves based on a database search.

I also found a game from a Virgin Islands under 10 years old championship.  White played the same move I did making it the most popular move.

Estragon

Chessbase keeps their database well trimmed, but there is no such thing as perfect because sometimes games are incorrectly reported in the first place.  Garbage in, garbage out, as they say.

But when you are dealing with millions of games, how important is it that they've all been checked?

rooperi

The only thing with excluding unrated games is that you exclude almost everybody up to and including Alekhine.....

I have about 1.19 million games, my sources are the bases at pgnmentor, and updated weekly with twics.

I have hardly any really rubbish games.

Cystem_Phailure

Yeah, I go to pgnmentor when I want a chunk of games for a particular opening.  Most recently I grabbed their Benoni games when this month's Benoni tourney started here on chess.com.  And I try to remember to get twics updates, but I haven't managed to make it a routine yet.

WilsonRS

I agree OpeningMaster can be a little messy and I do find Chessbase's databases very nice but they are also very expensive and I'm on a tight budget as it is. I absolutely love the TWIC's, making it easy to get up to date on recent games. Is the TWIC free on Chessbase Light 2009? I don't know because I have a premium key and never noticed the option before activating my copy.

Hypocrism
wilsonyiuwahwong wrote:

I agree OpeningMaster can be a little messy and I do find Chessbase's databases very nice but they are also very expensive and I'm on a tight budget as it is. I absolutely love the TWIC's, making it easy to get up to date on recent games. Is the TWIC free on Chessbase Light 2009? I don't know because I have a premium key and never noticed the option before activating my copy.


TWIC records are free to download on the TWIC site: http://www.chess.co.uk/twic/twic

The function in Chessbase downloads the most recent issues of TWIC and you can copy these into your DB.

BorgQueen

I truly wish there were a free huge, high quality database available!

rooperi
BorgQueen wrote:

I truly wish there were a free huge, high quality database available!


Well, what do you mean by huge, and what do you mean by high quality?

How many "High quality" games have ever been played? I doubt it's more than a few million......

rigamagician

By way of illustration, a high quality game from Chessbase.  Note how all the various fields are filled out, a sure sign of the time and care that went into producing this.

rigamagician

A low quality game from the University of Pittsburgh chess club archives. Note that in 1949, Gulko was 2 years old, and Tal was 13. That may help to explain the low quality of this game.

jtt96

Maybe Tal was still working on his combinatorial technique.

His opponents could have used some chess engines.

rigamagician

Yeah, Gulko played pretty well for a two year old, so perhaps the date given in the UPitt archive is wrong.

NickYoung5

Are there any programmatic interfaces to e.g., Chessbase? If one were so inclined, would it be possible to write a SQL (or similar query) and update, say, all instances of Korchnoj, Viktor to Korchnoi, Victor?

rooperi
NickYoung5 wrote:

Are there any programmatic interfaces to e.g., Chessbase? If one were so inclined, would it be possible to write a SQL (or similar query) and update, say, all instances of Korchnoj, Viktor to Korchnoi, Victor?


I know SCID (free, remember? :)) has a name checker.

It works, I remember changing a lot of Nakamura, H and Nakamura Hi to Nakamura, Hikaru.
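
If the database can be exported to PGN, the same kind of bulk rename can also be sketched in a few lines of Python. This is not how SCID or Chessbase do it internally, just a plain text rewrite of the `White` and `Black` header tags; the alias table and the function name are made up for illustration, and the table would have to be built by hand.

```python
import re

# Hypothetical alias table: variant spelling -> preferred spelling.
ALIASES = {
    "Korchnoj, Viktor": "Korchnoi, Victor",
    "Nakamura, H": "Nakamura, Hikaru",
    "Nakamura Hi": "Nakamura, Hikaru",
}

# Matches a player tag such as [White "Nakamura, H"] on its own line.
PLAYER_TAG = re.compile(r'^\[(White|Black)\s+"([^"]*)"\][ \t]*$', re.MULTILINE)

def normalize_names(pgn_text, aliases=ALIASES):
    """Rewrite White/Black tag values that appear in the alias table."""
    def fix(match):
        tag, name = match.group(1), match.group(2)
        # Names not in the table are written back unchanged.
        return '[%s "%s"]' % (tag, aliases.get(name, name))
    return PLAYER_TAG.sub(fix, pgn_text)
```

Exact-match lookup is deliberately conservative: it will not guess that two different spellings are the same person, which avoids merging, say, two different players named Nakamura.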