It already exists:
Scid uses its own database format, which allows quick searches by position, material imbalance, pawn structure, etc.
Hi temp, thanks for the input. I only recently became aware of SCID. I haven't fully come to understand the power of SCID, but I think a SQL database should still offer a few advantages. For example, a query in SQL is fully customizable and should allow for very detailed analyses. I'm not sure that I can fully customize my queries in SCID?
SQL is a horribly inefficient way to store chess games. Sorry. I spent some time researching this recently; there are definitely multiple Ph.D.s available for people who want to improve the state of the art.
As I'm not looking for a Ph.D. topic, I kinda gave up on my project.
I wish you luck, but I'll be very, very surprised if SQL turns out to be your answer. SQL queries backed by some other (very custom) database engine, possibly. But as I said, Ph.D. material.
All IMHO, naturally.
You can pretty much ask SCID anything. The quality of the result depends on the quality of the PGN (which should ideally include complete information on the game).
I'm sure other databases also support fully customized queries, once the PGN has been converted to their format.
Can you give an example of a query you think SCID won't do?
I realize this is an old thread, but ...
The issue is not only functionality, but accessibility of a database from tools other than SCID. There is a huge number of tools that know how to work with a SQL database and can be used for querying, analyzing and updating a SQL database. SCID has its own scripting language with a much higher barrier to entry for someone who just wants to play with the data (without necessarily even looking into the contents of chess games).
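To make the accessibility point concrete, here is a minimal sketch using only Python's standard `sqlite3` module. The `games` table, its columns, and the sample rows are all assumptions made up for illustration; they are not the schema of the database discussed in this thread. It computes White's score per opening code without ever looking at the moves.

```python
import sqlite3

# Hypothetical schema: one row per game, built from standard PGN header tags.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE games (
        id     INTEGER PRIMARY KEY,
        white  TEXT,
        black  TEXT,
        result TEXT,   -- '1-0', '0-1' or '1/2-1/2'
        eco    TEXT,   -- opening code, e.g. 'B90'
        year   INTEGER
    )
    """
)

# Made-up sample rows, only for illustration.
sample = [
    ("Player A", "Player B", "1-0", "B90", 1995),
    ("Player C", "Player A", "1/2-1/2", "B90", 1996),
    ("Player B", "Player C", "0-1", "C65", 2001),
    ("Player A", "Player C", "1-0", "C65", 2003),
]
conn.executemany(
    "INSERT INTO games (white, black, result, eco, year) VALUES (?, ?, ?, ?, ?)",
    sample,
)


def score_by_opening(conn):
    """Return (eco, games, white_score_percent) per opening; a draw counts as half a point."""
    return conn.execute(
        """
        SELECT eco,
               COUNT(*) AS games,
               100.0 * SUM(CASE result
                               WHEN '1-0' THEN 1.0
                               WHEN '1/2-1/2' THEN 0.5
                               ELSE 0.0
                           END) / COUNT(*) AS white_score
        FROM games
        GROUP BY eco
        ORDER BY eco
        """
    ).fetchall()


rows = score_by_opening(conn)
for eco, games, white_score in rows:
    print(eco, games, white_score)
```

The same query text would work from any SQL client or reporting tool pointed at the file, which is the kind of access being argued for here. Position or pawn-structure searches are a different matter and are not covered by a header-only table like this one.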
When I started this thread, I wasn't aware that SCID existed. Unfortunately, other things have come up and I haven't done much research into SCID's capabilities.
Some examples might be:
Just a few ideas off the top of my head that could be interesting to investigate but would require a fairly detailed database / advanced querying engine.
I'm very perplexed by the chess community's insistence on using .pgn files to store game information. These files are not database friendly and cannot be easily searched or analyzed.
I propose that we, as a community of chess players, build a free, open, and online chess database. We can warehouse all of the important historical chess games in our database. We should use a format that's both ubiquitous and easy to use (I suggest SQL).
This database would both serve as an historical account of the chess world and as a tool for research. First, as an historical account of the chess world, this database would catalog information regarding the masters of the game and the games that they played. Second, as a tool for research, the database should be easily "analyzable". A SQL database allows for far more detailed queries than a .pgn database, and as such, allows for much more detailed analyses.
To this end, I've created a SQL database of 1.74 million chess games. The database is in its raw form now, and needs to be organized. I can't take credit for the data within the database, as I started with 'Rebel's' and Ed Schroeder's .pgn database: http://www.top-5000.nl/pgn.htm
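For anyone curious what a PGN-to-SQL conversion involves, here is a rough sketch in Python using only the standard library. It is not the procedure used to build the database above; the `games` table, its column names, and the sample games are assumptions for illustration. It reads the header tag pairs (lines like `[White "..."]`) and keeps the movetext as a plain string. It ignores escaped quotes inside tag values and any comment or variation line that happens to look like a tag.

```python
import re
import sqlite3

# A PGN tag pair looks like: [White "Player A"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def parse_pgn_games(text):
    """Yield one dict per game with its header tags and raw movetext."""
    tags, moves = {}, []
    for line in text.splitlines():
        line = line.strip()
        match = TAG_RE.match(line)
        if match:
            # A tag line after movetext means a new game has started.
            if moves:
                yield {"tags": tags, "moves": " ".join(moves)}
                tags, moves = {}, []
            tags[match.group(1)] = match.group(2)
        elif line:
            moves.append(line)
    if tags or moves:
        yield {"tags": tags, "moves": " ".join(moves)}


def load_games(conn, pgn_text):
    """Insert parsed games into a hypothetical `games` table; return the number inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games ("
        "id INTEGER PRIMARY KEY, white TEXT, black TEXT, "
        "result TEXT, eco TEXT, date TEXT, moves TEXT)"
    )
    count = 0
    for game in parse_pgn_games(pgn_text):
        t = game["tags"]
        conn.execute(
            "INSERT INTO games (white, black, result, eco, date, moves) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (t.get("White"), t.get("Black"), t.get("Result"),
             t.get("ECO"), t.get("Date"), game["moves"]),
        )
        count += 1
    conn.commit()
    return count


# Made-up sample games, only for illustration.
SAMPLE_PGN = """\
[Event "Example"]
[White "Player A"]
[Black "Player B"]
[Result "1-0"]

1. e4 e5 2. Nf3 Nc6 1-0

[Event "Example"]
[White "Player C"]
[Black "Player A"]
[Result "1/2-1/2"]

1. d4 d5 1/2-1/2
"""

conn = sqlite3.connect(":memory:")
loaded = load_games(conn, SAMPLE_PGN)
```

Missing tags simply become NULL here, which is one reason a raw import like this still needs cleaning afterwards.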
Finally, I've created a blog to track the progress and act as a landing page for the community. From this blog, you can download my SQL database. http://chessdata.wordpress.com/
I need your help! Right now, I'm at the very early stages of building this and need brainstorming ideas. Let's get a firm grasp of what we want and how we're going to do it. I've laid out my plan on my blog; feel free to comment. Next, I need help gathering and cleaning data. We have 1.74 million games already, but the data isn't pretty by any means. I also need help designing a database (I need people with database experience and "chess people" who know what kind of information we should be storing and where we can get it). Finally, I'd like to put this somewhere so everyone can access it (anyone know how to build a webpage?).
People of the chess world, UNITE! :)
This is a cross post (by me) on reddit.com/r/chess
http://www.reddit.com/r/chess/comments/1559f8/lets_create_a_free_open_chess_database/