Scripts to retrieve PGNs

chesslover0003

I recently found https://www.chess.com/forum/view/general/python-script-to-download-entire-game-archive-for-a-specific-user-convert-it-into-a-csv but it doesn't appear to work for me.  I'm unsure if this is because of changes to API endpoints.  I'm going to keep troubleshooting.

In the meantime, do others have suggestions for scripts to retrieve PGNs from the API for a given username?  Ideally, I would like to parse the games and store them in CSV format.  Perhaps I'll just do it in two steps: 1) download the games in PGN format, 2) parse them into CSV format (there is a script for this that I can use as a model).

Suggestions for either?

Martin_Stahl

https://www.chess.com/announcements/view/published-data-api#pubapi-endpoint-games

Those are the endpoints that serve member-based PGN files. As for the script above, if it doesn't send the request headers that are now required (see the link below), it is likely being blocked.

https://www.chess.com/announcements/view/breaking-change-user-agent-contact-info-required
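
For anyone hitting the same block, here is a minimal sketch of a request that sends a User-Agent with contact details, as the announcement above asks. It uses only the standard library so it runs without extra installs. The helper names, the placeholder contact string, and the monthly `/pgn` URL pattern are assumptions to check against the endpoint documentation linked above.

```python
import urllib.request

# Placeholder contact details: replace with your own username and email.
USER_AGENT = "username: myaccount, email: me@example.com"


def monthly_pgn_url(username: str, year: int, month: int) -> str:
    """Build the URL for one month of a player's games in PGN format."""
    return f"https://api.chess.com/pub/player/{username}/games/{year}/{month:02d}/pgn"


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Attach the User-Agent header so the request identifies its sender."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_text(url: str) -> str:
    """Download the response body as text (this performs a network call)."""
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_text(monthly_pgn_url("chesslover0003", 2023, 10))` should return that month's games as PGN text, provided the endpoint behaves as documented.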

chesslover0003

Oh yes, I have the endpoints. And it's the recent API changes that appear to have affected existing scripts. Baby steps. Thanks.

GM_Salzi

I wrote a Python script to download all my games from the API, and another one to compute special statistics that are not available on chess.com (for example, a rating history graph plotted per game rather than over time).

You will need two files:

config.json:

{"lastupdate": 1697631206.594295}

The lastupdate value is the timestamp of when the Python script was last executed. For first use, set it to a time before your account was created. The simplest option is 0.

Next you need a data.json file. To begin with, this is just an empty JSON file like this: {}

It will be filled with data by the Python script.

To use the Python script, you will have to change the data in the request headers and set the username to the account you want the games for. I think it should be self-explanatory. If not, feel free to ask me. Let me also know if it helped you, or if there are any improvements to be made.

After executing the script you should have all your games in the data.json file. If you play more games and want to refresh data.json, run the script again. It will only check the archives that could have been modified since the last update.

import json
import requests
from datetime import datetime

# Load the time of the previous run and the games saved so far.
with open("./config.json", "r") as file:
    config = json.load(file)

with open("./data.json", "r") as file:
    filedata = json.load(file)

username = "GM_Salzi"

lastupdate = datetime.utcfromtimestamp(config["lastupdate"])

url = f"https://api.chess.com/pub/player/{username}/games/archives"

response = requests.get(url, headers = {'User-Agent': 'username: myaccount, email: my@emaill'})

data = response.json()

# Each archive URL ends in .../YYYY/MM.
for archive in data["archives"]:
    pieces = archive.split("/")
    year = int(pieces[-2])
    month = int(pieces[-1])

    # Only re-download months that could have changed since the last run.
    if year > lastupdate.year or (year == lastupdate.year and month >= lastupdate.month):
        date = f"{year}/{str(month).zfill(2)}"
        print("update", date)

        url = f"https://api.chess.com/pub/player/{username}/games/{date}"
        response = requests.get(url, headers = {'User-Agent': 'username: myaccount, email: my@emaill'})

        filedata[f"{year}/{month}"] = response.json()

# Remember when this run happened, then save everything.
config["lastupdate"] = datetime.now().timestamp()

with open("./config.json", "w") as file:
    json.dump(config, file)

with open("./data.json", "w") as file:
    json.dump(filedata, file)
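
The "only check archives that could have changed" step can also be pulled out into a small function, which makes it easy to try without touching the API. This is a sketch rather than part of the script above: the function name `archives_to_refresh` is made up here, and it assumes archive URLs end in `/YYYY/MM` as the script's parsing implies.

```python
from datetime import datetime, timezone


def archives_to_refresh(archive_urls, last_update_ts):
    """Return the archive URLs whose month is the last-update month or later."""
    last = datetime.fromtimestamp(last_update_ts, tz=timezone.utc)
    selected = []
    for url in archive_urls:
        # The last two path segments are the year and the month.
        year_text, month_text = url.rstrip("/").split("/")[-2:]
        # Tuple comparison orders by year first, then by month.
        if (int(year_text), int(month_text)) >= (last.year, last.month):
            selected.append(url)
    return selected
```

With a timestamp of 0 every archive is selected, which matches the advice to start with lastupdate set to 0.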

chesslover0003
GM_Salzi wrote:

I wrote a Python script to download all my games from the API, and another one to compute special statistics that are not available on chess.com (for example, a rating history graph plotted per game rather than over time).

Looks very simple. I'll experiment with it. Thank you.

I now see what I need to modify in the script I mentioned above.

The script is downloading the games in JSON format. Do you do all your analysis with the data while it's in JSON? Or are you converting the JSON to CSV or importing it into a relational DB? I know it's also possible to download in PGN format.

chesslover0003

Next step may be to get the data into a CSV format that can later be used as part of a DB schema.

## Chesscom Games Exporter, v 0.0.2 (beta)
## This is a Python script for retrieving all games for a user and storing them as JSON.
## This script creates the output file games2.json.
## You will need to modify the script to specify the username and modify the header.

import requests
import json

headers = {
    'User-Agent': 'username: chesslover0003, email: b@gmail.com'
}

# Retrieve the list of monthly game archives, then download each archive as JSON.
all_data = []
for url in requests.get("https://api.chess.com/pub/player/chesslover0003/games/archives", headers=headers).json()["archives"]:
    all_data.append(requests.get(url, headers=headers).json())

# Flatten the monthly archives into a single list of games
games = []
for row in all_data:
    for game in row["games"]:
        games.append(game)

# Output results to file
with open("games2.json", "w") as file:
    json.dump({"games": games}, file, indent=4)
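
For the CSV step, one possible sketch is to flatten each game record into a single row. The column list and the helper names `flatten_game` and `games_to_csv` are choices made for this example, and the field names (`time_class`, `white`, `black` and so on) are assumed from the JSON the API returns, so check them against your own games2.json. Missing fields are written as empty cells rather than raising an error.

```python
import csv
import io

# Column names chosen for this sketch; they are not an official schema.
FIELDS = ["url", "end_time", "time_class", "time_control", "rated",
          "white_username", "white_rating", "white_result",
          "black_username", "black_rating", "black_result"]


def flatten_game(game: dict) -> dict:
    """Turn one nested game record into a flat row; missing keys become ''."""
    row = {key: game.get(key, "") for key in FIELDS[:5]}
    for color in ("white", "black"):
        player = game.get(color, {})
        for attr in ("username", "rating", "result"):
            row[f"{color}_{attr}"] = player.get(attr, "")
    return row


def games_to_csv(games: list) -> str:
    """Render a list of game records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(flatten_game(game) for game in games)
    return buffer.getvalue()
```

To use it, load the "games" list from games2.json with json.load and write the returned text to a .csv file.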

EverydayRonin

Not quite what you're looking for, but I have an open-source tool that does this and stores it to an SQLite database. Feel free to use it as a reference and save to a csv instead.

https://github.com/EndlessTrax/pgn-to-sqlite

chesslover0003
EndlessTrax wrote:

Not quite what you're looking for, but I have an open-source tool that does this and stores it to an SQLite database. Feel free to use it as a reference and save to a csv instead.

https://github.com/EndlessTrax/pgn-to-sqlite

Thank you. Actually... I am now retrieving all games from the chess.com API and saving them as JSON. I also save a separate file containing just the PGN data.

I'm working on storing the data in SQLite now. I'll check out what you have. Do you have a sample SQLite file so that I can see the schema?
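
As a starting point for the SQLite step, here is a rough single-table sketch using Python's built-in sqlite3 module. The schema is hypothetical and is not taken from pgn-to-sqlite; the column names and the `store_games` helper are made up for this example. It uses the game URL as the primary key so that re-running an import does not create duplicates.

```python
import sqlite3

# Hypothetical single-table schema for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS games (
    url TEXT PRIMARY KEY,
    end_time INTEGER,
    time_class TEXT,
    white_username TEXT,
    white_rating INTEGER,
    black_username TEXT,
    black_rating INTEGER,
    pgn TEXT
)
"""


def store_games(conn, games):
    """Insert game records, skipping any whose URL is already stored."""
    conn.execute(SCHEMA)
    rows = [
        (
            g.get("url"), g.get("end_time"), g.get("time_class"),
            g.get("white", {}).get("username"), g.get("white", {}).get("rating"),
            g.get("black", {}).get("username"), g.get("black", {}).get("rating"),
            g.get("pgn"),
        )
        for g in games
    ]
    conn.executemany(
        "INSERT OR IGNORE INTO games VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
```

Passing `sqlite3.connect("games.db")` as `conn` would write to a file; `sqlite3.connect(":memory:")` is handy for experimenting.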

chesslover0003

I think I just sent you an email. lol. I was looking at this repository as you sent this.

LateToMate

May not be helpful at this point, but just sharing this for awareness:

https://pypi.org/project/chess.com/

chesslover0003
LateToMate wrote:

May not be helpful at this point, but just sharing this for awareness:

https://pypi.org/project/chess.com/

Thank you. I like the rate limiter you included. Last thing I want to do is DoS the server.
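
For a hand-rolled script, a very simple way to avoid hammering the server is to enforce a minimum gap between requests. The sketch below is not the rate limiter from the package linked above; the class name `MinIntervalLimiter` and the one-second interval in the usage note are assumptions for illustration. The clock and sleep functions are injectable so the behaviour can be checked without actually waiting.

```python
import time


class MinIntervalLimiter:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Creating `limiter = MinIntervalLimiter(1.0)` once and calling `limiter.wait()` before each request keeps a download loop to roughly one request per second.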

LateToMate

To be clear, it's not mine - I have just found it helpful in the past.