Download PGNs filtering by rating and time control, not username

AManMoth

Hi,

I've managed to use your API to download the games of a specific user. Is there a way to download games of any user but within certain rating ranges or playing under specific time controls? Or is my best bet just to input random usernames then filter the results afterwards?

Thanks 

andreamorandini

Hi @AManMoth at this moment the only way to get PGNs is through the player username. Getting PGNs, even if filtered, from all players would be impossible to manage.

 

 

AManMoth

Ahh that's cool, thanks for your help!

habersame123

for a specific user is there a way to only download rapid games but not bullet or bughouse for example?  thx

habersame123

let me rephrase this, i'm a novice with programming. what i want to do is to be able to download all the games of a player that are standard chess rapid, or standard chess blitz. when i use the monthly archives i get all the games the player played in that month, including bughouse, bullet etc. is there a way to limit that monthly download to just blitz or rapid standard games by going to an http link? i am just starting to learn python, but a url to do this would be much easier. i believe lichess has this functionality. thx for your help

skelos

There is no way presently to select games by type. It's a reasonable enhancement request, but unless you are particularly short of bandwidth I suspect it would be a low priority. Very occasionally I hit a player whose games are slow to download due to very many bullet and/or blitz games, but it's rare.

skelos

If you're seeking to cut down how much data you download, you could count the type of game you want as you download it, and not download any additional months. I've thought of doing that for my download script but so far have not been motivated due to lack of need.
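A minimal sketch of that "count as you go, then stop" idea, in Python. The names `collect_until`, `fetch_games` and `wanted` are hypothetical and not part of any API; the caller is assumed to pass the monthly archive URLs newest first, together with a function that downloads one month and returns its list of games.

```python
from typing import Callable, Iterable


def collect_until(archive_urls: Iterable[str],
                  fetch_games: Callable[[str], list],
                  wanted: Callable[[dict], bool],
                  target: int) -> list:
    """Walk monthly archives in the given order and stop downloading
    once at least `target` wanted games have been collected."""
    kept = []
    for url in archive_urls:
        # fetch_games is supplied by the caller (hypothetical helper):
        # it downloads one month and returns a list of game dicts.
        for game in fetch_games(url):
            if wanted(game):
                kept.append(game)
        if len(kept) >= target:
            break  # enough games: the remaining months are never fetched
    return kept
```

The check happens once per month rather than once per game, so a whole month is always kept; that may return slightly more than `target` games.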

Currently I've ~3.2GB of downloaded games (that is, 3.2GB gzipped). Of that, ~10 players hit 10MB+ and the largest is 53MB.

habersame123

it's not the amount of data that i'm looking to cut, it's more that when i analyze games, bullet games are less significant than blitz or rapid games in terms of understanding performance or playing style. bullet is rather meaningless. so when i pull it into a database program i only want blitz and rapid. if i download everything for a month there is no way to separate the different time controls in chessbase. in fact it's hard to separate bughouse or crazyhouse from standard, but the bigger issue is separating bullet and discarding those games.

skelos

Ah ... I download the JSON rather than the direct PGN, and then pick out the games I want by variant and time control: chess_daily, chess960_daily, chess_blitz ... etc.
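A rough sketch of picking games by variant and time control from the downloaded JSON, in Python. It assumes each game object carries a "rules" field (the variant), a "time_class" field (daily, rapid, blitz, bullet) and a "pgn" field; "time_class" is an assumption here and should be checked against the API documentation. The function names are made up for illustration.

```python
def game_category(game: dict) -> str:
    """Build a label such as 'chess_blitz' or 'chess960_daily'."""
    # Assumption: a game without a "rules" field is standard chess.
    rules = game.get("rules", "chess")
    # Assumption: "time_class" is present; fall back to "unknown" if not.
    time_class = game.get("time_class", "unknown")
    return f"{rules}_{time_class}"


def select_pgns(games: list, wanted_categories: set) -> list:
    """Return the PGN text of every game whose category is wanted."""
    return [
        game["pgn"]
        for game in games
        if "pgn" in game and game_category(game) in wanted_categories
    ]
```

For example, `select_pgns(games, {"chess_blitz", "chess_rapid"})` would keep standard blitz and rapid games and drop bullet, bughouse and the rest.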

Of course, sometimes I want to select particular blitz time controls; my previous script could do that and I'll teach my current script how to one day.

skelos

It's not common but not rare either for me to want a different time control after my first analysis. If chess_daily looks clean, then maybe 960 or a faster time control is worth a look. Most of my downloads are for cheat hunting, and while I've 200+ (that I can name) accounts closed I have a lot more than that that I've looked at and decided were fine, or sometimes suspicious, or sometimes reportable but the cheat detection team didn't/doesn't consider the evidence conclusive. (At least once they've caught an error of mine, so I'm happy they're careful.)

habersame123

you pick out the games manually from the json? i don't think i can import json into chessbase though. on lichess it's possible to download just rapid games from the api. it's something chess.com should consider adding b/c the data is all there and i would think it's relatively easy for programmers

skelos

I don't have/use ChessBase but would expect to stage data I imported to it anyway, so yes, I pluck the "pgn" portion of the JSON for games I want.

Basically:

  1. Download to local storage; the files are gzipped JSON.
  2. Run a selection tool against the file(s) to choose what games I want (time control, match, tournament, minimum opponent rating, maximum opponent rating, tournament games, match games, "Let's Play" games).

I can also select particular opponent(s) or exclude them ... for what I do that's useful. For something headed for ChessBase, I'd simply filter on time controls (perhaps more finely than "blitz" if I cared about 3|2 being different from 10|0).
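To make the two steps concrete, here is a small Python sketch of a selection tool; it is not the Perl script described above. It assumes a saved monthly archive is a gzipped JSON object with a "games" list, and that each game has "time_control", "pgn", and "white"/"black" objects holding "username" and "rating". All function names are hypothetical.

```python
import gzip
import json


def load_games(path: str) -> list:
    """Read one locally stored, gzipped monthly archive."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh).get("games", [])


def opponent_rating(game: dict, player: str) -> int:
    """Rating of whoever sat opposite `player` in this game."""
    white, black = game["white"], game["black"]
    is_white = white["username"].lower() == player.lower()
    return (black if is_white else white)["rating"]


def select_games(games, player, time_controls=None, min_opp=None, max_opp=None):
    """Keep games that match every filter that was supplied."""
    kept = []
    for game in games:
        if time_controls is not None and game.get("time_control") not in time_controls:
            continue
        rating = opponent_rating(game, player)
        if min_opp is not None and rating < min_opp:
            continue
        if max_opp is not None and rating > max_opp:
            continue
        kept.append(game)
    return kept


def write_pgn(games, path: str) -> None:
    """Write the "pgn" portion of each game to one file, blank-line separated."""
    with open(path, "w", encoding="utf-8") as fh:
        for game in games:
            if "pgn" in game:
                fh.write(game["pgn"].strip() + "\n\n")
```

The resulting .pgn file is plain PGN text, which is the form a database program such as ChessBase would be expected to import.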

 

Horses for courses.

skelos

I want to process the PGNs I extract anyway; drop book moves (and have a FEN as the starting position) and cut down on endgame confusion and long game statistics distortion by chopping moves once a position reaches seven pieces on the board.
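As a partial illustration of the seven-piece cut-off only, here is a Python sketch that counts pieces from a FEN string and finds where to chop. It assumes a list of FENs, one per position, is already available; producing those from a PGN needs a chess library and is not shown. The function names are hypothetical and this is not the author's actual script.

```python
def piece_count(fen: str) -> int:
    """Count pieces in a FEN: every letter in the placement field is a piece."""
    placement = fen.split()[0]
    return sum(1 for ch in placement if ch.isalpha())


def cut_index(fens: list, max_pieces: int = 7) -> int:
    """Index of the first position with `max_pieces` or fewer pieces,
    or len(fens) if the game never gets that far."""
    for i, fen in enumerate(fens):
        if piece_count(fen) <= max_pieces:
            return i
    return len(fens)
```

Positions from `cut_index(fens)` onwards would be the ones to drop.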

skelos

Sorry for so many posts, but part of the difference is perspective: I've spent a lot of time in the IT industry, most of it working with code either in development or support, so what I consider "not too hard" (even if annoying!) in order to get the flexibility I want may not be so easy for someone new to programming. But do ask questions; people here are generally willing to help if they can and range from professional programmers to those who'd not written a line of code until they had something they wanted to do via api.chess.com, so don't be discouraged. Years of experience are not necessary. Many of my years of experience are with totally obsolete and utterly unused languages and systems, thus of very little use in 2019!

habersame123

thank you for your help. i will need to look at this this weekend. i'm not totally sure how to get the json file, but maybe i can figure that out. what do you use as a "selection tool", would that be a program or line of code? thx, sorry i am very new to this.

skelos

I use this endpoint to find what archives are available:

https://www.chess.com/news/view/published-data-api#pubapi-endpoint-games-archive-list

... and then this one to download them, as JSON:

https://www.chess.com/news/view/published-data-api#pubapi-endpoint-games-archive
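For someone new to this, a minimal Python sketch of calling those two endpoints might look like the following. It uses only the standard library so it runs without installing anything; the "requests" library would work just as well. The URL layout (https://api.chess.com/pub/player/{username}/games/archives) and the "archives" and "games" keys are taken from the published-data API documentation as I understand it, so treat them as assumptions to check, and the User-Agent value is a made-up placeholder.

```python
import json
import urllib.request

API = "https://api.chess.com/pub/player"


def archives_url(username: str) -> str:
    """URL listing the monthly archives available for one player."""
    return f"{API}/{username.lower()}/games/archives"


def fetch_json(url: str) -> dict:
    """Download a URL and decode the JSON body."""
    # Assumption: sending a descriptive User-Agent is good manners here.
    req = urllib.request.Request(url, headers={"User-Agent": "pgn-filter-example"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def list_archives(username: str, fetch=fetch_json) -> list:
    """Return the monthly archive URLs for a player, oldest first as listed."""
    return fetch(archives_url(username)).get("archives", [])


def download_month(archive_url: str, fetch=fetch_json) -> list:
    """Return the list of game objects in one monthly archive."""
    return fetch(archive_url).get("games", [])
```

Typical use would be `for url in list_archives("someusername"): games = download_month(url)`, with the `fetch` argument left at its default.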

 

While I use Perl, Python is possibly better; I like its "requests" (non-core) library very much.

 

Whatever language you use, providing it can parse JSON, there are lots of useful details:

...

"time_control": "string", // PGN-compliant time control
"rules": "string", // game variant information (e.g., "chess960")
...
"tournament": "string", // URL pointing to tournament (if available)
"match": "string", // URL pointing to team match (if available)

...
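The "time_control" string is the one that needs a little decoding. A hedged Python sketch follows, assuming the values look like "600" (10 minutes, no increment), "180+2" (3 minutes plus 2 seconds per move) or "1/86400" (daily chess, one move per 86400 seconds); other PGN forms are not handled, and the function names are hypothetical.

```python
def parse_time_control(tc: str) -> tuple:
    """Split a time control into (kind, base_seconds, increment_seconds)."""
    if "/" in tc:
        # Assumed daily form "moves/seconds", e.g. "1/86400".
        _moves, seconds = tc.split("/", 1)
        return ("daily", int(seconds), 0)
    base, _, increment = tc.partition("+")
    return ("live", int(base), int(increment or 0))


def describe(tc: str) -> str:
    """Turn '180+2' into the familiar '3|2' style label."""
    kind, base, increment = parse_time_control(tc)
    if kind == "daily":
        return f"{base / 86400:g} day(s) per move"
    return f"{base / 60:g}|{increment}"
```

With this, `describe("180+2")` gives "3|2" and `describe("600")` gives "10|0", which is enough to tell those two apart when filtering.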

 

I hope this helps ... and do feel free to ask more questions. Trying to anticipate every question would lead to lots of noise in the thread.

 

Cheers,

Giles

mab2003

sorry how do you parse a json?

skelos
mab2003 wrote:

sorry how do you parse a json?

Pretty much any programming language has a module or library for parsing JSON.

A web search will be your friend, but for Python see https://docs.python.org/3/library/json.html, and for Perl see JSON::PP or JSON::XS.
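As a tiny illustration in Python, using the standard `json` module: the text below is a made-up sample shaped like a monthly archive, not real data.

```python
import json

# Made-up sample text; a real archive would come from the API or a saved file.
text = '{"games": [{"rules": "chess", "time_control": "600", "pgn": "1. e4 e5 *"}]}'

data = json.loads(text)  # JSON text -> ordinary Python dicts and lists

for game in data["games"]:
    # Each game is now a plain dict, so fields are looked up by name.
    print(game["rules"], game["time_control"])
```

After `json.loads`, "parsing" is done: the rest is normal dictionary and list handling.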