Is there any way to query the names of past members?

ian-rastall

I'm trying to extend backward a series of databases, put together by someone else, that starts at 2021. There's one PGN database for each month, approx. 3-4GB each, with a rating floor of 1950 Elo. I was able to query the names of all titled players and, along with brute-force copying and pasting from the leaderboard, gathered about 21,000 names that reach down to 2300 Elo. (Since my own project using that series of big PGNs already filters starting at 2300, that works just fine.) I built query URLs for each name and downloaded them one at a time in a download manager. Only half of these URLs returned PGNs for 2020-12 and 2020-11, and I expect the problem will only get worse as I go further back in time.

Is there any way to find out who used to be a member here? I realize that many of the names would belong to banned players, but also just people who've decided to go elsewhere. It would be pretty hard to do it manually, and I can't think of how to do that anyway.

sjbfan

Yes, you can first use the profile endpoint https://api.chess.com/pub/player/sjbfan and get the "joined" timestamp. Then at least you'll know which accounts existed when. But that doesn't solve the problem that some people may have created an account but have no archive for a specific month because they didn't play.
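As a rough sketch of the "joined" check, assuming the profile JSON has already been fetched and parsed into a dict, and that "joined" is a Unix timestamp in seconds (the helper name `existed_by` is made up for illustration):

```python
from datetime import datetime, timezone

def existed_by(profile, year, month):
    """Return True if the account was created in or before the given month.

    `profile` is assumed to be the parsed JSON from the profile endpoint,
    with "joined" given as a Unix timestamp in seconds (UTC).
    """
    joined = datetime.fromtimestamp(profile["joined"], tz=timezone.utc)
    # Compare (year, month) pairs so any day within the month counts.
    return (joined.year, joined.month) <= (year, month)
```

This only filters out accounts that were created later; it says nothing about whether the player actually played that month.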

The best way to solve this would be to get the list of available archives for each username first; then you know for sure which months exist before you download them. https://api.chess.com/pub/player/{username}/games/archives
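Something like the following could do that check. It is only a sketch: it assumes the archives response is a JSON object with an "archives" list of monthly URLs ending in /games/YYYY/MM, and that appending /pgn to a monthly URL gives the PGN download. The helper names and the User-Agent string are placeholders, not part of the API.

```python
import json
import urllib.request

ARCHIVES_URL = "https://api.chess.com/pub/player/{username}/games/archives"

def fetch_archives(username):
    """Download the list of monthly archive URLs for one username."""
    url = ARCHIVES_URL.format(username=username.lower())
    # Placeholder User-Agent; replace it with something that identifies you.
    req = urllib.request.Request(url, headers={"User-Agent": "archive-check-example"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("archives", [])

def month_archives(archives, year, month):
    """Keep only the archive URLs for the given year and month."""
    suffix = f"/games/{year:04d}/{month:02d}"
    return [u for u in archives if u.endswith(suffix)]

def pgn_url(archive_url):
    """Assumed PGN download URL for a monthly archive."""
    return archive_url + "/pgn"
```

For example, `[pgn_url(u) for u in month_archives(fetch_archives("sjbfan"), 2020, 12)]` would give an empty list for anyone with no games that month, so no blank PGN gets queued in the download manager.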

ian-rastall

I need to explain myself more fully. I'm looking for past accounts. It's not going to get me very far to look at the leaderboard for 2024-07 and apply those names to 2020-12. Many of them hadn't gotten here yet, and many names from back then have since dropped off the leaderboard.

All I'm looking for is a list of names, so I can build URLs for each month. I do realize that some of those blank PGNs are there because that player wasn't playing that particular month. But there must also be a ton of players who have fallen through the cracks. The only way I can think of doing it right now is to build what I can of the 2020-12 PGN database (which is still pretty substantial), then list the player names from every game and take out all the duplicates. In theory, I'll start getting names of high-rated players from back then, regardless of whether or not they appear on the leaderboard anymore.

ImperfectAge

Does the unique player_id field help? You would probably need to have all the old ones already recorded.

https://www.chess.com/news/view/published-data-api#pubapi-endpoint-player

ian-rastall

In case anyone is curious, this is how you build a Chess.com Elite Database to rival the Lichess Elite Database. You strip the names out of the giant PGNs you build. The more games you collect, the more names you can strip out. My way right now is to use Ordoprep to simplify the PGNs, and then simple regular expressions to reduce them to just a list of names. From there, you can build a bigger db and get more names. Eventually you will have a list of usernames that all belong to highly-rated players.
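The "more games, more names" loop can be sketched roughly like this. It is not the Ordoprep workflow itself, just the general idea: `opponents_of` is a hypothetical callable that returns the high-rated names found in one player's downloaded games, however you choose to produce them.

```python
def expand_names(seeds, opponents_of, max_rounds=5):
    """Grow a set of usernames by repeatedly harvesting opponents.

    `opponents_of(name)` is a hypothetical callable returning an iterable
    of names seen in that player's games. Stops early once a round turns
    up nobody new, or after `max_rounds` rounds.
    """
    known = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        found = set()
        for name in sorted(frontier):
            found |= set(opponents_of(name))
        # Only names not seen before need their games fetched next round.
        frontier = found - known
        if not frontier:
            break
        known |= frontier
    return known
```

Each round only looks up names that are new, so already-downloaded players are not fetched again.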