Converting a name to an immutable ID

WhiteDrake

I’m sure a lot of you have already solved this challenge. What’s the best way to map a name (of a member or of a club) to an immutable identifier of any kind? Club names may change, and members can change their names too. If I cache some data (e.g. how many team matches a member has tried to sign up for) and then a name gets changed, how can I easily find out that I already have some data cached under the old name?

In theory, a good solution would be to convert the current name of a member (or club) to an integer, which doesn’t need to hold any meaning, just be unique and immutable (hence the title of the thread). Some sort of primary key candidate. Members register on Chess.com sequentially (and clubs are created in sequential order too), so the position in the registration order would be a candidate, if it’s accessible somehow.

So, what’s currently the best way to do this? 

stephen_33

But both members & clubs have immutable id's, so that isn't really a problem - simply use the relevant id from the endpoint.

For example:-

https://api.chess.com/pub/player/whitedrake

{"avatar":"https://images.chesscomfiles.com/uploads/v1/user/37499010.39de8711.200x200o.552c20f23cf0.jpeg","player_id":37499010,"@id":"https://api.chess.com/pub/player/whitedrake","url":"https://www.chess.com/member/WhiteDrake","name":"Jan","username":"whitedrake","followers":35,"country":"https://api.chess.com/pub/country/CZ","location":"Prague","last_online":1553504848,"joined":1501682199,"status":"premium","is_streamer":false}

Your player id never changes & will follow you until your account is closed.
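A minimal Python sketch of reading that id, assuming the profile JSON has the shape shown above. The function names `extract_player_id` and `fetch_player_id` and the User-Agent string are made up for illustration; only the URL pattern and the `player_id` field come from the response above.

```python
import json
import urllib.request

API_ROOT = "https://api.chess.com/pub"


def extract_player_id(profile: dict) -> int:
    """Return the immutable player_id from a decoded player profile."""
    return int(profile["player_id"])


def fetch_player_id(username: str) -> int:
    """Fetch a profile by its current username and return its player_id."""
    url = f"{API_ROOT}/player/{username.lower()}"
    # Assumption: sending a User-Agent header is harmless and may be expected.
    request = urllib.request.Request(url, headers={"User-Agent": "id-lookup-sketch"})
    with urllib.request.urlopen(request) as response:
        return extract_player_id(json.load(response))
```

Storing the returned number next to whatever you cache means a later name change doesn't orphan the cached data.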

WhiteDrake

Oh, I missed that! So that was easy, thanks.

 

stephen_33

For clubs, much the same:-

https://api.chess.com/pub/club/black-stone-hq

{"@id":"https://api.chess.com/pub/club/black-stone-hq","name":"Black Stone HQ","club_id":57928,"country":"https://api.chess.com/pub/country/XX","created":1500927620,"last_activity":1534200280,"admin":["https://api.chess.com/pub/player/bsaeagle60","https://api.chess.com/pub/player/szaszzo66","https://api.chess.com/pub/player/empr14","https://api.chess.com/pub/player/whitedrake","https://api.chess.com/pub/player/pawnlings","https://api.chess.com/pub/player/mrmoney1","https://api.chess.com/pub/player/baseballnut"],"visibility":"private","join_request":"https://www.chess.com/club/join/57928","icon":"https://images.chesscomfiles.com/uploads/v1/group/57928.0d1d9f76.50x50o.352d6a9ac3c5.jpeg","description":"<p>For admin of Black Stone</p>","url":"https://www.chess.com/club/black-stone-hq"}
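As a rough sketch, the fixed `club_id` and the current URL name can be pulled out of that response together. The helper name `club_identity` is hypothetical, and taking the URL name from the last path segment of `@id` is an assumption based on the example above.

```python
from typing import Tuple


def club_identity(club: dict) -> Tuple[int, str]:
    """Return (club_id, current URL name) from a decoded club profile."""
    # Assumption: the last path segment of "@id" is the club's current URL name.
    url_name = club["@id"].rstrip("/").rsplit("/", 1)[-1]
    return int(club["club_id"]), url_name
```

Keeping the pair lets a script key its records on `club_id` while still knowing which name to request.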
WhiteDrake

Yeah, RTFM, right? Well, I overlooked this second id. Thank you.

 

stephen_33

The real problem is the converse situation - trying to convert an immutable identifier of a member or club into its corresponding username or club name. There's no way at present to access endpoints via id's & that would be an immensely useful improvement.

My scripts crash to a halt from time to time because some club in one of the leagues I help to run has changed its name without anyone letting me know. It doesn't make much sense to use, as the only reference for a club, a name that can be changed ten times in a day!
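Until something like that exists, a script can at least fail softly. The sketch below assumes that the old URL of a renamed club answers with HTTP 404, which nothing in this thread confirms; `fetch_club_or_none` and its `opener` parameter are made-up names, the latter only there so the function can be exercised without network access.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_club_or_none(url_name: str,
                       opener: Callable = urllib.request.urlopen) -> Optional[dict]:
    """Return the decoded club profile, or None if the name no longer resolves."""
    url = f"https://api.chess.com/pub/club/{url_name}"
    try:
        with opener(url) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Assumption: a renamed club's old URL responds with 404.
            return None
        raise  # Any other HTTP error is still surfaced to the caller.
```

A `None` result can then be logged as "club probably renamed" instead of stopping the whole run.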

So again, if any of the developers are reading this, can we please have a way of accessing match endpoints (in particular) with a fixed id or reference of some kind?

stephen_33
WhiteDrake wrote:

Yeah, RTFM, right? Well, I overlooked this second id. Thank you.

Ha, had to look that up but you're most welcome

WhiteDrake

Yes, that would definitely be useful.

 

stephen_33

I think the ability to simply convert a club_id to its corresponding club name &/or its url_id would be enough for most needs.

skelos
stephen_33 wrote:

The real problem is the converse situation - trying to convert an immutable identifier of a member or club into its corresponding username or club name. There's no way at present to access endpoints via id's & that would be an immensely useful improvement.

...

 

Hear hear. Lookup by ID is frankly essential long term.

I want access to a game PGN by ID too, without having to scrape the website.

Those three lookups:

  1. Club by club_id
  2. Player by player_id
  3. PGN from game ID

are very rapidly rising to the top of my wish list.

Caching much of anything isn't very useful without #1 and #2. Yes, I can determine #1 and #2 for any data I have and when I get a SQL database set up even manage to store PGN data and match data without embedded names (which do change) but the need to do so is a PITA.

I've about reached (honestly, am past) what I can do with api.chess.com without very careful caching and normalisation of the non-normalised data supplied by the endpoints.

I am not suggesting the endpoints supply normalised data (which would require more lookups, and network latency of request/result is a real problem) but wherever possible providing "permanent" IDs as well as "current" IDs would be very helpful, and longer term access-by-ID is needed.

bcurtis

This is more difficult than it sounds, because we are reaching the limits of integer IDs and so we are in the process of changing these to something we won't need to change ever. This process is going slower than expected, and when we designed these endpoints we thought we'd be able to inject these later.

I'll see what we can do about getting those lookups. We may implement these as redirects from the ID to the canonical URL.

@skelos, can you tell me the process you use that gives you a Game ID but does not also give you the other data you need?

skelos
bcurtis wrote:

...

@skelos, can you tell me the process you use that gives you a Game ID but does not also give you the other data you need?

I have a stored player_id and the player has changed name, so I'd like to be able to find them by player_id.

If I can't find them by player_id but I have a saved game with an ID in it (maybe from the Link: header inserted into the PGN, or perhaps I have the JSON), I would like to be able to look up the game by that ID and determine the new name. (The same would apply, theoretically, if I have a game saved and want the current names of the players, although that's not something I'm doing now.) I do look up old games on the website to find changed player names.

Clubs change names, so to nicely report their current name I would like club-by-id as well. [Edit: And to avoid invalidating cached club data ... I don't have much of that yet but I will have more. If I store matches normalised at least to club_id and player_id then they're safe once they're finished and I don't then have to worry about name changes for players or clubs.]
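A small sketch of that kind of normalised storage: data lives under the immutable id, and names are only an index that gets rewritten on a rename. The class `IdKeyedCache` and its methods are hypothetical, not anything the API provides.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class IdKeyedCache:
    """Cache keyed by immutable id, with a rewritable name index on top."""
    data_by_id: Dict[int, dict] = field(default_factory=dict)
    id_by_name: Dict[str, int] = field(default_factory=dict)

    def record(self, entity_id: int, current_name: str, data: dict) -> None:
        """Merge data under the id and point only the current name at it."""
        # Drop older names that still point at this id.
        stale = [n for n, i in self.id_by_name.items() if i == entity_id]
        for name in stale:
            del self.id_by_name[name]
        self.id_by_name[current_name.lower()] = entity_id
        self.data_by_id.setdefault(entity_id, {}).update(data)

    def lookup(self, name: str) -> Optional[dict]:
        """Return cached data for a current name, or None if unknown."""
        entity_id = self.id_by_name.get(name.lower())
        return None if entity_id is None else self.data_by_id.get(entity_id)
```

The same structure works for players (`player_id`) and clubs (`club_id`); finished matches stored this way are unaffected by later renames.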

 

Somewhere there's the ideal line between normalising data to help caching and keeping things simple. Maybe.

bcurtis

So the game ID lookup is needed also because the players can change usernames?

skelos

My initial request was for three ID based lookups: player, game, and club. If you give me game I can for most purposes work back to player_id.

Lookup by club_id I've the least justification for.

Thus while I do still want all three, the priority for me is:

1. game by ID

2. account by player_id

3. club by club_id.

 

Players do change usernames. I got one closed recently who'd changed between reports.

Worse (for caching) an "old" changed username can be reused by someone else, and I was shown an example of that by someone who'd changed his name and seen the old name in use. @Wind is the new account name, but I forget his old name.
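One way to guard a cache against that reuse, sketched under the assumption that the profile endpoint returns `player_id` as in the example earlier in the thread: compare the id stored with the cached data against the id in a freshly fetched profile for the same username. The function name `same_account` is made up.

```python
def same_account(cached_player_id: int, fresh_profile: dict) -> bool:
    """True if a freshly fetched profile still belongs to the cached account."""
    # A different player_id under the same username means the name was reused.
    return int(fresh_profile["player_id"]) == cached_player_id
```

If this returns False, the cached data belongs to someone who has since moved to another name and should not be attributed to the current holder.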

skelos

Addendum to post #14:

#1 and #2 could be swapped in priority order if I make my caching more sophisticated, and I plan to.

But right now I store "as-is" game archives under a player's name, so a lookup by game ID (which I do manually via the website) is what I do when someone's name changes. So #1 seems the most generally useful, but it is likely new work for you. #2, while also new, would be another way to get a profile and would not introduce a "single game" endpoint, which we don't currently have.

Of course, there are people with numeric account names which doubtless overlap (or could) with player_id, so simply using the profile endpoint and selecting account or ID won't work; it'll have to be a new endpoint or only be "mostly right".

bcurtis

Yes, that's my point: player and club solutions are easy. A game ID solution is not, and it sounds like the only time you have a game ID and want to look up that game's data is because you actually cannot reach the archive due to the player changing names. To me, this sounds like the player ID lookup is adequate and the game ID look up is extra.

Right?

And yes, the problem of all-numeric usernames is maddening.

WhiteDrake
bcurtis wrote:

This is more difficult than it sounds, because we are reaching the limits of integer IDs and so we are in the process of changing these to something we won't need to change ever.

Can’t you just move to 64 bit integer IDs?

bcurtis wrote:

And yes, the problem of all-numeric usernames is maddening.

Could this be solved by a new endpoint? Like /pub/player/byid/{player_id}.
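To illustrate why a separate path would help: the `/pub/player/byid/` route below is only my suggestion and does not exist in the API, and both function names are made up. With distinct paths, an all-numeric username and a player_id with the same digits no longer produce the same URL.

```python
API_ROOT = "https://api.chess.com/pub"


def player_url_by_name(username: str) -> str:
    """Existing style of lookup: by current username."""
    return f"{API_ROOT}/player/{username.lower()}"


def player_url_by_id(player_id: int) -> str:
    """Hypothetical lookup by immutable id (proposed path, not a real endpoint)."""
    return f"{API_ROOT}/player/byid/{player_id}"
```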

skelos
bcurtis wrote:

Yes, that's my point: player and club solutions are easy. A game ID solution is not, and it sounds like the only time you have a game ID and want to look up that game's data is because you actually cannot reach the archive due to the player changing names. To me, this sounds like the player ID lookup is adequate and the game ID look up is extra.

Right?

...

I think so. I've not done a lot with completed tournaments or matches, but they both should link to all games played one way or another, so providing I do record player_id I should be fine to handle name changes and especially name reuse.

Ditto for club_id.

Thanks. Fingers crossed now!

bcurtis
WhiteDrake wrote:
bcurtis wrote:

This is more difficult than it sounds, because we are reaching the limits of integer IDs ...

Can’t you just move to 64 bit integer IDs?

The problems we are hitting are more about sequential IDs than the fact that they are integers. I described it as a limit of integer IDs because that is the visible change we are making, moving away from integers to something non-sequential.

 

WhiteDrake wrote:
bcurtis wrote:
 

And yes, the problem of all-numeric usernames is maddening.

Could this be solved by a new endpoint? Like /pub/player/byid/{player_id}.

In this case, yes! But there are many other issues that crop up from time to time, and there are 120,000 of these so they aren't going away.

stephen_33

Does that also apply to club id's? Some means of using an unchanging parameter, such as the club_id, to access club endpoint data would be extremely useful, as skelos & I have emphasised more than once.

The frequency with which some clubs amend their club names gives some of us a real headache at times. It used to be the case that members could change their usernames only once but a hyper-active admin can change their club name a hundred times in a day.