Anyone Else Unaware Of This?

stephen_33

I've just posted this in OD but it seems even more appropriate to this club? ....

https://www.strongdm.com/what-is/chess-com-data-breach#:~:text=On%20November%208%2C%202023%2C%20hackers,auditing%20systems%20for%20malicious%20access

"Has chess com ever been hacked?
On November 8, 2023, hackers under the username DrOne leaked a database containing the personal information of over 800,000 Chess.com users. While it affected only a small fraction of the company's 150 million members, the Chess.com data breach still demonstrates the importance of auditing systems for malicious access. (4 Feb 2025)

How Did It Happen?
The Chess.com breach happened due to unauthorized data scraping from its public API. According to Chess.com's statement to Hackread, malicious actors exploited the “Find Friends” feature in the platform’s API, which allowed access to publicly available user data. They then collected members' data from the Chess.com profiles and leaked them to the dark web."

I came across that just now by accident. I don't remember the site making us aware of it, but maybe it issued an announcement somewhere that was soon forgotten?

I've a feeling that was around the time we were first required to provide a client-id when making endpoint requests. As far as I can remember, the site never really explained why that change was needed so urgently - does anyone recall an explanation?

I also don't remember the “Find Friends” feature - what was that?

Magpie_0-0

.... 💀

Kusanali

This is old news. He scraped a public API, and all the leaked data is low-risk: email, UUID, name, country ID, member URL, avatar URL, and username.
The only slightly more sensitive item was the email addresses, which he found using the 'Find Friends by Email' feature. Your password is safe, so there’s nothing to worry about.

Martin_Stahl
Rosaria_MoR wrote:

This is old news. He scraped a public API, and all the leaked data is low-risk: email, UUID, name, country ID, member URL, avatar URL, and username.
The only slightly more sensitive item was the email addresses, which he found using the 'Find Friends by Email' feature. Your password is safe, so there’s nothing to worry about.

The emails were not scraped from the website. The person responsible already had the email addresses from another source, used the find friends feature to link them to existing accounts here, and then grabbed public-facing information to make it look like the site had been exploited.

stephen_33

I don't remember the find friends feature so how did that work - did you have to enter a valid email address and the related chess.com username was returned?

Is it known if any site members have been approached, even scammed, by the use of that data?

stephen_33

But I'm guessing the reason why the following change was brought in with such haste was due to that data-mining expedition and the need to make access a lot more restricted ...

(Jun 20, 2023): Breaking Change: User-Agent Contact Info Required

It was never explained at the time and the lack of information has always been puzzling.

Martin, it seemed to some of us that even you hadn't been told why it had to be implemented in such a rush, so much so that you didn't have the chance to issue even an announcement in advance of the change - is that a fair description of what happened?

Martin_Stahl
stephen_33 wrote:

I don't remember the find friends feature so how did that work - did you have to enter a valid email address and the related chess.com username was returned?

Is it known if any site members have been approached, even scammed, by the use of that data?

The friend page had a way to find friends using emails, but that's been removed. I can't say whether anyone was contacted using that information in an attempt to social-engineer additional data.

I don't believe the user agent change had anything to do with that.

Crick3t
stephen_33 wrote:

But I'm guessing the reason why the following change was brought in with such haste was due to that data-mining expedition and the need to make access a lot more restricted ...

(Jun 20, 2023): Breaking Change: User-Agent Contact Info Required

It was never explained at the time and the lack of information has always been puzzling.

Martin, it seemed to some of us that even you hadn't been told why it had to be implemented in such a rush, so much so that you didn't have the chance to issue even an announcement in advance of the change - is that a fair description of what happened?

My guess was that they tried to cut down the traffic.
Around that time we had server overload errors daily.
Probably there were many requests from random sources scraping the API non-stop, so they just implemented something to filter and block them.

stephen_33

"My guess was that they tried to cut down the traffic" - might need explaining? In what way would you expect the strict enforcement of "User-Agent" IDs to help reduce traffic?

I've long believed the problem with overloaded servers was/is due to underinvestment in server capacity and had very little to do with API activity, although I'd be happy to be proved wrong.

I can't remember any member of staff claiming that was a particular cause of overload either.

Martin_Stahl

My understanding of the user-agent with contact details was to have a mechanism to contact developers that might be creating high loads and then stop anyone without that.
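
For anyone unsure what "user-agent with contact details" looks like from the client side, here is a minimal sketch, not the site's documented format. The tool name, the contact string and the example endpoint URL are all placeholders I've made up for illustration; the only point is that the request carries a User-Agent header that tells the server who to get in touch with.

```python
import urllib.request

# Hypothetical values: replace with your own tool name and contact details.
APP_NAME = "my-chess-tool/0.1"
CONTACT = "username: example_user; contact: dev@example.com"


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request whose User-Agent names a contact."""
    user_agent = f"{APP_NAME} ({CONTACT})"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Assumed example endpoint; nothing is sent over the network here.
req = build_request("https://api.chess.com/pub/player/example_user")
print(req.get_header("User-agent"))  # urllib stores the key as "User-agent"
```

Passing `req` to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be run without touching the site.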

stephen_33

Since all endpoints are cached, no processing is required by the servers, so the only load placed on the site's system is in sending the data. I wouldn't have thought that would place a great strain on the servers.

Martin_Stahl

There's still load on whatever servers serve the endpoints. I don't know if the database and underlying server are dedicated to servicing the API or are shared with other services.

Crick3t
stephen_33 wrote:

"My guess was that they tried to cut down the traffic" - might need explaining? In what way would you expect the strict enforcement of "User-Agent" IDs to help reduce traffic?

I have my own webservers. Sometimes there are thousands of requests from spam bots, web engines, AI scrapers, etc. in a matter of seconds. You can have the best caching on earth, but it will still have an impact, and I think the API is not the top priority for a live chess website.

If you have a user agent set, you can filter out the noise. Or if there is a misbehaving tool running too many queries you can block it and contact the owner.

(but obviously I am just guessing here putting together the pieces as they never confirmed anything)
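
To make the filtering idea above concrete, here is a rough sketch of how a server could use the User-Agent to drop anonymous traffic and spot a misbehaving tool. It is pure guesswork about the general technique, not how Chess.com does it: the class name, the email regex and the request limit are all made up for illustration.

```python
import re
from collections import Counter

# Loose, illustrative pattern for "something that looks like an email".
CONTACT_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
MAX_REQUESTS_PER_WINDOW = 100  # hypothetical limit per User-Agent


class UserAgentFilter:
    """Reject requests without contact info and throttle noisy agents."""

    def __init__(self, limit=MAX_REQUESTS_PER_WINDOW):
        self.limit = limit
        self.counts = Counter()  # requests seen per User-Agent string

    def check(self, user_agent):
        """Return 'reject', 'throttle' or 'allow' for one incoming request."""
        if not user_agent or not CONTACT_PATTERN.search(user_agent):
            return "reject"  # no way to contact the owner: filter it out
        self.counts[user_agent] += 1
        if self.counts[user_agent] > self.limit:
            return "throttle"  # too many queries: block, then contact owner
        return "allow"

    def contact_for(self, user_agent):
        """Extract the contact address so the owner can be reached."""
        match = CONTACT_PATTERN.search(user_agent)
        return match.group(0) if match else None
```

A real setup would reset the counts per time window and would probably live in the web server or CDN config rather than application code, but the shape is the same: no contact info means the request is noise, and a known agent that runs too many queries can be blocked and its owner emailed.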