Insane amount of api requests [Problem]

Omed

So, as a little side project, I want to create a website/script that finds all the players you have played against who have been banned for:

fair_play_violations

but the problem is the insane number of API requests.
First I would need to send a request to get the list of monthly archives, then send a request for each of those monthly archives. After that I still need to send a request for each of the user's opponents, like

https://api.chess.com/pub/player/yama
to figure out if the opponent is banned.
That can add up to thousands or even hundreds of thousands of API requests. So what is the rate limit on the API? I'm thinking of sending 5-10 requests per second; is that a reasonable rate or too many?

Martin_Stahl

https://www.chess.com/announcements/view/published-data-api#pubapi-general-rate-limits

Omed

It says you will get a 429 response if you make parallel requests. Does that mean I can send 1 request every 0.2 seconds instead of 5 requests in parallel every second?

WangSandy

I guess there's another API error: a player called @ShivaChess456 is banned, but the site says they are still online. The API also says FPV. Can somebody fix it?

LateToMate

If you wait until you receive the response to your first inquiry prior to sending the second inquiry (however long that takes), then you won't experience issues per the documentation linked above.

Having previously done what you're trying to do, you'll probably be able to check about 2.5 profiles per second on average. The actual rate may fluctuate depending on your bandwidth and server load. This can take quite a while to run.

If you often play the same people, you can save some time by removing duplicate usernames, identifying the cheaters, and then cross-referencing your games with that list.
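A minimal sketch of that approach in Python: collect each opponent once, then check the profiles strictly one after another. It assumes each game in a monthly archive carries `white` and `black` objects with a `username` field, and that a fair-play ban appears in the profile's `status` field as `closed:fair_play_violations`; both are assumptions about the response shape, and the function names and the User-Agent string are made up for illustration.

```python
import json
import urllib.request

API = "https://api.chess.com/pub/player/"


def fetch_json(url):
    """Blocking GET: returns only after the full response has arrived."""
    # Placeholder User-Agent (assumption); replace it with your own details.
    req = urllib.request.Request(url, headers={"User-Agent": "fpv-checker-example"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def unique_opponents(games, me):
    """Collect each opponent's username once, lowercased."""
    me = me.lower()
    names = set()
    for game in games:
        for colour in ("white", "black"):
            name = game[colour]["username"].lower()
            if name != me:
                names.add(name)
    return names


def banned_opponents(names, get_profile=lambda n: fetch_json(API + n)):
    """Check profiles one at a time; keep those closed for fair play."""
    banned = set()
    for name in sorted(names):
        # Assumed status value for a fair-play closure.
        status = get_profile(name).get("status", "")
        if status == "closed:fair_play_violations":
            banned.add(name)
    return banned
```

Because `fetch_json` blocks until the response arrives, the loop in `banned_opponents` never has two requests in flight. The resulting set can then be cross-referenced against the full game list without any further requests.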

stephen_33

I think it's generally accepted that if API requests are sent serially/in succession, then there won't be an issue of rate limiting. Once you start parallel requests there may well be a problem.

And @LateToMate makes a good point: "you can save some time by removing duplicate usernames"

To avoid wasting time with duplicate requests I cache things like player ratings, or in your case the status of players ("basic"/"premium"/"closed..."). That alone may help to speed up the process considerably.
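A rough sketch of such a cache in Python, kept in memory for a single run. The name `make_cached_status` and the `fetch_status` callable (anything that takes a username and returns its status string) are hypothetical, not part of the API.

```python
def make_cached_status(fetch_status):
    """Wrap a status lookup so each username is fetched at most once."""
    cache = {}

    def status_of(username):
        key = username.lower()  # treat "Alice" and "alice" as one player
        if key not in cache:
            cache[key] = fetch_status(key)  # only hit the API on a miss
        return cache[key]

    return status_of
```

Every repeated opponent then costs a dictionary lookup instead of a request.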

Crick3t

I never had an issue with "too many requests" as I do it serially. (OK, there was a time when chess.com had server issues every day, but recently it has been fine.)
On the other hand, it will be slow, and I think you will need to do some sort of caching like @stephen_33 suggests.

As an example, for me checking around 200 users (clubs, status, etc.) takes around 3 minutes. So your "insane" amount of queries will take an "insane" amount of time.

If it is a one-off, then you will have to wait, but if you want to run it for multiple users multiple times, then I would build a database where I store banned members. If an account is closed, it is not worth checking again.
Also cache non-banned members and only recheck them after some time. If someone is not banned, let's hope they are not going to be banned in the next 2-3 days.
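One way this could look, as a sketch using Python's built-in `sqlite3`. The table layout, the function names and the 3-day recheck interval are assumptions chosen for illustration, and `fetch_status` stands for whatever function actually queries the API and returns the status string.

```python
import sqlite3
import time

RECHECK_AFTER = 3 * 24 * 3600  # seconds; assumed interval (about 3 days)


def open_store(path=":memory:"):
    """Open (or create) the status database; pass a file path to persist it."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS players "
        "(username TEXT PRIMARY KEY, status TEXT, checked_at REAL)"
    )
    return db


def status_for(db, username, fetch_status, now=None):
    """Return a stored status when it is still usable, else fetch and store."""
    now = time.time() if now is None else now
    key = username.lower()
    row = db.execute(
        "SELECT status, checked_at FROM players WHERE username = ?", (key,)
    ).fetchone()
    if row is not None:
        status, checked_at = row
        # Closed accounts are never rechecked; others only once they are stale.
        if status.startswith("closed") or now - checked_at < RECHECK_AFTER:
            return status
    status = fetch_status(key)
    db.execute("INSERT OR REPLACE INTO players VALUES (?, ?, ?)", (key, status, now))
    db.commit()
    return status
```

With a file path instead of `":memory:"`, the stored results carry over between runs and between different users' lookups.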