The Unique Reason The Broken (Now Fixed) Chess.com App Made Headlines
NOTE: If Chess.com on your iPhone or iPad is still not working, please update to the latest version and make sure it is 3.6.14. Thank you and we're sorry!
Well, there's nothing like waking up to a dozen of your friends texting you saying "Wow! Congrats! You made Hacker News... sorry it's BAD news."
For those who don't know, Hacker News is one of the major sources of tech news on the interwebs and is closely followed by major news organizations to see what will bubble up to the top. (It's a bit like a tech-focused Reddit.) The post that hit Hacker News' homepage was this one: https://news.ycombinator.com/item?id=14539770, which included the headline:
Chess.com stopped working on 32bit iPads
because 2^31 games have been played
If you do the math, that means more than TWO BILLION live games have been played (2^31 is 2,147,483,648, to be exact). That is one more than a 32-bit signed integer can hold, so 32-bit devices cannot handle the number. Wikipedia explains, "The number 2,147,483,647 is the maximum positive value for a 32-bit signed binary integer in computing. It is therefore the maximum value for variables declared as integers (e.g., as int) in many programming languages, and the maximum possible score, money, etc. for many video games. The appearance of the number often reflects an error, overflow condition, or missing value." So, people on pre-2013 iPhones and iPads who were trying to play or observe games were receiving a game ID higher than that number and erroring out.
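For the curious, here is a minimal sketch of the arithmetic in Python. It is an illustration only, not our app code: the function names are made up for this example, and it simply shows what happens when an ID one past the limit is squeezed into a 32-bit signed slot.

```python
import struct

# Largest value a 32-bit signed integer can hold: 2^31 - 1.
INT32_MAX = 2**31 - 1  # 2,147,483,647


def fits_in_int32(game_id: int) -> bool:
    """Return True if game_id can be stored in a 32-bit signed integer."""
    return -(2**31) <= game_id <= INT32_MAX


def wrap_to_int32(game_id: int) -> int:
    """Simulate a silent overflow: keep the low 32 bits and
    reinterpret them as a signed value (hypothetical helper)."""
    low_bits = game_id & 0xFFFFFFFF
    return struct.unpack("<i", struct.pack("<I", low_bits))[0]


last_safe_id = INT32_MAX         # still fits
first_bad_id = INT32_MAX + 1     # 2^31, one too many

print(fits_in_int32(last_safe_id))   # True
print(fits_in_int32(first_bad_id))   # False
print(wrap_to_int32(first_bad_id))   # -2147483648
```

Whether a real program wraps around to a negative number like this, crashes, or refuses to parse the value depends on the language and the code involved. Either way, the game ID stops making sense.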
All of this started happening late on Saturday, long before the Hacker News post. Early reports from members were difficult to understand. They were also impossible for us to reproduce since most of us are on newer 64-bit devices. (Doh!) We continued to get more and more reports over the next 24 hours. Unfortunately, late Sunday is the WORST time to get an issue like this because, hey, work starts MONDAY, right? It's hard to rouse people late at night in their last free moments of the weekend...
Anyway, super early Monday the iOS and Live Server teams were hard at work diagnosing the issue. After a short time...
Overflow, meaning, it couldn't handle the number!
We immediately set out to fix both the app and the server separately to see which fix would be faster. Both solutions were quickly cranked out, and then they had to go through rigorous QA to make sure that we didn't introduce new errors. After several hours of code, QA, and triple-checking, we submitted the fixed build to the Apple App Store along with an official "expedited request." To our great surprise, 1 hour and 59 minutes later we received...
Apple had approved the build and it started rolling out in record time. At that time, we decided NOT to update the server as that would require an extra restart, and the code that was introduced to address the bug was more of a hack than a proper fix (which needed to happen on the app, not server). Again, if Chess.com on your iPhone or iPad is still not working, please update to the latest version and make sure it is 3.6.14. Thank you and sorry!
The Hacker News story was later picked up by several other news outlets, and before we knew it, we were ignominiously mentioned on some major news sites.
Obviously this is an embarrassing bug for us. Most likely, the developer didn't expect two billion games when that code was written long ago. But on the other hand, two billion games!!!
It turns out we aren't the only ones to have such issues. Other "large numbers" failures include:
- A Rocket Ship: https://en.m.wikipedia.org/wiki/Cluster_(spacecraft)#Launch_failure
- Pac-Man: http://pacman.wikia.com/wiki/Map_256_Glitch
- YouTube: https://arstechnica.com/business/2014/12/gangnam-style-overflows-int_max-forces-youtube-to-go-64-bit/ (though it has been reported that this may have been a joke)
The moral of the story, I guess, is: When you have an idea, always plan BIG. That's why when I retire and open my donut shop, I'm going to have seating for THREE billion customers.
Thanks for your patience—and for doing your part to push us over two billion games. Wait, that means it's YOUR fault really, right?