Website Growing Pains - An open discussion as to why we've had so many issues with our website recently.

Posted in Announcements
Hello everyone,

Today we're going to go over a few major issues that occurred in the last few weeks. This topic will be fairly short and just detail the issues and how we resolved them.

In all cases, all account data has remained safe and secure. This only affected the availability of our website.

15 July 2018 at around 8:40 AM (UTC-4)
For roughly two hours, users were greeted by our website's version of a blue screen of death: https://i.imgur.com/8GhYbWQ.png

This was completely unexpected, and we only became aware of it once we woke up that morning to dozens of notifications about our service being down. After investigating, we found that our database had unexpectedly run out of memory. After researching why this had happened, we concluded that we needed to add more memory to the server.

This issue made our website, multiplayer, the update checker, documentation and Cards Against Lucas unavailable. We're extremely sorry about this.

In response to this issue:
  • We introduced a new process on our servers to automatically restart our database software if anything goes wrong. It also publicly logs the event and notifies us when this happens, so we can become aware of it and investigate sooner.
  • We increased the amount of memory our server is able to use.
  • We fixed our status page to handle the entire database being down.
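
For readers curious what an automatic restart process like this can look like, here is a minimal sketch. It is not our actual code: the function name `check_and_restart` and the three callables passed to it (`is_healthy`, `restart`, `notify`) are hypothetical placeholders for whatever health check, restart command and notification channel a server really uses.

```python
from datetime import datetime, timezone
from typing import Callable


def check_and_restart(
    is_healthy: Callable[[], bool],
    restart: Callable[[], None],
    notify: Callable[[str], None],
) -> bool:
    """Run one watchdog pass. Returns True if a restart was triggered.

    All three callables are assumptions for illustration:
    - is_healthy: returns True when the database answers a simple query
    - restart:    restarts the database software
    - notify:     logs the event and alerts the team
    """
    if is_healthy():
        return False

    # Log and notify before restarting so the failure is recorded
    # even if the restart itself goes wrong.
    timestamp = datetime.now(timezone.utc).isoformat()
    notify(f"{timestamp} database health check failed; restarting")
    restart()
    return True
```

A scheduler or a simple loop with a sleep between passes would call this function every minute or so. Keeping the health check, restart and notification as separate pieces makes each one easy to swap out or test on its own.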



18 July 2018 at around 4:39 PM (UTC-4)
I'd like to explain why our forum was unavailable from 4:39 PM (UTC-4) on 18 July 2018 to 1:05 AM (UTC-4) on 19 July 2018. The downtime was unplanned, but we took the forum offline on purpose after a new user discovered a problem. The last 30 or so members to register had trouble confirming their accounts, which was later brought to our attention.

The issue we discovered was concerning enough that we felt the need to take down the website temporarily to work on it. The issue also involved a third-party provider, so we had to work with them to solve it.

Affected users can request a new code using the verification page linked at the top of the page.

In response to this issue:
  • We worked with our third-party provider to identify and resolve the issue. This took longer than anticipated because of misinformation provided by them. In the end we were able to bring the site back up.
  • We fixed an issue with our website where users weren't able to modify their account settings before they were verified.
  • We added safety checks to our code.
  • We now prevent users from requesting another confirmation email if they have already requested one within the last 2 hours.
  • We're now discussing how to handle users who have not confirmed their accounts after a certain amount of time.
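
The 2-hour limit on confirmation emails is a simple cooldown check. The sketch below shows the idea only and is not the site's actual code: the names `RESEND_COOLDOWN` and `can_request_confirmation` are hypothetical, and it assumes the time of the last confirmation email is stored per account.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed cooldown, taken from the 2-hour limit described above.
RESEND_COOLDOWN = timedelta(hours=2)


def can_request_confirmation(last_sent: Optional[datetime], now: datetime) -> bool:
    """Return True if a new confirmation email may be sent.

    last_sent is the time of the previous confirmation email,
    or None if none has been sent yet (hypothetical storage).
    """
    if last_sent is None:
        return True
    # Allow a new request only once the full cooldown has elapsed.
    return now - last_sent >= RESEND_COOLDOWN
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the check easy to test with fixed timestamps.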



Our current website is built on code we created between 5 and 8 years ago, which is why we've had difficulty supporting the thousands of people who use this site every week. We're working on creating a new website; however, supporting both websites simultaneously has proven difficult, which is why it's taking longer than anticipated.

We know we need to do better and try to keep our service stable, but we hope you understand why we've had these downtimes.

Sincerely,
Jake Andreoli
I hope it fixed the website.
I feel that as much as there have been issues with the website, the way you've been handling them is awesome! That's especially true considering you're basically a one-man army, because you're the only one who is scripting this website. Even going as far as trying to resolve the issues during downtime at work shows dedication that is very respectable. Also keep in mind you've been working on this (as far as I know) since 2013. The site's growing pains are to be expected. But in the end it's not about how many issues the site has, it's how many times you're able to fix them. Now that I have my own job and life, I fully understand how difficult it can be for you to juggle scripting the website and everything else related to it. Donut Team are regular guys doing this work in the little spare time they have, so in all honesty just having the website updated regularly is awesome! All in all it's not something to freak out over, just stuff that will be taken care of when time permits.

Just giving my two cents,

- Thomas