Downtime

Sluggish response times

Feb 27 at 08:08am MST
Affected services
mstdn.ca
elk.mstdn.ca
PostgreSQL

Updated
Mar 21 at 04:32pm MDT

Our second physical server arrived in Edmonton today. We will set it up and run a "burn-in" process to validate the hardware, and we expect to commission it for production next week.

Once it is commissioned, the replicated database will be moved to this server and the web service will be tested against it.

NEW: We're receiving reports of intermittent issues with Elk and will have them resolved over the weekend.

Updated
Mar 10 at 02:17am MDT

We ran a validation and found that we're still a few hundred thousand rows out of sync with the origin database.
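
As an illustration only, a check of this kind can be sketched as a per-table row count comparison. The function name and the sample counts below are hypothetical, and the actual validation method is not described in this update; in practice the counts would come from queries against each database.

```python
def rows_out_of_sync(origin_counts, replica_counts):
    """Return, per table, how many rows the replica is missing relative to the origin.

    Both arguments map table name -> row count. Tables that match are omitted.
    A table absent from the replica is treated as having zero rows.
    """
    lag = {}
    for table, origin_rows in origin_counts.items():
        missing = origin_rows - replica_counts.get(table, 0)
        if missing != 0:
            lag[table] = missing
    return lag


# Hypothetical counts, chosen only to show the shape of the result.
origin = {"statuses": 1_000_000, "accounts": 50_000}
replica = {"statuses": 700_000, "accounts": 50_000}

lag = rows_out_of_sync(origin, replica)
total_missing = sum(lag.values())
print(lag, total_missing)
```

Matching row counts are a necessary check rather than proof that the data is identical, so a comparison like this is best treated as a quick first signal.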

The database sync is still underway, pulling ~40 Mbps as it keeps up with new data and copies static data from the origin.

We'll continue to declare the database "degraded", but we haven't heard of many errors from our users.

Note that this synchronization may still take another week. We won't switch over until the entire database is secure, and that takes time.

Updated
Mar 09 at 01:09am MST

We are continuing to monitor the database migration. Databases of this size, nearly 400 GB, often take weeks to fully synchronize. We're taking extreme caution, as any data loss is completely unacceptable.

This shouldn't impact the instance; however, we remain "degraded" because of significant latency to the production cloud database.

Updated
Mar 02 at 10:15pm MST

Today we saw some errors and, out of an abundance of caution, have postponed the database switch for 24 hours.

We will post further updates as they become available.

Updated
Mar 02 at 04:09am MST

All metrics indicate that the database migration is in sync. We're going to watch this over the next day and, if the metrics continue to hold, we will attempt to switch over to the local database tomorrow night.

To prevent traffic spikes, we won't be sharing what time we will make this switchover, but it will be after our usual daily traffic peak. If we see abnormal traffic patterns tomorrow night, we will pause and wait 24 hours out of an abundance of caution.

Updated
Mar 01 at 04:10pm MST

UPDATE 4:10 PM MST: The database migration is still underway and is still replicating the statuses table.

We've found that some requests are not getting through because web server threads are tied up waiting for the database to respond. As a result, we've deployed additional web threads to handle the load.
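
The link between slow database responses and exhausted web threads can be sketched with Little's law: the average number of busy threads is roughly the request rate multiplied by the time each request holds a thread. The function name and the figures below are hypothetical and are not measurements from this incident.

```python
import math


def threads_needed(requests_per_second, seconds_per_request):
    """Estimate the average number of busy threads using Little's law.

    busy threads = arrival rate x time each request occupies a thread,
    rounded up to a whole thread.
    """
    return math.ceil(requests_per_second * seconds_per_request)


# Hypothetical load: the same request rate, before and after the database slows down.
normal = threads_needed(40, 0.25)   # requests finish quickly
degraded = threads_needed(40, 2.0)  # each request waits on the database
print(normal, degraded)
```

With the request rate unchanged, an eightfold increase in time per request needs eight times as many threads to keep up, which is why adding threads eases the symptom while the underlying latency remains.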

Updated
Feb 27 at 11:33pm MST

UPDATE 11:33 PM MST: Elk and third-party app functionality has been restored. Significant latency persists but should be resolved as the database continues to migrate to our local server.

Thanks for your patience and continued support.

Updated
Feb 27 at 09:08pm MST

UPDATE 9:08 PM MST: We have disabled third-party app access to the instance and noticed a marked improvement in instance latency. The web UI is responding faster than before. For now, please use the web UI to access the instance. We will update again when we know more.

Updated
Feb 27 at 07:00pm MST

UPDATE 7:00 PM MST: We believe the slow response times are caused by slow database queries. We are working to move the nearly 400 GB database so that it is local to the instance web server.

Created
Feb 27 at 08:08am MST

We are aware of slow response times with the instance and are working to restore normal activity.

Elk may be down during this time.

Thank you for your patience.