Billing Panel - Operational
DirectAdmin Panel - Operational
IMAP - Operational
SMTP - Operational
Webmail - Operational
Support - Operational
Notice history
Jan 2024
- Resolved
Reboot complete.
- Identified
We are currently rebooting the london.mxroute.com server to deal with what is technically a minor firewall issue, but is causing a fork bomb preventing us from working on the server to address a significant drop in performance.
- Resolved
This incident has been resolved.
- Update
The next 3 hours are within the time window that the data center has given us for "planned" (lie) maintenance.
- Update
The data center is calling this outage planned maintenance, not an outage like before. We should have been notified, they said. We weren't.
- Identified
We have a ticket open with the data center.
- Resolved
This incident has been resolved. For now. Let's see if we can migrate the server before the data center employee goes to sleep again.
- Update
It's unclear how to provide a meaningful response right now that isn't simply berating our Australian provider. The 171 customers on the aus.mxroute.com server are offline. There is no ETA. There is no response from anyone at the data center. We will be severing ties with that company once back online, if they can manage to keep us online long enough to do so.
There is no point in making any further updates until we have one to provide.
- Update
We’re reviewing our options for removing our services from Australia. Given the number of outages we’ve had in the region, combined with how difficult the region is to sell to anyone outside of it, there is simply no further justification for staying. When the Aus server is back online, we will discuss with our customers how we intend to move forward with their services in the region. Most likely this will be a migration out of the region.
- Resolved
This incident has been resolved.
- Investigating
Currently 171 customers on the aus.mxroute.com server are offline. We have no information as to why it is down or how long it will be, as Australia is far outside of our physical reach and the datacenter staff is not very vocal.
Dec 2023
- Resolved
The Lucy server is considered to have been restored. Here are some key points:
1. While we're working to recover what we can of 1 week of email that was lost from restoring this backup, we are considering the data gone. If asked for it, as of right now, our answer would be that the data is lost. If we have any success in this, it will not be quick.
2. We're not saying everything is working flawlessly, but we're not going to update the status page for individual spot fixes. The larger picture here is that the restore is done, and that everything else which needs to be done is per user and not something that needs to be fixed for the whole server collectively.
3. Any further issues with the server should be in a support ticket. Going through them may not be a quick process. Requests to restore the last week of data will be referred back to #1 above.
- Update
Things on my plate for the Lucy server today:
- Ensure the last of the restores finish
- Double check consistency of restores. At least 1 restore we know of finished with a bunch of missing email accounts. It may have been a one-time problem, I won't know until I dig further.
- See about getting back email forwarders that JB didn't restore
- Check if sieve filters restored properly (why shouldn't they have?)
- Update
We're on the tail end of backup restores. We've restored 2563 of 2862 accounts. Here are some quick bullet points that might save you time and questions:
At least 1 account was identified to have not properly restored. All of its email accounts were missing. How many are like that? I hope 1. But answering that question and making it right is one of the line items in front of me.
Customers are still reporting varying results with custom hostname SSL certs (mail/webmail/mailadmin subdomains). If you can fix this in DirectAdmin from your side, please do. Our attempts at fixing these while doing restores have resulted in temporary problems that overshadow this, so we need to avoid further touching that from root until later. If our IP is rate limited by LetsEncrypt, I don't think we can fix that right now (their form says not eligible for increase).
JB (software) doesn't appear to be restoring a single email forwarder. Not revisiting this until after restores finish.
I (Jarland) am experiencing the worst kind of empathy. My inability to give immediate satisfaction to anyone who is currently begging me to fix something is upsetting me. It's important that I interact with customers sparingly right now, I notice that overdoing it is impacting my performance on restore/repair.
Most of our users on the Lucy server are online and in wonderful condition. This isn't for your benefit as much as mine, sometimes I need to think about what is working as opposed to what isn't.
- Update
We are continuing to finish the restores, but most customers are online. A couple of notes you might have missed from previous updates:
This is the backup we restored to the previous Lucy server, the one that failed. Due to extremely heavy usage from remaining repair efforts and resellers kicking off backups, we did not get a chance to back up that server before it failed. Expect a week of email to be missing. We're trying to get it back, but we want you to expect it to be gone. Hope is not warranted in that effort.
A lot of users are reporting custom SSL hostname issues. If you can fix it yourself in DirectAdmin that's great, but we're going to stop trying to fix them because we can't keep forcing additional Apache config reloads while the restores are doing the same, it terrorizes all of the users that have services online right now.
- Update
Clear skies for most users on the lucy.mxrouting.net server. The remainder of restores are still ongoing, most issues preventing users from doing what they needed to do have been fixed.
- Update
Restores are going quite well and they will definitely be complete today. Today we'll fix the association between users and resellers, for users that do not appear connected to their resellers.
- Update
Backup restores are going great and picking up speed. Most custom webmail subdomain (non-Crossbox) SSL certs are restored for the accounts that have been restored.
- Update
Backup restores on lucy.mxrouting.net slowed down overnight due to an API failure in the restore of reseller accounts. The failure prevented an IP from being assigned to the reseller, which in turn caused user account restores to fail whenever they were attempted after their reseller had been restored. We've corrected this and we're plowing through the restore lists again. Restore speeds should be faster now because, even for the restores that failed, more than half of the job (the compression of their data into a backup archive) had already finished.
- Update
Inbound email has been re-enabled on Lucy.
- Update
Here's the current state of the lucy.mxrouting.net server:
We've restored 1,000 accounts as of this update. Most likely, all restores will be complete by the end of Monday (US/Central). Users can start using their accounts as they are restored. You can check if yours has been by searching your DA username here: https://gw.mxroute.com/lucy.php
Inbound email will open back up at 11:00PM tonight, US/Central time.
Your reseller DA account having been restored does not mean that your users have all been restored. We have backups broken up into 12 batches, each batch is restoring independently.
Because some users may be restored prior to their reseller being restored, it's possible that some of your sub-users may not appear in your list in DA even if they have been restored. We'll fix that after restores, doing it while we're restoring removes our ability to take fully informed bulk actions and risks mistakes. It shouldn't stop your users from using their service, if their DA account has been restored.
If your custom webmail/mail/mailadmin URLs do not have working SSL, you may need to reapply the steps for it. We corrected what caused this after restoring about 800 backups, and it's another situation where we don't want to take bulk action to fix it while restores are still happening, as it makes results unpredictable.
The backup being restored is the last JetBackup copy of Lucy prior to the previous outage. The last week of email will not be included. We do hope to recover that from the previous Lucy server, but would prefer that you consider the data lost. Let it be a happy surprise if we recover it, but we don't think we will. It sucks, bad.
The "new/old backup plan" we talked about after restoring Lucy was going to start its first round several hours after this outage. With so many users running their own backups, and with us still fixing some things, we didn't want to bring the server down by hammering the disks even harder when we were already at 25%+ iowait. Right as we were settling to a new normal and would have been able to start the backups, we got slapped down.
The good news is, if we have any worse luck none of us will care about this because it probably means nuclear war.
- Update
State of lucy.mxrouting.net server:
Still restoring backups.
Not accepting new inbound mail, but will be in a few hours. This is to ensure that email out there waiting to retry delivery to you has the best chance at being received properly.
You can see if your DirectAdmin user has been restored by inputting its username here: https://gw.mxroute.com/lucy.php
A reseller user having been restored doesn't mean their users were all restored yet. Users restored prior to the restore of their reseller may not be correctly linked, we'll fix that if it ends up being the case.
Working on custom webmail subdomain SSL.
- Update
Backup restores going strong. The speed of the backups cannot be calculated, don't take this report as an invitation to do the math, you will be wrong on it, but we're at 913 backups restored right now on the Lucy server.
Inbound email will be re-enabled tonight. Remember that you can check your DirectAdmin username here to see if you've been restored: https://gw.mxroute.com/lucy.php
If a reseller account has been restored, that doesn't mean that their sub-users have been restored. It's also possible that we need to re-parse the list of owned users when this is done to ensure that resellers see all of their users listed in DirectAdmin when they go looking for them, but that's not a problem to be addressed during this stage of the repair.
- Update
Crossbox is reinstalled on the Lucy server. Of the 2862 backups to restore, 797 have been restored. Still going. These backup restores are more complete than the restore on Lucy2. Also, we're referring to the servers now as these:
Lucy1 - Failed RAID controller, recovered OS, survived until chassis swap and then file system was hosed.
Lucy2 - Failed RAID, reason not 100% proven but suspected as 1 bad sync + 1 bad disk
Lucy3 - The one in production right now seeing backups restored to it, not accepting inbound email until backups finish restoring (to preserve inbound email sitting in retry queues)
If we require a Lucy4, we're retiring the name Lucy and apologizing to whatever god we angered.
- Update
The lucy.mxrouting.net server has a new IP for the moment, which is 94.130.135.140. If your DNS is using CNAME records to point to lucy.mxrouting.net, you DO NOT need to make ANY changes to your DNS. This is only for users who are well aware that they created A records for this, which is fine but not something we suggest or directly approve of.
While you may see lucy.mxrouting.net online, we absolutely cannot accept new email into this server until more accounts have been restored. Doing so would mean the loss of all inbound email from the last 24 hours for users that have not yet been restored. You can check if your account has been restored by typing your DirectAdmin username into this form: https://gw.mxroute.com/lucy.php (Note that your reseller username being restored is not an indication that your sub-users have been restored; each DA username is considered its own for this check).
As soon as we've restored the accounts, we'll start accepting inbound email.
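For anyone unsure which case applies to them, the CNAME versus A record distinction in this update can be sketched roughly as follows. This is only an illustration: the function name and its inputs are hypothetical, and it assumes you are checking a record in your own zone that was meant to point at the Lucy server.

```python
NEW_IP = "94.130.135.140"  # the new address given in this update


def needs_dns_change(record_type, value):
    """Decide whether one DNS record must be edited after the IP change.

    record_type and value are hypothetical inputs describing a single record
    that was meant to point at the Lucy server.
    """
    record_type = record_type.upper()
    value = value.rstrip(".").lower()
    if record_type == "CNAME":
        # A CNAME to the hostname follows whatever IP the hostname resolves to.
        return False
    if record_type == "A":
        # A hard-coded address is only still valid if it already equals the new IP.
        return value != NEW_IP
    raise ValueError("only CNAME and A records are considered in this sketch")
```

For example, `needs_dns_change("CNAME", "lucy.mxrouting.net")` is False, while an A record holding any address other than 94.130.135.140 returns True.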
- Update
We're increasing the number of simultaneous backup restores on the new Lucy server. You can check if your account has been restored yet by inputting your DirectAdmin username here: https://gw.mxroute.com/lucy.php
That check will come in handy a bit later, for now it's not very useful.
- Update
Backups are being restored on the server which was initially part of our Plan B from the last outage. This will mean a change of IP for the lucy.mxrouting.net server. An email will go out about that when it's time.
- Update
While it cannot be conclusively proven, we believe that this is what happened to the RAID10 array on the Lucy server:
- One of the drives took too long to sync, and was kicked out of the RAID.
- Another drive failed, leaving us with 2 drives in a RAID10 array.
We had not started monitoring the RAID or even started our new backup strategy with the server yet. We were still spot fixing issues from the previous restore, from customer tickets. While we are still working to see if we can recover any data from this server, we are moving forward with restoring a new server to the state that Lucy was previously restored to, after the most recent outage. That means that the server will be missing data which was written to the server after that last restore. Although we'd like to get that data back, we can't wait on that, we'll have to try to do that after.
Restoration is now in progress.
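To illustrate why losing two drives can be fatal for a four-drive RAID10 when losing one is not, here is a minimal sketch. It assumes the common layout of two mirrored pairs striped together; the disk names and the pairing are hypothetical and are not taken from the Lucy server.

```python
from itertools import combinations

# Assumed layout: a 4-disk RAID10 built as two mirrored pairs that are striped together.
# The disk names and pairing are hypothetical.
MIRROR_PAIRS = [("disk0", "disk1"), ("disk2", "disk3")]


def array_survives(failed):
    """Return True if every mirror pair still has at least one healthy disk."""
    failed = set(failed)
    return all(any(disk not in failed for disk in pair) for pair in MIRROR_PAIRS)


def fatal_two_disk_failures():
    """List every two-disk failure combination that takes the whole array down."""
    disks = [disk for pair in MIRROR_PAIRS for disk in pair]
    return [combo for combo in combinations(disks, 2) if not array_survives(combo)]
```

Under this assumed layout, 2 of the 6 possible two-disk combinations are fatal: the ones where both missing disks belong to the same mirror pair.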
- Update
Disk cloning as an attempt to make the RAID controller appreciate its drives like it's supposed to is the current path. There may be no reason to update here until that reaches another stage. These are 8TB disks we're cloning, just 2 of 4 right now.
- Update
We're cloning disks and replacing them as a troubleshooting step.
- Update
We're still watching the RAID array rebuild on the 4th disk.
- Update
Rebuilding RAID array before going any further. Won’t be any update for a bit while we let that run.
- Update
We are continuing to work on a fix for this incident.
- Update
We're still working on this.
- Update
Our hardware expert is working on the server, hope to have it back up in just a bit.
- Update
Reseating the drives wasn't enough to help, so looking into cables attached to the RAID controller. This may take a bit more time. It's surely nothing deeper than that, but we can't just send any old remote hands in to do that task without having the hardware master available. We're waiting on confirmation from our hardware master that he's taking control of that effort.
- Identified
The issue with the Lucy server resembles that of some of its disks being removed from the server. This is a hardware RAID10 configuration, there shouldn't be any problem here which is fatal. That a completely different server which shared the same name as this one experienced a storage related failure recently is a coincidence worth writing a book about, but there can be absolutely no way that the events are connected.
We're working on it, but we want to consult someone more senior on issues of this nature and be careful how we proceed, especially given what users on this server have recently been through.
- Investigating
We are currently investigating this incident. The server was found to be online but unresponsive by several key services. Permission errors were visible on the IPMI console, and a reboot led us to an interesting puzzle. We're working on this. There should be no relation to any previous events on this server.
Nov 2023
- Completed, December 05, 2023 at 1:35 AM
We are finally considering this to be fully resolved. There should be no abnormalities or parts that are not working as intended.
- Update (In progress), November 28, 2023 at 9:26 PM
Migrating Crossbox now for the friday.mxlogin.com server, as we prepare to close this maintenance today.
- Update (In progress), November 28, 2023 at 3:43 PM
Converting mdbox to maildir on accounts on the friday.mxlogin.com server to enable better troubleshooting for users still experiencing issues.
- Update (In progress), November 28, 2023 at 3:28 AM
The expectation is that all users on the friday.mxlogin.com server are in a relatively good spot, with a few exceptions:
Some custom mail/webmail subdomains lack SSL certificates. Why didn't these copy over from the JetBackup clone? Great question. Working on it. Most are fixed already. (Expected resolved as of Nov 28 12:00AM US/Central)
MySQL migration is taking much longer than expected but that just means no Crossbox until we're finished with that, and users on this server haven't had Crossbox for quite some time now so at least it's not a new development.
There is a small amount of email data on the old server that isn't on the new, and that's clearing up over time right now. We're not getting any more complaints about this, it's unclear if anything is missing that any users even notice. Users missing anything vital, please open a support ticket so we can rush the final sync of your account from the old server (as our last sync is around 1-2 days old depending on account position). (Expected almost entirely resolved as of Nov 28 12:00AM US/Central)
If you are having trouble with webmail, use webmail.mxroute.com.
If your cPanel password needs to be reset, the process is virtually identical to our DA servers: https://mxroutedocs.com/directadmin/resetpass/
Overall we've learned that migrating cPanel servers is no longer as simple as it used to be. It wasn't long ago that cPanel servers migrated simply, and our DirectAdmin servers were the ones that we had trouble migrating smoothly. The two seem to have switched positions on us, some of which is quite likely our fault (ex. some creative symlinking used to fix a previous issue). We're still plugging away at this and expect everything to be done soon.
- Update (In progress), November 27, 2023 at 8:08 PM
If no other webmail is functioning for users on the friday.mxlogin.com server during the finalization of this maintenance, please try webmail.mxroute.com. We are continuing to finalize this and expect to be done in a few minutes.
- Update (In progress), November 27, 2023 at 7:58 PM
DNS swap for friday.mxlogin.com has just occurred. Working to fix SSL and migrate Crossbox over to get it re-enabled. May be some MySQL errors in this process as well.
- Update (In progress), November 19, 2023 at 5:11 AM
The end of this maintenance is in sight, but it's still a little unclear how quickly we're moving toward that end. We'll know more as it progresses. Some things remain true at this stage:
Crossbox (mail.mxlogin.com) is offline for users of the friday.mxlogin.com server (ONLY that server). Use webmail.mxroute.com if needed.
Some users will see IMAP or SMTP errors intermittently which appear to resolve very quickly. The number of users experiencing this is a fraction of what it was when this maintenance began, and we'll be at zero soon enough.
As this isn't an "outage" we've been working to make sure the resolution isn't worse than the problem, and we're almost there.
- Update (In progress), November 16, 2023 at 4:42 PM
We're extending the maintenance period again, as we're still perfecting this fix. We continue to mitigate the issues. Only a handful of customers on the friday.mxlogin.com server should continue to see occasional IMAP errors at this time.
- Update (In progress), November 13, 2023 at 6:14 PM
The initial clone of the server has completed. Next we're going to sync that with the latest data from the primary, then switch DNS to the secondary, then run another sync to pick up anything left from the primary. No ETA.
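The sequence described here (sync, switch DNS, sync again) can be sketched roughly as below. This is a toy model under stated assumptions, not the tooling actually used: mailboxes are plain dictionaries of message IDs, and `sync`, `migrate`, and `deliveries_during_cutover` are hypothetical names introduced only for illustration.

```python
def sync(source, target):
    """Copy every message from source that target lacks; return how many were copied."""
    copied = 0
    for mailbox, messages in source.items():
        dest = target.setdefault(mailbox, set())
        new = messages - dest
        dest |= new  # in-place update of the set stored in target
        copied += len(new)
    return copied


def migrate(primary, secondary, deliveries_during_cutover):
    """Run the sync, DNS switch, final sync sequence on toy data."""
    # Step 1: bring the clone up to date with the latest data from the primary.
    first = sync(primary, secondary)
    # Step 2: switch DNS. Modelled here as mail that still lands on the primary
    # while resolvers catch up (hypothetical input).
    for mailbox, message in deliveries_during_cutover:
        primary.setdefault(mailbox, set()).add(message)
    # Step 3: a final sync picks up anything left on the primary.
    second = sync(primary, secondary)
    return first, second
```

The second sync exists because mail can keep arriving at the primary for a while after the DNS switch; without it, those messages would stay behind on the old server.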
- Update (In progress), November 09, 2023 at 10:00 AM
As this long maintenance cycle continues, there may be some improvements to mitigation. Let's restate the issues as we currently see them:
Crossbox (mail.mxlogin.com) is intentionally down and will remain down until the end of this maintenance. ONLY for users on the friday.mxlogin.com server. NOT users of ANY other server. You have many ways to check your email but if you want a quick recommendation, try webmail.mxroute.com.
IMAP errors will occur at random for the more highly active users on the server. The time to resolution after each error should now be reduced to almost instant.
Some users will see delayed inbound emails, though this delay should be almost unnoticeable as of right now.
The hope is that with changes to mitigation this morning, the issues will be even more minor as we continue to work toward the total resolution. This is still not an outage, and no outage of this server is planned. These are intermittent minor issues which may annoy a fair portion of users on the server until resolved.
- Update (In progress), November 07, 2023 at 5:30 PM
Everything is still going according to plan on the migration of the friday.mxlogin.com server. Due to extremely unique conditions there will continue to be intermittent IMAP and SMTP errors until this process is complete, and we'll keep trying to minimize them. Most customers will not notice any issues during this time, but our more active users will. There is no short term relief for those users. Crossbox (mail.mxlogin.com) will remain down for users on this server, users will be encouraged to use their own mail clients (as most already are), or use webmail.mxroute.com in the meantime.
If you are not certain whether or not this message impacts your service, it almost certainly does not.
- In progress, November 06, 2023 at 2:31 AM
This migration does not require downtime of the server, though it is being associated with downtime of one of our webmail options on the server (Crossbox - mail.mxlogin.com). You can continue to use other webmail options, as well as webmail.mxroute.com, to access your email. You can continue using all third party email clients without issue. There may be occasional errors from this server, as this is a symptom of why this maintenance is taking place.
- Planned, November 03, 2023 at 8:36 PM
Intermittent issues with IMAP and SMTP can only be resolved on the friday.mxlogin.com server by migration to new hardware. It sounds strange, but it's true. We're in the process of doing this in production with the aim of not causing any downtime. It is expected that this could take the entire weekend as a result. Crossbox (mail.mxlogin.com) may be inaccessible for much, if not all, of this process for users on this server.