All systems operational

Billing Panel

100% uptime
Apr 2023: 100% uptime
May 2023: 100% uptime
Jun 2023: 100% uptime

DirectAdmin Panel

100% uptime
Apr 2023: 100% uptime
May 2023: 100% uptime
Jun 2023: 100% uptime

IMAP

100% uptime
Apr 2023: 100% uptime
May 2023: 100% uptime
Jun 2023: 100% uptime

SMTP

100% uptime
Apr 2023: 100% uptime
May 2023: 100% uptime
Jun 2023: 100% uptime

Webmail

100% uptime
Apr 2023: 100% uptime
May 2023: 100% uptime
Jun 2023: 100% uptime

Support

100% uptime
Apr 2023: 100% uptime
May 2023: 100% uptime
Jun 2023: 100% uptime

Notice history

Jun 2023

Banshee Server Issue
  • Resolved

    This incident has been resolved. Postmortem and email to customers will be a bit delayed while the details are worked out.

  • Identified

    We're still performing an rsync of the data, and we expect the next step to take much less time than this one. When this is over we'll detail in a postmortem report what went wrong and how we'll prevent it from happening to such a degree in the future. We'll also talk about compensation, which will be covered in an email to the relevant customers.

  • Identified

    The rsync of data to the new server is still ongoing. It's roughly 3/4 of the way done.

  • Identified

    The fastest path to resolution right now is for us to migrate the Banshee server to another system. This is unfortunately going to take a significant number of hours, and any attempt to tell you how many would be a lie; it simply can't be known right now. It'll take no less than the time it takes, and we'll make every effort to make that number as small as possible.

    The Banshee server is a legacy cPanel server, and this wasn't how we wanted to retire its hardware. There is in fact no hardware issue, but for some reason the OS is hosed beyond reasonable repair. All data is intact, and all inbound emails should be held and then delivered once the migration is complete.

  • Identified

    Now it just keeps booting to BIOS config and it won't do anything else. BIOS config is strangely lacking the options it should contain, options which might help to overcome this. So we're sending datacenter techs back out to see just what in the hell is going on. Everything should be fine soon; there's no reason to suspect that this will be a long-term outage.

  • Investigating

    The server goes offline almost immediately after coming back online. This server lacks an IPMI, so we are having a KVM attached. With that said, all disks appear fine when booted into rescue mode, so this does not appear to be a hardware issue. We just can't get any strong insight into it without a KVM.

  • Investigating

    The problem has reappeared. We are currently investigating this incident.

  • Resolved

    This incident has been resolved.

Apr 2023

Outage on wednesday.mxrouting.net server
  • Resolved

    This incident has been resolved.

  • Monitoring

    We are considering this "resolved." Everything is working except Crossbox, which we do not consider necessary for calling the server "working" in production. We're still monitoring for any unforeseen issues.

  • Identified

    We're seeing more users with functional service than users without, and we expect this entire thing to be over in less than an hour, increasingly so as each minute passes (as the rsync finalizes each user's home directory one by one, in alphabetical order). Crossbox (mail.mxlogin.com) will indeed not be functional at the time that we declare this "resolved."

  • Identified

    Things are moving very smoothly at this stage. We see users sending out mail from the server without issue, so at least that is working right now with no hiccups. Inbound mail is still being queued, to be delivered when user data is copied. Crossbox (mail.mxlogin.com) will likely not work for some time after we consider this "resolved" for reasons not particularly interesting to anyone, but it will be a work in progress.

  • Identified

    DNS has been moved to the new server while the migration is underway. This does not indicate everything is working, but we should begin accepting emails into our queue and delivering them to users soon. Outbound mail should start working almost immediately, and "some" users should find their accounts usable to some degree or another. It shouldn't be considered odd to find anything/everything not working until the event is declared resolved.

  • Identified

    Most of the required parts are in place, but we need more user data to sync before we can swap DNS over.

  • Identified

    Still waiting on DirectAdmin to build all of its services, as it now seems to do this in the background after it claims installation is done (strange?). In the meantime, the bulk of the data is being copied over so that we can hit the ground running, or as close to that as possible.

  • Identified

    The new OS is ready to begin the process. We'll see how long it takes to finalize. The RAID is doing a sync right now, so that might slow things down a bit.

  • Identified

    Hardware ready, OS installation in progress.

  • Identified

    The current server is ready for migration to the new system. The new system is a work in progress.

  • Identified

    Right now the work is split between two tasks. First is getting the new server online and ready to prep the OS for migration. Second is preparing the current production server for data transfer (booting into ISO, enabling networking, mounting disk).

  • Identified

    The issue with the operating system is beyond reasonable repair and seems to relate to the fact that we migrated an OS out of OVH's SoYouStart brand, which appears to have a unique image for Ubuntu 20. We are preparing a new server next to this one, which we will prep for deployment before we begin migrating everyone's data to it. Because the new server is physically close to the current one, we expect the data migration to move fairly quickly. We could be wrong on that.

  • Identified

    The operating system on the wednesday.mxrouting.net server is pretty well hosed, and we're working on a plan to get it back online. This isn't going to be a quick process.

  • Identified

    The server did not take well to being rebooted. We're looking at a kernel panic on boot right now. Best case scenario, it's a bad kernel upgrade. Worst case scenario, the file system is hosed. Current estimate for repair is anywhere from 3 minutes to 3 days, so it's probably best not to draw any conclusions yet.

  • Identified

    The server is being an ass and won't let us into it. It seems like maybe an attack against the SSH service. Quick reboot incoming.

Apr 2023 to Jun 2023