Support experiencing degraded performance

Outage on wednesday.mxrouting.net server

Resolved
Operational
Started about 1 year ago. Lasted about 9 hours.

Affected

DirectAdmin Panel
IMAP
SMTP
Webmail
Updates
  • Resolved

    This incident has been resolved.

  • Monitoring

    We are considering this "resolved." Everything is working except Crossbox, which we do not consider necessary for calling the server "working" in production. We're still monitoring for any unforeseen issues.

  • Identified
    Update

    We're seeing more users with functional service than users without, and that share grows with each passing minute as the rsync finalizes each user's home directory one by one, in alphabetical order. We expect this entire thing to be over in less than an hour. Crossbox (mail.mxlogin.com) will not be functional at the time that we declare this "resolved."

  • Identified
    Update

    Things are moving very smoothly at this stage. We see users sending out mail from the server without issue, so at least that is working right now with no hiccups. Inbound mail is still being queued and will be delivered once user data is copied. Crossbox (mail.mxlogin.com) will likely not work for some time after we consider this "resolved," for reasons not particularly interesting to anyone, but it will be a work in progress.

  • Identified
    Update

    DNS has been moved to the new server while the migration is underway. This does not indicate that everything is working, but we should begin accepting emails into our queue and delivering them to users soon. Outbound mail should start working almost immediately, and "some" users should find their accounts usable to one degree or another. Don't be surprised to find anything or everything not working until the incident is declared resolved.

  • Identified
    Update

    Most of the required parts are in place, but we need more user data to sync before we can swap DNS over.

  • Identified
    Update

    Still waiting on DirectAdmin to build all of its services, as it now seems to do this in the background after it claims installation is done (strange?). In the meantime, the bulk of the data is being copied over so that we can hit the ground running, or as close to that as possible.

  • Identified
    Update

    The new OS is ready to begin the process; we'll see how long it takes to finalize. The RAID is doing a sync right now, so that might slow things down a bit.

  • Identified
    Update

    Hardware ready, OS installation in progress.

  • Identified
    Update

    The current server is ready for migration to the new system. The new system is still a work in progress.

  • Identified
    Update

    Right now the work is split between two tasks. First is getting the new server online and ready to prep the OS for migration. Second is preparing the current production server for data transfer (booting into ISO, enabling networking, mounting disk).

  • Identified
    Update

    The issue with the operating system is beyond reasonable repair and seems to relate to the fact that we migrated an OS out of OVH's SoYouStart brand, which appears to have a unique image for Ubuntu 20. We are preparing a new server next to this one, which we will prep for deployment before we begin migrating everyone's data. Because the new server is in close proximity, we expect the data migration to move fairly quickly. We could be wrong on that.

  • Identified
    Update

    The operating system on the wednesday.mxrouting.net server is pretty well hosed, and we're working on a plan to get it back online. This isn't going to be a quick process.

  • Identified
    Update

    The server did not take well to being rebooted. We're looking at a kernel panic on boot right now. Best case scenario, it's a bad kernel upgrade. Worst case scenario, the file system is hosed. Current estimate for repair is anywhere from 3 minutes to 3 days, so probably best not to draw any conclusions yet.

  • Identified

    The server is being an ass and won't let us into it. Seems like maybe an attack against the SSH service. Quick reboot incoming.