This morning we found that Dovecot on the arrow.mxrouting.net server was hitting its process_limit. Customers experienced this in different ways, and not every customer saw the same symptoms. Some customers may have seen one or both of the following, not necessarily at the same time:
- Webmail outage
- Slowness or errors in email clients
Every time Dovecot updates we face this issue temporarily, but we have a mitigation in place that automatically resolves it before customers notice. That mitigation failed to kick off this morning, so the problem continued until it was resolved manually. We’ve put a safeguard in place for now while we work to determine why the mitigation didn’t kick off.
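
For readers curious what detecting this condition can look like, here is a minimal sketch in Python. It is not the mitigation or safeguard described above, whose details are not covered here. The function names are hypothetical, and the log pattern is an assumption modeled on the warning Dovecot's master process typically writes when a service reaches its process_limit; the exact wording may vary between versions.

```python
import re

# Assumed pattern: modeled on the Dovecot master warning
# "service(NAME): process_limit (N) reached". Wording may differ by version.
LIMIT_RE = re.compile(
    r"service\((?P<service>[^)]+)\): process_limit \((?P<limit>\d+)\) reached"
)


def services_at_limit(log_lines):
    """Return {service_name: limit} for each service reported at process_limit."""
    hits = {}
    for line in log_lines:
        match = LIMIT_RE.search(line)
        if match:
            hits[match.group("service")] = int(match.group("limit"))
    return hits


def needs_intervention(log_lines):
    """Return True if any service was reported at its process_limit."""
    return bool(services_at_limit(log_lines))
```

A check like this could be fed recent lines from the mail log on a schedule. What to do when it returns True (alerting, restarting a service, raising the limit) is a separate operational decision that this sketch deliberately leaves out.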