Today users experienced problems connecting over IMAP on the fusion.mxrouting.net server. On initial observation everything looked fine: no clear errors were being written anywhere, and the services were all online and responsive. After receiving a few reports, we restarted Dovecot to mitigate the issue while we started the investigation. Users immediately reported clear skies, but the digging had to begin.
We found that the inotify max_user_instances limit is far too low on AlmaLinux 9 for this kind of server environment. Since that wasn't an issue we had faced on previous servers, it wasn't one we were looking for, and it wasn't going to reveal itself until the server had been in production for a while. It also caused Redis to fork bomb the server a bit, leaving over 5600 Redis instances open, almost all of which were opened for only one user. We've increased the limit and now expect zero issues.
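For anyone wanting to check for the same condition, here is a rough sketch of how inotify usage per user could be compared against the limit on a Linux host. It is not the tooling we used during this incident. The function names and the `proc_root` parameter are made up for illustration, and it assumes the usual Linux /proc layout, where an inotify file descriptor shows up as a symlink to `anon_inode:inotify`.

```python
import os
from collections import Counter


def read_limit(proc_root="/proc"):
    """Return the current fs.inotify.max_user_instances value."""
    path = os.path.join(proc_root, "sys/fs/inotify/max_user_instances")
    with open(path) as f:
        return int(f.read().strip())


def inotify_instances_by_uid(proc_root="/proc"):
    """Count open inotify file descriptors, grouped by process owner uid.

    Processes that exit mid-scan, or that we lack permission to inspect,
    are skipped silently, so the counts are a lower bound.
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        pid_dir = os.path.join(proc_root, entry)
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            uid = os.stat(pid_dir).st_uid
            fds = os.listdir(fd_dir)
        except OSError:
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target == "anon_inode:inotify":
                counts[uid] += 1
    return counts
```

Run as root, `inotify_instances_by_uid().most_common(5)` lists the heaviest users, and any count close to `read_limit()` suggests that user is about to hit the ceiling. The limit itself is the `fs.inotify.max_user_instances` sysctl. The right value depends on the workload, so none is suggested here.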