There is an error with the Proton backend. Our infrastructure team is currently investigating and we will report back when we have more information.
The issue has now been resolved. Some subsystems may still be impacted, so we are running thorough checks at this time. However, the root cause is understood, and the affected subsystems are expected to recover shortly if they haven't already. No data or emails were lost during this incident, which lasted approximately 19 minutes.
The cause was a MySQL instance that became unresponsive but continued to accept new connections. As a result, the API kept launching new processes that never received a response and never completed. Because many processes appeared unresponsive, the load balancers progressively marked API instances as offline until all instances had been automatically taken offline. Our infrastructure team recovered the system by properly killing the unresponsive MySQL instance so that it would also stop accepting new connections.
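The failure mode described above can be reproduced in miniature. The following is only an illustrative sketch, not Proton's code: a plain TCP listener stands in for the hung database, and the names `start_silent_server`, `can_connect`, and `query_with_deadline` are hypothetical. It shows why a connection-level check keeps passing while every request hangs unless the caller sets its own deadline.

```python
import socket
import threading


def start_silent_server():
    """Start a listener that accepts connections but never replies.

    This stands in for the hung database; it does not speak the MySQL protocol.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen()
    held = []  # keep accepted sockets open so clients see neither data nor EOF

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def can_connect(host, port, timeout_s):
    """Connection-level check: True if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def query_with_deadline(host, port, payload, timeout_s):
    """Send a request and wait for a reply, giving up after timeout_s seconds.

    Returns the reply bytes, or None if the backend fails or stays silent.
    """
    try:
        # The timeout passed here also applies to the later send and recv calls.
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(payload)
            return sock.recv(4096)
    except OSError:  # socket timeouts are a subclass of OSError
        return None
```

Against the silent server, `can_connect` reports success while `query_with_deadline` gives up after its deadline. Without that deadline, the `recv` call would block indefinitely, which matches the processes that never completed.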
An unresponsive MySQL instance that continues to accept new connections was an edge case that was not foreseen when designing our API. As a long-term fix, we will be modifying the ProtonAPI code to properly handle this particular case as well.
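The update does not say how the API will be changed. One common pattern for this class of problem is a circuit breaker, which stops starting new work against a backend after repeated timeouts. The sketch below is an assumption for illustration only, not the planned ProtonAPI fix; the `CircuitBreaker` class and its parameters are hypothetical.

```python
import time


class CircuitBreaker:
    """Fail fast after repeated backend timeouts instead of starting more doomed work."""

    def __init__(self, max_failures=3, cooldown_s=5.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (calls are allowed)

    def call(self, fn):
        """Run fn() unless the backend is currently marked unhealthy."""
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise RuntimeError("backend marked unhealthy; failing fast")
            # Cooldown elapsed: allow one trial call. The failure count is kept,
            # so a single further timeout re-opens the circuit immediately.
            self.opened_at = None
        try:
            result = fn()
        except TimeoutError:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the count
        return result
```

Here `fn` is expected to enforce its own deadline and raise `TimeoutError` when the backend stays silent. Once the threshold is reached, callers get an immediate error rather than another process that never completes.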