Shared Hosting Server Outage (Various Servers)
Incident Report for Digital Pacific
Postmortem

Between 22:14 and 22:49 AEST on 18/07/2020, we experienced a hardware fault on a firewall cluster, which resulted in a 35-minute outage of our shared and reseller hosting infrastructure.

Our engineers were alerted immediately and began troubleshooting, but it took some time to find the root cause.

The root cause was a partial XFP transceiver failure on the firewall node that was active at the time. The transceiver was left in a half-broken state that was not severe enough to trigger an automatic failover to the redundant node. Engineers performed a manual failover to the secondary firewall node, and traffic was restored.
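
To illustrate why a half-broken transceiver can slip past an automatic failover, here is a minimal Python sketch of a failover decision that looks at error rates as well as link state. It is an illustration only and not how our firewall cluster makes its decision: the LinkSample type, the should_fail_over function and both thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LinkSample:
    """One polling interval of interface counters (hypothetical structure)."""
    link_up: bool
    packets: int  # packets seen during the interval
    errors: int   # errored packets during the interval


def error_rate(sample: LinkSample) -> float:
    """Fraction of packets in the interval that were errored (0.0 if idle)."""
    if sample.packets == 0:
        return 0.0
    return sample.errors / sample.packets


def should_fail_over(
    samples: Sequence[LinkSample],
    max_error_rate: float = 0.01,  # assumed threshold, not a vendor default
    min_bad_samples: int = 3,      # assumed number of consecutive bad intervals
) -> bool:
    """Decide whether the active node should hand over to the standby node.

    A check that only looks at link_up misses a degraded link. This sketch
    also fails over when the most recent min_bad_samples intervals all
    exceed max_error_rate.
    """
    if not samples:
        return False
    if not samples[-1].link_up:
        return True  # hard failure: the case a link-state check already covers
    recent = samples[-min_bad_samples:]
    if len(recent) < min_bad_samples:
        return False  # not enough history to call the link degraded
    return all(error_rate(s) > max_error_rate for s in recent)
```

With these assumed thresholds, a link that stays up while dropping a large share of its packets for three consecutive intervals would trigger a failover, whereas a check based on link state alone would keep the faulty node active.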

The faulty XFP has been replaced and redundancy has been restored.

We will be reviewing our architecture and practices to see how we can prevent such an outage from occurring again.

We are sorry for any inconvenience caused.

Posted Jul 19, 2020 - 09:17 AEST

Resolved
This incident has been resolved.
Posted Jul 19, 2020 - 01:49 AEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jul 18, 2020 - 23:07 AEST
Investigating
We are currently investigating an outage affecting multiple shared hosting and reseller servers.
Posted Jul 18, 2020 - 22:37 AEST
This incident affected: Personal and Business Hosting and Reseller Hosting.