Emergency router maintenance

Minor Incident Gen4 Gen3 Public Cloud
14 days, 14 hours, 18 minutes

Update

Resolved

We have resolved this issue.

The root cause was a larger-than-anticipated incoming BGP route table, which caused the router in question to run out of physical memory and start swapping. We have applied filtering to prevent this from happening again.
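For readers unfamiliar with this kind of filtering, the sketch below shows the general idea in Python: discard overly specific routes and cap how many routes are accepted, so the table cannot grow without bound. It is not the configuration applied to our router. The function name `filter_routes` and both policy values are assumptions chosen only for illustration.

```python
import ipaddress

# Hypothetical policy values, chosen only for illustration.
MAX_PREFIX_LEN_V4 = 24   # drop IPv4 routes more specific than /24
MAX_ACCEPTED_ROUTES = 5  # stand-in for a per-session route limit


def filter_routes(prefixes, max_len=MAX_PREFIX_LEN_V4, limit=MAX_ACCEPTED_ROUTES):
    """Return the prefixes this illustrative inbound filter would accept.

    Routes more specific than max_len are dropped, and acceptance stops
    once `limit` routes have been accepted.
    """
    accepted = []
    for text in prefixes:
        network = ipaddress.ip_network(text, strict=True)
        if network.prefixlen > max_len:
            continue  # too specific: filtered out
        if len(accepted) >= limit:
            break  # route budget reached: accept nothing further
        accepted.append(network)
    return accepted


received = ["10.0.0.0/8", "192.0.2.0/24", "192.0.2.128/25"]
for network in filter_routes(received):
    print(network)
```

In this sketch the /25 is discarded because it is more specific than the assumed /24 cut-off. Real routers usually express the same policy with prefix lists and a maximum-prefix setting rather than code like this.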

Feb 7th, 2020 14:32 UTC
Investigating

Between 03:00 and 03:40 we had two periods of around 10 minutes each during which our network was mostly unavailable.

At around 03:00 we had a major issue with this device and it stopped functioning correctly. We have put a workaround in place, and the last alert we saw for any customer environments relating to this was around 03:40.

We will need to investigate further, both internally and with the manufacturer of the hardware, to understand exactly what went wrong. We hope to complete this in the coming working days. This post will be updated with more information as and when we have it.

We’ve been monitoring customer impact throughout to ensure that any customer environments impacted by this event have recovered as they should. However, if you are aware of an ongoing issue, please call us on 01252 560565 or email support@wirehive.com.

My sincere apologies for this event and any impact it may have caused you.

Simon Green, CTO

Jan 24th, 2020 04:23 UTC
Issue

We have seen some anomalous behaviour and unexpected log messages from one of our core routers and will be rebooting the device in order to clear them. We do not expect this to impact customers, but this should be considered an at-risk period.

Jan 24th, 2020 00:53 UTC