Failure: Network outages during maintenance (software upgrades of routers for GÖNET and the data center on 06.10.2016, 5:00 pm)

Message-Id: 201610061606 
Time: 06.10.2016, 6:08 – 6:13 pm
Affected: Data center router rz-sub1 and data center networks
Impact: Disruption of all connections to and within the data center networks during the above time.

Time: 06.10.2016, 7:13 – 7:20 pm
Affected: GÖNET router xr-physio1 and networks connected to this router
Impact: Disruption of all connections passing through this router (buildings south of the University Medical Center and east of Goßlerstraße, and the Sportinstitut) during the above time.

Network outages occurred during the upgrades of the GÖNET and data center routers. The upgrades had been announced at short notice in the afternoon and were necessary to install security patches.

Background:

Contrary to the router vendor's assurance that the upgrades would be non-disruptive (which we had also confirmed on a test system), network outages occurred during the upgrade. Only two routers were affected because the remaining upgrades were cancelled after this experience.

The upgrades on routers xr-physio1 and rz-sub1 caused a reboot of all “linecards” (the modules that provide the actual ports to the network). Because of these reboots, both routers were effectively unavailable from a user's perspective for 5 to 10 minutes.

In addition, and for reasons that are not yet clear, the failure of rz-sub1 led to disruptions at the second data center router (rz-gwdg1). In theory, router and switch redundancy should allow the data center network to continue operating without any disruption when rz-sub1 is unavailable. However, rz-gwdg1 responded in unexpected ways and disabled all connections to the data center networks on its side.

We have already contacted our service partner to investigate the causes and to clarify the next steps for upgrading the remaining routers.

During the network outage, and in some cases even afterwards, data center services were not available.

We apologize for any inconvenience.