Message-Id: 201612150955
Time: 2016Q4
Affected: users of shared drives
Impact: temporary access problems
The shared Windows drives are in high demand among our customers and are therefore an important service offered by the GWDG. The shared drives, approximately 400 TB in total, have by now mostly been migrated into the StorNext environment. StorNext is a storage area network file system: data are stored on RAID systems at our computing centre and are accessible via multiple file servers in parallel. Data management is carried out by the so-called meta-data controllers. Although the environment is complex, it was built with redundancy in mind in order to increase availability.
Unfortunately, within the last few weeks we experienced several service failures:
- 24.11-25.11.2016: Service maintenance due to the StorNext upgrade.
- 28.11.2016: Unexpected failure caused by corrective work on the previous upgrade.
- 09.12.2016: Fatal failure of one of our RAID systems, which affected data accessibility and service performance.
In addition, the service failed at irregular intervals because network shares and access control rights on the file servers were not accessible, owing to network connectivity problems between the file servers and the meta-data controllers. We also noticed problems when storing Excel 2016 files.
The following measures have been taken so far:
- Network shares are now verified automatically at regular intervals.
- The shared drives of the faculties have been distributed across several file servers.
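For readers who want to see what an automatic share verification can look like in principle, here is a minimal sketch. It is not the check that the GWDG runs: the function names, the share paths and the reporting callback are all hypothetical, and a share is simply assumed to be "reachable" if its path can be opened as a directory.

```python
import os
import time
from typing import Callable, Iterable, List


def find_unreachable_shares(
    share_paths: Iterable[str],
    is_reachable: Callable[[str], bool] = os.path.isdir,
) -> List[str]:
    """Return the paths that currently fail the reachability test."""
    return [path for path in share_paths if not is_reachable(path)]


def monitor(
    share_paths: List[str],
    interval_seconds: float,
    rounds: int,
    report: Callable[[List[str]], None] = print,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Check all shares `rounds` times, waiting `interval_seconds` after each check.

    `report` is called with the list of unreachable shares whenever
    that list is not empty.
    """
    for _ in range(rounds):
        unreachable = find_unreachable_shares(share_paths)
        if unreachable:
            report(unreachable)
        sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical UNC paths, for illustration only.
    shares = [r"\\fileserver1\faculty-a", r"\\fileserver2\faculty-b"]
    monitor(shares, interval_seconds=300, rounds=3)
```

A directory test like this only shows that a share can be opened at all; it does not verify access control rights or performance, which would need separate checks.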
We are currently monitoring this service with great care and are prepared to take countermeasures at short notice. We apologize for the service failures and for the resulting inconvenience to all users of this service.