Maintenance announcement for group disk /gs/hs1 (to be carried out on 4/25)

2018.4.24

A large number of Omni-Path network errors have been detected on one of the OSSs (ossb3) of the Lustre file system that makes up /gs/hs1. Although we have not observed any significant slowdown or other symptoms at present, we will detach ossb3 as described below for investigation and repair.

1. Date of implementation

Wednesday, April 25, 2018 (around 10:00-13:00) * The end time may be somewhat earlier or later.

2. Contents

Fail over all OSTs mounted by the faulty OSS (ossb3) to another healthy OSS.

3. Impact

If the data you want to access on /gs/hs1 is on one of the target OSTs, access will be delayed (by up to 30 minutes), but the Lustre file system will continue the I/O without timing out.

4. Future plans

Because the OSTs will be concentrated on the paired healthy OSS, performance may be degraded.
After investigation and recovery, we will post another announcement and then return to the original configuration (moving the OSTs back to their original OSS).

Glossary

OST (Object Storage Target): In the Lustre file system, a set of disks that actually stores the contents of files.

OSS (Object Storage Server): In the Lustre file system, a server that actually sends file contents to and receives them from the compute nodes.