2018.6.25
A fault occurred and has now been temporarily recovered.
1. Summary
Part of /gs/hs0 was inaccessible. Access has been temporarily restored, but performance may be degraded.
2. Period
From 13:39 to 13:54 on Jun. 25
3. Details
Around 13:39, a panic occurred on ossa2, which manages an OST of the Lustre file system (/gs/hs0), and as a result /gs/hs0 could not be accessed. Around 13:54, the OST was taken over by ossa3. /gs/hs0 is accessible at present.
It was probably caused by a temporary stall of file I/O to the Lustre file system during the period above.
The OST that is supposed to be managed by ossa0 is currently mounted on ossa1. For that reason, I/O bandwidth to /gs/hs0 may decline.
This fault is thought to be of the same kind as the faults that occurred on Jun. 15, May 24, and Jun. 4.
The cause of the failure is known and can be corrected, but we are carefully considering how to apply the fix because it would require a large-scale service stop.
Maintenance to restore the original configuration is scheduled for 6/27.