Main differences between TSUBAME 2.5 and TSUBAME 3.0 (node reservation)

This article describes the main differences between TSUBAME 2.5 and TSUBAME 3.0 with respect to node reservation.
Please refer to the "TSUBAME portal User's Guide" for how to reserve nodes, and to the "TSUBAME 3.0 User's Guide" for how to submit jobs to reserved nodes.

In addition, some settings and limit values may be updated depending on how the system is being used. We will announce any such changes, so please check the latest notices periodically.
 

Small reservations are easier to make than before

In the H queue of TSUBAME 2.5, reservations could only be made for 16 nodes or more, in units of one day, and were intended for large-scale execution. In TSUBAME 3.0, you can reserve one node or more in units of one hour, so in addition to large-scale execution, reservations can also be used for long-running execution and other purposes.

List of reservation-related limits in TSUBAME 3.0
Please check the "Various limit value list" for the current limits.

Maximum number of reserved nodes: 135 nodes (October-March), 270 nodes (April-September)
Reservation length: 1 hour to 96 hours (4 days) (October-March), 1 hour to 168 hours (7 days) (April-September)
Total reservation slots that one group can hold at once: 6480 node-hours (October-March), 12960 node-hours (April-September)
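
As a rough illustration, the limits listed above can be checked with a short Python sketch before making a reservation. The class and function names are hypothetical and are not part of any TSUBAME tool, and the sketch assumes that the limit set is chosen by the month in which the reservation starts, which this article does not state. The values are copied from the list above and may change.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReservationLimits:
    max_nodes: int        # maximum number of reserved nodes
    max_hours: int        # maximum reservation length in hours
    max_node_hours: int   # total node-hours one group can hold at once


# Values copied from the list above; check the current limits before relying on them.
LIMITS = {
    "oct_mar": ReservationLimits(max_nodes=135, max_hours=96, max_node_hours=6480),
    "apr_sep": ReservationLimits(max_nodes=270, max_hours=168, max_node_hours=12960),
}


def season_for_month(month: int) -> str:
    """Assumption: the limit set follows the month in which the reservation starts."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return "apr_sep" if 4 <= month <= 9 else "oct_mar"


def check_reservation(nodes: int, hours: int, start_month: int,
                      held_node_hours: int = 0) -> List[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    limits = LIMITS[season_for_month(start_month)]
    problems = []
    if not 1 <= nodes <= limits.max_nodes:
        problems.append(f"nodes must be between 1 and {limits.max_nodes}")
    if not 1 <= hours <= limits.max_hours:
        problems.append(f"hours must be between 1 and {limits.max_hours}")
    # held_node_hours is what the group already holds in other reservations.
    if held_node_hours + nodes * hours > limits.max_node_hours:
        problems.append(f"group total would exceed {limits.max_node_hours} node-hours")
    return problems
```

For example, 135 nodes for 48 hours is exactly 6480 node-hours, so under these assumptions it fits the October-March limits, while 135 nodes for 96 hours does not.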

About the time that a job can actually be executed

In TSUBAME 2.5, the reserved nodes could be occupied from 10:00 a.m. on the reservation start date until 9:00 a.m. on the reservation end date.
In TSUBAME 3.0, the nodes can be used from the reservation start time until 5 minutes before the reservation end time, and all jobs are stopped 5 minutes before the end time.
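
The TSUBAME 3.0 rule above can be sketched in Python to see how much run time a reservation actually provides. The function names are hypothetical, and the sketch only restates the 5-minute rule from this article; it does not query the scheduler.

```python
from datetime import datetime, timedelta
from typing import Tuple

# All jobs are stopped 5 minutes before the reservation end time.
STOP_MARGIN = timedelta(minutes=5)


def usable_window(start: datetime, end: datetime) -> Tuple[datetime, datetime]:
    """Return the period in which jobs can actually run in a reservation."""
    if end - start <= STOP_MARGIN:
        raise ValueError("reservation is too short to run any job")
    return start, end - STOP_MARGIN


def job_fits(start: datetime, end: datetime, runtime: timedelta) -> bool:
    """True if a job started at the reservation start time finishes before the stop."""
    usable_start, usable_end = usable_window(start, end)
    return runtime <= usable_end - usable_start
```

Under this rule, a 2-hour reservation leaves 1 hour 55 minutes for the job, so a job that needs the full 2 hours would be stopped before it finishes.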

On submitting jobs to reserved nodes

By adding "-ar reservation number" to the arguments of qsub, qrsh, etc., you can submit a job to the reserved nodes. (You can submit a job before the start time of the reservation slot.)
Please note that if you do not specify "-ar reservation number", the job will run outside the reservation slot and consume points.
Even if you are using a resource type other than f_node, please be aware that you cannot submit a job whose degree of parallelism exceeds the number of reserved nodes. For example, a 40-parallel h_node job cannot be executed when 20 nodes are reserved. However, two parallel jobs can run at the same time.
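
The two points above can be sketched in Python as a small helper for a script that prepares submission commands. Only the "-ar" option comes from this article; the function names, the script name job.sh, and the reservation number 12345 are hypothetical placeholders, and the sketch builds an argument list without running it.

```python
from typing import List


def with_reservation(argv: List[str], reservation_number: int) -> List[str]:
    """Return a copy of a qsub/qrsh argument list with "-ar <number>" added.

    The option is placed right after the command name so that it comes
    before the job script and its arguments.
    """
    if not argv:
        raise ValueError("argv must start with the command name")
    if "-ar" in argv:
        raise ValueError("a reservation number is already specified")
    return [argv[0], "-ar", str(reservation_number), *argv[1:]]


def fits_reserved_nodes(parallelism: int, reserved_nodes: int) -> bool:
    """A job's degree of parallelism must not exceed the number of reserved nodes,
    whatever the resource type is."""
    return 1 <= parallelism <= reserved_nodes
```

For instance, with_reservation(["qsub", "job.sh"], 12345) gives the argument list for "qsub -ar 12345 job.sh", and fits_reserved_nodes(40, 20) is false, matching the h_node example above.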

About SSH / direct login to reserved node

In TSUBAME 2.5, members of the TSUBAME group that made the reservation could log in to the reserved nodes via SSH and run calculations directly, without going through the scheduler.
In TSUBAME 3.0, SSH is available only to the user who submitted the job, and only for f_node jobs. To execute a program directly, submit an f_node job with the required number of nodes, or log in with qrsh.

Notes on reservations made just before the start time

In TSUBAME 2.5, the point consumption for reservations starting within one week was constant.
In TSUBAME 3.0, a reservation that starts within 24 hours consumes four times as many points as a reservation that starts more than 24 hours later (within 2 weeks).
This is to avoid affecting non-reservation jobs that have already been submitted.
In addition, since the nodes used for reservations are shared with the nodes used for other jobs, it is quite possible that a reservation starting within 24 hours cannot be secured, depending on the job execution status.
For large-scale execution, we recommend preparing in advance and reserving early.
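
The TSUBAME 3.0 pricing rule above can be sketched in Python as a relative multiplier. The function name is hypothetical, the actual point rates are not given in this article, and treating exactly 24 hours as "within 24 hours" is an assumption.

```python
from datetime import datetime, timedelta


def relative_point_rate(now: datetime, start: datetime) -> int:
    """Return the relative point consumption of a reservation made at `now`.

    1 means the rate for reservations starting more than 24 hours ahead
    (within 2 weeks); 4 means four times that rate.
    """
    lead_time = start - now
    if lead_time < timedelta(0):
        raise ValueError("the reservation start time is in the past")
    if lead_time > timedelta(weeks=2):
        # This article only describes reservations starting within 2 weeks.
        raise ValueError("lead time beyond 2 weeks is not covered by this sketch")
    # Assumption: a lead time of exactly 24 hours counts as "within 24 hours".
    return 4 if lead_time <= timedelta(hours=24) else 1
```

Under these assumptions, reserving at noon for a slot that starts the same evening costs four times as much as reserving the same slot two days ahead.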

Notes on canceling a reservation

In TSUBAME 2.5, the TSUBAME points consumed were fully returned when a reservation was deleted. In TSUBAME 3.0, only up to half of the TSUBAME points are returned when a reservation is canceled, except in the following cases:

  • Cancellation within 5 minutes after making the reservation
  • Cancellation for reasons that are not the user's responsibility, such as system maintenance
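
The cancellation rule above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the two excepted cases receive a full refund, which this article implies but does not state explicitly. Because the article says "up to half", the value returned for ordinary cancellations is an upper bound, not the exact amount.

```python
from datetime import datetime, timedelta

# Cancellations within this period after making the reservation are excepted.
GRACE_PERIOD = timedelta(minutes=5)


def max_refund_fraction(reserved_at: datetime, canceled_at: datetime,
                        user_responsible: bool = True) -> float:
    """Return the largest fraction of consumed points that can be returned."""
    if canceled_at < reserved_at:
        raise ValueError("cancellation time is before the reservation was made")
    if not user_responsible:
        # For example, cancellation caused by system maintenance.
        return 1.0  # assumption: excepted cases are refunded in full
    if canceled_at - reserved_at <= GRACE_PERIOD:
        return 1.0  # assumption: excepted cases are refunded in full
    return 0.5
```

So a reservation canceled 3 minutes after it was made falls under the exception, while one canceled an hour later gets back at most half of its points.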

Because compute nodes must be kept free for the reserved time, reserving nodes makes it harder for other jobs to run. Please check the reservation details carefully when reserving nodes, and reserve only as much as you need.