List of various limits

(last update 2024-01-22)

(Update on 2024-02-01: We lifted the temporary limitation that had been imposed because of the node reduction caused by cooling unit trouble)

 This page lists the various limits in TSUBAME3. They may be changed due to other factors, such as batch jobs backing up more than anticipated or a tight electricity supply-demand balance. Regardless of the upper limits, please cooperate in easing congestion when the system is crowded.

Job Execution

                                                                       weekdays (*1)  weekends (*2)
Number of jobs running simultaneously per user                         30 jobs        100 jobs
Number of slots running simultaneously per user (number of CPU cores)  2016 slots     4032 slots
Maximum parallelism per job                                            144 (*3)       144
Maximum execution time per job                                         24 hours       24 hours

 *1: weekdays: Jobs that start between Sunday 9:00 and Friday 16:00 each week.
 *2: weekends: Jobs that start between Friday 16:00 and Sunday 9:00. To simplify processing, holidays are not taken into account.
 *3: Although a parallelism of 144 is permitted, note that only 72 nodes can be used in parallel in the f_node case because of the 2016-slot limit.
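
The weekday/weekend windows and the f_node arithmetic in note *3 can be sketched in a few lines of Python. This is only an illustration, not the scheduler's actual logic. The function names are hypothetical, times are taken to be local system time, boundary instants are assumed to belong to the period that begins there, and the figure of 28 slots per f_node is an assumption inferred from 2016 slots / 72 nodes rather than something stated in this list.

```python
from datetime import datetime

# Per-user limits copied from the Job Execution table.
LIMITS = {
    "weekdays": {"jobs": 30, "slots": 2016},
    "weekends": {"jobs": 100, "slots": 4032},
}

MAX_PARALLELISM = 144  # maximum parallelism per job (both periods)

# Assumption: one f_node occupies 28 slots (CPU cores), inferred from
# 2016 slots / 72 nodes in note *3.
F_NODE_SLOTS = 28


def period_for(start: datetime) -> str:
    """Classify a job start time as 'weekdays' or 'weekends'.

    Assumption: Friday 16:00 itself counts as weekends and
    Sunday 9:00 itself counts as weekdays.
    """
    day = start.weekday()  # Monday=0 ... Friday=4, Saturday=5, Sunday=6
    clock = (start.hour, start.minute)
    if day == 4 and clock >= (16, 0):
        return "weekends"
    if day == 5:
        return "weekends"
    if day == 6 and clock < (9, 0):
        return "weekends"
    return "weekdays"


def max_f_nodes(start: datetime) -> int:
    """Largest f_node count one user can run for a job starting at `start`."""
    slots = LIMITS[period_for(start)]["slots"]
    return min(MAX_PARALLELISM, slots // F_NODE_SLOTS)


if __name__ == "__main__":
    monday_noon = datetime(2024, 1, 22, 12, 0)
    saturday_noon = datetime(2024, 1, 27, 12, 0)
    print(period_for(monday_noon), max_f_nodes(monday_noon))
    print(period_for(saturday_noon), max_f_nodes(saturday_noon))
```

Under these assumptions the weekday slot limit, not the parallelism limit, is what caps an f_node job at 72 nodes, while on weekends the 4032-slot limit allows the full 144.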

Node Reservation

                                                                 April-September     October-March (busy period)
Number of nodes available for reservation (whole system)         270 nodes           135 nodes
Maximum duration of a single reservation                         168 hours (7 days)  96 hours (4 days)
Total reservation amount that one group can hold simultaneously  12960 node hours    6480 node hours

The above "Job Execution" limitations do not apply to "Node Reservation".
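
As a rough aid for planning a reservation, the limits in the Node Reservation table can be checked with a small Python sketch. It assumes that node hours are simply nodes multiplied by hours and that the applicable column is chosen by the month in which the reservation starts. Neither point is spelled out in this list, and the function names are hypothetical.

```python
# Limits copied from the Node Reservation table.
RESERVATION_LIMITS = {
    "april_september": {"nodes": 270, "max_hours": 168, "group_node_hours": 12960},
    "october_march": {"nodes": 135, "max_hours": 96, "group_node_hours": 6480},
}


def season_for(month: int) -> str:
    """Pick the table column for a calendar month (1-12)."""
    return "april_september" if 4 <= month <= 9 else "october_march"


def reservation_fits(month: int, nodes: int, hours: int,
                     group_node_hours_held: int = 0) -> bool:
    """Return True if a reservation stays within all three limits.

    Assumption: node hours = nodes * hours, added to whatever the
    group already holds (group_node_hours_held).
    """
    limits = RESERVATION_LIMITS[season_for(month)]
    return (
        0 < nodes <= limits["nodes"]
        and 0 < hours <= limits["max_hours"]
        and group_node_hours_held + nodes * hours <= limits["group_node_hours"]
    )


if __name__ == "__main__":
    # 270 nodes for 48 hours uses exactly 12960 node hours in April-September.
    print(reservation_fits(5, 270, 48))
    # 100 nodes for 96 hours is 9600 node hours, over the busy-period limit.
    print(reservation_fits(11, 100, 96))
```

Read this way, the group limit in either period corresponds to reserving every available node for 48 hours.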

Login Nodes

As the login nodes (login, login0, login1) are shared by many users, please do not execute heavy workloads on them.
Please refrain from occupying the CPUs on the login nodes.

Interactive job queue

  • Assigned resources: 7 physical CPU cores, 60 GB memory, and 1 GPU; however, up to 7 users share the same resources.
    • If no suitable resource is left, job submission fails, as with the normal qrsh command.
    • Because memory contents are swapped out to the SSD depending on congestion, performance may be significantly reduced.
  • Number of jobs that can be executed simultaneously per user: 1 job
  • Maximum usage time: 24 hours
  • The local scratch area (SSD) is reserved and shared, and the available capacity cannot be guaranteed.
  • Execution by reservation and Docker container jobs are not available. Container jobs with Singularity are available.
  • This service is intended for programs with intermittent processor usage, such as debuggers, visualizers, and Jupyter Lab. Do not use it for programs that occupy processors continuously.
    • Depending on the program execution status, jobs that significantly hinder the execution of other users' programs may be deleted.
  • These limitations may be revised without prior notice. The service itself may also be terminated or suspended without prior notice.