List of various limit values

(last update 2021.04.06)

 This page lists the various limit values in TSUBAME3. The values may be changed in response to factors such as a larger-than-expected backlog of batch jobs or a tight electricity supply-demand balance. Regardless of the upper limits, please cooperate in easing congestion when the system is crowded.

Job Execution

Item                                                                    weekdays(*1)    weekends(*2)
Number of jobs running simultaneously per user                          30 jobs         100 jobs
Number of slots running simultaneously per user (number of CPU cores)   2016 slots      4032 slots
Maximum parallelism per job                                             144(*3)         144
Maximum execution time per job                                          24 hours        24 hours

 *1: weekdays: Jobs that start between Sunday 9:00 and Friday 16:00.
 *2: weekends: Jobs that start between Friday 16:00 and Sunday 9:00. To simplify processing, public holidays are not taken into account.
 *3: Although a parallelism of 144 is permitted, at most 72 nodes can be used in parallel with f_node on weekdays because of the 2016-slot limit.
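
The footnotes above can be sketched in a few lines of Python. This is only an illustration, not part of the TSUBAME3 tooling: the function names are hypothetical, the boundaries (Friday 16:00 counted as weekend, Sunday 9:00 counted as weekday) are an assumption, and the figure of 28 slots per f_node is inferred from footnote *3 (2016 / 72 = 28).

```python
from datetime import datetime

# Slot limits per user, copied from the table above.
SLOT_LIMITS = {"weekdays": 2016, "weekends": 4032}

MAX_PARALLELISM = 144  # maximum parallelism per job (both periods)
F_NODE_SLOTS = 28      # assumption: slots per f_node, inferred from 2016 / 72


def period_for(start: datetime) -> str:
    """Classify a job start time as 'weekdays' or 'weekends'.

    Public holidays are ignored, as stated in footnote *2.
    Assumption: Friday 16:00 itself is 'weekends', Sunday 9:00 itself is 'weekdays'.
    """
    # Minutes elapsed since Monday 00:00 (weekday(): Monday=0 ... Sunday=6).
    minute_of_week = start.weekday() * 1440 + start.hour * 60 + start.minute
    friday_16 = 4 * 1440 + 16 * 60
    sunday_9 = 6 * 1440 + 9 * 60
    return "weekends" if friday_16 <= minute_of_week < sunday_9 else "weekdays"


def max_f_nodes(start: datetime) -> int:
    """Largest number of f_node instances one job could use at this start time."""
    slot_limit = SLOT_LIMITS[period_for(start)]
    return min(MAX_PARALLELISM, slot_limit // F_NODE_SLOTS)


if __name__ == "__main__":
    for t in (datetime(2021, 4, 6, 12, 0), datetime(2021, 4, 10, 12, 0)):
        print(t, period_for(t), max_f_nodes(t))
```

Under these assumptions the slot limit is the binding constraint on weekdays, while on weekends the parallelism limit of 144 is reached first.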

Reservation

Item                                                                    Current limit        Reference: value set in April-September
Number of nodes provided for reservation (whole system)                 135 nodes            270 nodes
Maximum length of one reservation                                       96 hours (4 days)    168 hours (7 days)
Total amount of reservations that one group can hold simultaneously     6480 node-hours      12960 node-hours

The restrictions described in "Job Execution" do not apply to "Reservation".
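
As a rough illustration of how the current reservation limits combine, the following Python sketch checks a planned reservation against them. The function check_reservation is hypothetical and does not exist on TSUBAME3, and it assumes that a reservation consumes nodes multiplied by hours in node-hours, which the table above does not state explicitly.

```python
from typing import List

# Current limits copied from the table above.
MAX_NODES = 135               # nodes provided for reservation (whole system)
MAX_HOURS = 96                # maximum length of one reservation
MAX_GROUP_NODE_HOURS = 6480   # total one group can hold simultaneously


def check_reservation(nodes: int, hours: int, held_node_hours: int = 0) -> List[str]:
    """Return the limits a planned reservation would exceed (empty list if none).

    held_node_hours is the amount the group already holds in other reservations.
    """
    problems = []
    if nodes > MAX_NODES:
        problems.append(f"{nodes} nodes exceeds the {MAX_NODES}-node limit")
    if hours > MAX_HOURS:
        problems.append(f"{hours} hours exceeds the {MAX_HOURS}-hour limit")
    # Assumption: one reservation consumes nodes * hours node-hours.
    total = held_node_hours + nodes * hours
    if total > MAX_GROUP_NODE_HOURS:
        problems.append(
            f"{total} node-hours exceeds the {MAX_GROUP_NODE_HOURS} node-hour limit"
        )
    return problems


if __name__ == "__main__":
    print(check_reservation(10, 24))    # small request
    print(check_reservation(135, 96))   # each value is within its own limit
```

Under the node-hours assumption, a request can stay within the node and time limits individually and still exceed the group total, so all three checks are needed.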

Login Nodes

As the login nodes (login, login0, login1) are shared by many users, please do not execute heavy workloads on them.
Please refrain from occupying the CPUs on the login nodes.

Interactive job queue

  • Assigned resources: 7 physical CPU cores, 60 GB memory, and 1 GPU; however, up to 7 users share the same resources.
    • If no suitable resources are left, job submission fails, as with the normal qrsh command.
    • Because memory contents may be swapped out to the SSD depending on congestion, performance may be significantly reduced.
  • Number of jobs that can be executed simultaneously per user: 1 job
  • Maximum usage time: 24 hours
  • The local scratch area (SSD) is reserved and shared, and the available capacity cannot be guaranteed.
  • Execution within a reservation and Docker container jobs are not available. Container jobs with Singularity are available.
  • This service is intended for programs with intermittent processor usage, such as debuggers, visualizers, and Jupyter Lab. Do not use it for programs that occupy the processors continuously.
    • Depending on the program execution status, jobs that significantly hinder the execution of other users' programs may be deleted.
  • These limitations may be revised without prior notice. Also, the service itself may be terminated or suspended without prior notice.