About the specifications of the batch job scheduler

TSUBAME 3 uses a batch job scheduler.

Resource Type

 

The following six resource types are available. Specify the resource type with the "-l" option. (The "-pe" and "-q" options are not available.)

     Resource type name   No. of CPU cores   Memory (GB)   No. of GPUs
F    f_node               28                 235           4
H    h_node               14                 120           2
Q    q_node               7                  60            1
C1   s_core               1                  7.5           0
C4   q_core               4                  30            0
G1   s_gpu                2                  15            1
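To illustrate how the table translates into a request, the following sketch multiplies the per-unit values by a requested count, for example what "-l h_node=2" would add up to. The function name resource_totals is hypothetical, and the assumption that resources scale linearly with the count is made only for this illustration.

```shell
#!/bin/sh
# resource_totals NAME COUNT
# Prints the total CPU cores, memory (GB) and GPUs for COUNT units of the
# resource type NAME, using the per-unit values from the table above.
# Hypothetical helper; assumes resources scale linearly with COUNT.
resource_totals() {
    name=$1
    count=$2
    case $name in
        f_node) cores=28; mem=235; gpus=4 ;;
        h_node) cores=14; mem=120; gpus=2 ;;
        q_node) cores=7;  mem=60;  gpus=1 ;;
        s_core) cores=1;  mem=7.5; gpus=0 ;;
        q_core) cores=4;  mem=30;  gpus=0 ;;
        s_gpu)  cores=2;  mem=15;  gpus=1 ;;
        *) echo "unknown resource type: $name" >&2; return 1 ;;
    esac
    # awk is used because the memory value of s_core is fractional
    awk -v n="$count" -v c="$cores" -v m="$mem" -v g="$gpus" \
        'BEGIN { printf "%d cores, %g GB, %d GPUs\n", n*c, n*m, n*g }'
}
```

For example, "resource_totals h_node 2" prints "28 cores, 240 GB, 4 GPUs".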


Job submission method

Jobs can be submitted from the login node with the following commands.
 - Submitting a job script (a user belonging to GSICGROUP submits train.sh)

qsub -g GSICGROUP train.sh

 

 - Running an interactive job (a user belonging to GSICGROUP uses s_core in an X environment for 2 hours)

qrsh -g GSICGROUP -l s_core=1,h_rt=2:: -pty yes -display $DISPLAY -v TERM /bin/bash


For details on submitting jobs, such as how to specify the resource type in a job script, please refer to the User's Guide.
User's Guide 5.2. Job submission
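As a rough sketch of what a job script with a resource type might look like, the helper below prints a minimal script that uses "#$ -l" lines in the same style as the h_rt example on this page. The helper name print_job_script and the echo payload are hypothetical, and "-cwd" is a common Grid Engine option assumed here; the User's Guide remains the authoritative description of the required options.

```shell
#!/bin/sh
# print_job_script RESOURCE_TYPE COUNT H_RT
# Writes a minimal job script to stdout.
# Hypothetical helper for illustration only.
print_job_script() {
    cat <<EOF
#!/bin/sh
#\$ -cwd
#\$ -l $1=$2
#\$ -l h_rt=$3
echo "job running on \$(hostname)"
EOF
}

# Example: write a script requesting one s_core for 10 minutes
# print_job_script s_core 1 0:10:00 > train.sh
```

The generated file could then be submitted with "qsub -g GSICGROUP train.sh" as shown above.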

Also, please check the related FAQ below for items not explained here.
Related FAQ
How to use scratch area
About submission method of dependent job
How to transfer X with qrsh
 

About job limits

Please check the "Various limit value list" for the current limits.
If a submitted job exceeds the per-user limit, it is kept in the waiting state "qw" even if there are enough idle nodes in TSUBAME3.
Once other jobs finish and the job fits within the per-user limit, it moves to the running state "r", provided there are enough idle nodes.
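To see how many of your jobs are in each state, the listing printed by qstat can be summarized. The sketch below assumes the usual Grid Engine listing layout, with two header lines and the state in the 5th column; please confirm this against the actual output on TSUBAME3 before relying on it. The function name count_job_states is hypothetical.

```shell
#!/bin/sh
# count_job_states: reads a qstat-style listing on stdin and prints how many
# jobs are waiting ("qw") and how many are running ("r").
# Assumptions: the first two lines are headers and the 5th
# whitespace-separated column holds the job state.
count_job_states() {
    awk 'NR > 2 && NF >= 5 {
             if ($5 == "qw") waiting++
             else if ($5 == "r") running++
         }
         END { printf "qw=%d r=%d\n", waiting, running }'
}

# Example usage on the login node:
# qstat | count_job_states
```

Jobs in other states are ignored by this sketch, so the two counts may not add up to the total number of listed jobs.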

 

About reservation

Reservations can be made in units of one hour, and the reserved nodes can be used until 5 minutes before the reservation end time.
To run a job in a reservation, submit it with the following command. The AR ID can be confirmed on the portal.

$ qsub -g GSICGROUP -ar ARID YOURSCRIPTFILENAME

Because the nodes can be used only until 5 minutes before the reservation end time, adjust the h_rt value given with the -l option in the job script accordingly.
Example: resource specification when the reservation period is 2 days

#$ -l h_rt=47:55:00
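The value above is simply the reservation length minus 5 minutes (2 days = 48 hours, so 47:55:00). A small sketch of this arithmetic for other reservation lengths follows; the function name reservation_h_rt is hypothetical, and it assumes the job starts at the beginning of the reservation.

```shell
#!/bin/sh
# reservation_h_rt HOURS
# Prints an h_rt value (H:MM:SS) that ends 5 minutes before a reservation
# of HOURS hours would end. Hypothetical helper for illustration.
reservation_h_rt() {
    hours=$1
    total_min=$(( hours * 60 - 5 ))
    printf '%d:%02d:00\n' $(( total_min / 60 )) $(( total_min % 60 ))
}

# Example: a 2-day (48-hour) reservation
# reservation_h_rt 48
```

For a 48-hour reservation this prints 47:55:00, matching the example above.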

"Reservation" does not apply to the above "Job limits", and has the "Reservation" restriction.
Please check "Various limit value list" about the current limit.

Please check the related FAQ below for how to deal with errors.
Related FAQ
"qsub: Unknown option" error occurs when submitting the job, but I do not know which option is bad
The job status is "Eqw" and it is not executed.
The error when executing the qrsh command
Check the details of an error message printed in the log file