View the list of LSF compute nodes
bhosts
# bhosts
HOST_NAME  STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
fat01      ok      -     0    0      0    0      0
fat02      ok      -     16   0      0    0      0      0
fat03      ok      -     16   0      0    0      0      0
fat04      ok      -     0    0      0    0      0
fat05      ok      -     16   0      0    0      0      0
fat06      ok      -     16   0      0    0      0      0
fat07      ok      -     16   0      0    0      0      0
fat08      ok      -     16   0      0    0      0      0
fat09      ok      -     16   0      0    0      0      0
fat10      ok      -     16   0      0    0      0      0
...
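When the host list is long, a short summary can be easier to read than the full table. The sketch below tallies hosts by their STATUS column with awk. The `sample_bhosts` function and its rows are hypothetical stand-ins so that the snippet runs without LSF; on a real cluster you would pipe the output of `bhosts` into `count_by_status` instead.

```shell
# Hypothetical sample standing in for real `bhosts` output.
sample_bhosts() {
cat <<'EOF'
HOST_NAME  STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
fat02      ok      -     16   0      0    0      0      0
fat03      ok      -     16   4      4    0      0      0
fat04      closed  -     16   16     16   0      0      0
EOF
}

# Count hosts per status: skip the header line, tally field 2.
count_by_status() {
    awk 'NR > 1 { n[$2]++ } END { for (s in n) print s, n[s] }' | sort
}

sample_bhosts | count_by_status
```

On a cluster, the equivalent call would be `bhosts | count_by_status`.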
View LSF queues
bqueues
To view the overall information for all queues:
# bqueues
QUEUE_NAME  PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN   SUSP
cpu               Open:Active  -    -     -     -     2072   0     2072  0
fat               Open:Active  -    -     -     -     0      0     0     0
gpu               Open:Active  -    -     -     -     288    0     288   0
mic               Open:Active  -    -     -     -     0      0     0     0
cpu-fat           Open:Active  -    -     -     -     0      0
To view information for a single queue:
# bqueues fat
QUEUE_NAME  PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN   SUSP
fat               Open:Active  -    -     -     -     0      0     0     0
View compute node load
lsload
To view the overall load:
# lsload
HOST_NAME  status  r15s  r1m  r15m  ut  pg   ls  it     tmp   swp  mem
node011    ok      0.0   0.3  0.4   0%  0.0  0   49024  193G  62G  61G
node039    ok      0.0   0.6  0.5   0%  0.0  0   49024  194G  62G  61G
node041    ok      0.0   0.0  0.0   0%  0.0  0   49024  194G  62G  61G
node050    ok      0.0   0.0  0.0   0%  0.0  0   49024  194G  62G  60G
node064    ok      0.0   0.7  0.6   0%  0.0  0   49024  194G  62G  61G
node077    ok      0.0   0.7  0.5   0%  0.0  0   49024  194G  62G  61G
...
To view the load of a single node:
# lsload node001
HOST_NAME  status  r15s  r1m  r15m  ut  pg   ls  it   tmp   swp  mem
node001    ok      0.3   0.1  0.1   1%  0.0  0   332  152G  62G  61G
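The load table can also be filtered by script, for example to find the least busy node. The sketch below picks the host with the lowest one-minute load average (the r1m column). The `sample_lsload` function is a stand-in that reuses three rows from the listing above so that the snippet runs without LSF, and `least_loaded` is a hypothetical helper name.

```shell
# Sample rows standing in for real `lsload` output.
sample_lsload() {
cat <<'EOF'
HOST_NAME  status  r15s  r1m  r15m  ut  pg   ls  it     tmp   swp  mem
node011    ok      0.0   0.3  0.4   0%  0.0  0   49024  193G  62G  61G
node041    ok      0.0   0.0  0.0   0%  0.0  0   49024  194G  62G  61G
node064    ok      0.0   0.7  0.6   0%  0.0  0   49024  194G  62G  61G
EOF
}

# Print the "ok" host with the lowest 1-minute load (r1m is field 4).
least_loaded() {
    awk 'NR > 1 && $2 == "ok" && (best == "" || $4 + 0 < min) {
             min = $4 + 0; best = $1
         }
         END { print best }'
}

sample_lsload | least_loaded
```

On a cluster, the equivalent call would be `lsload | least_loaded`.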
Submitting Jobs
Submit a job manually with bsub
LSF uses bsub to submit jobs. The general form of the bsub command is:
bsub -n Z -q QUEUENAME -i INPUTFILE -o OUTPUTFILE COMMAND
Where: Z is the number of processors (job slots) the job requires, and -q specifies the queue to which the job is submitted. If you omit the -q option, the system submits the job to the default queue. INPUTFILE is the name of the file from which the program reads its input, and OUTPUTFILE is the name of the file in which the job's standard output is saved after submission.
For serial jobs, COMMAND can simply be the name of your program. For example, to submit the serial program mytest via LSF:
bsub -n 1 -q q_default -o mytest.out ./mytest
For MPI parallel jobs, COMMAND takes the form -a mpich_gm mpirun.lsf PROG_NAME. For example, to submit the parallel program mytest via LSF and run it with 16 processes:
bsub -n 16 -q q_default -o mytest.out -a mpich_gm mpirun.lsf ./mytest
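If you submit many jobs that differ only in a few values, it can help to assemble the command line in one place. The following is a minimal sketch, assuming a hypothetical helper named `build_bsub`; it only prints the command so that you can inspect it before running anything, and the queue name q_default is the one used in the examples above.

```shell
# Print (do not run) a bsub command line.
# Arguments: slot count, queue, output file, then the command to run.
build_bsub() {
    nslots=$1; queue=$2; outfile=$3
    shift 3
    echo "bsub -n $nslots -q $queue -o $outfile $*"
}

build_bsub 1 q_default mytest.out ./mytest
build_bsub 16 q_default mytest.out -a mpich_gm mpirun.lsf ./mytest
```

Printing first is a deliberate choice: once the output looks right, the same line can be pasted into the shell to submit the job.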
Interactive batch submission
bsub can also start an interactive shell in which you submit several jobs with the same run parameters at once. For example, the following session:
# bsub
bsub> -n 16
bsub> -q q_default
bsub> -o output.txt
bsub> command1
bsub> command2
bsub> command3
This is equivalent to:
bsub -n 16 -q q_default -o output.txt command1
bsub -n 16 -q q_default -o output.txt command2
bsub -n 16 -q q_default -o output.txt command3
Writing LSF job control scripts
#BSUB -n 16
#BSUB -q q_default
#BSUB -o output.txt
#BSUB -a mpich_gm
mpirun.lsf ./mytest
bsub also accepts a job description from standard input, so you can write an LSF script to submit the job. A bsub script is simple to write. The code above is a complete example; saved under the name bsub.script, it is submitted to LSF through input redirection:
bsub < bsub.script
This is equivalent to:
bsub -n 16 -q q_default -o output.txt -a mpich_gm mpirun.lsf ./mytest
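A job script can also be generated rather than written by hand, which is convenient when only the slot count or program name changes between runs. The sketch below writes such a script with a here-document. The helper name `write_job_script` and the temporary file location are assumptions for illustration, and the submission step is guarded so that the snippet does nothing harmful on a machine without LSF.

```shell
# Write a job script.
# Arguments: script path, slot count, queue, program to run.
write_job_script() {
    cat > "$1" <<EOF
#BSUB -n $2
#BSUB -q $3
#BSUB -o output.txt
#BSUB -a mpich_gm
mpirun.lsf $4
EOF
}

script="${TMPDIR:-/tmp}/bsub.script"
write_job_script "$script" 16 q_default ./mytest
cat "$script"

# Submit only where the bsub command is actually available.
if command -v bsub >/dev/null 2>&1; then
    bsub < "$script"
fi
```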
A more complete LSF job control script
#BSUB -J hello_mpi
#BSUB -o job.out
#BSUB -e job.err
#BSUB -n 256

# Load the Intel compiler, MKL, and MPI environments.
source /lustre/utility/intel/composer_xe_2014.3.163/bin/compilervars.sh intel64
source /lustre/utility/intel/mkl/bin/intel64/mklvars_intel64.sh
source /lustre/utility/intel/impi/4.1.1.036/bin64/mpivars.sh

MPIRUN=`which mpirun`
EXE="./mpihello"
CURDIR=$PWD
cd $CURDIR

# Start from empty node files.
rm -f nodelist nodes >& /dev/null
touch nodelist
touch nodes

# LSB_MCPU_HOSTS holds "host count host count ..."; turn it into one
# host:count entry per loop iteration and add up the counts in NP.
NP=0
for host in `echo $LSB_MCPU_HOSTS | sed -e 's/ /:/g' | sed 's/:n/\nn/g'`
do
echo $host >> nodelist
echo $host | cut -d ":" -f1 >> nodes
nn=`echo $host | cut -d ":" -f2`
NP=`echo $NP+$nn | bc`
done
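The loop in the script depends on LSB_MCPU_HOSTS holding alternating host names and slot counts. The sketch below shows the same bookkeeping with plain shell word splitting instead of sed, which also works for host names that do not begin with "n". The value assigned to LSB_MCPU_HOSTS here is a hypothetical sample (inside a real job, LSF sets the variable itself), and `parse_mcpu_hosts` is an illustrative helper name.

```shell
# Hypothetical sample value; LSF sets this variable inside a real job.
LSB_MCPU_HOSTS="node011 16 node039 16 node041 8"

# Print one "host:slots" line per pair, then the total slot count.
parse_mcpu_hosts() {
    set -- $1            # split "host n host n ..." into positional args
    total=0
    while [ $# -ge 2 ]; do
        echo "$1:$2"
        total=$((total + $2))
        shift 2
    done
    echo "NP=$total"
}

parse_mcpu_hosts "$LSB_MCPU_HOSTS"
```

For the sample value, the last line printed is the total slot count, which plays the role of NP in the script above.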
Other job management operations
View job status
bjobs
To check the running status of a submitted job:
bjobs
To display the job status in wide format:
bjobs -w
To show all jobs:
bjobs -a
To show running jobs:
bjobs -r
To show pending jobs and the reason they are waiting:
bjobs -p
To show suspended jobs and the reason they were suspended:
bjobs -s
To show all information for the job JOBID:
bjobs -l JOBID
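The default bjobs listing is easy to post-process, for example to collect the IDs of jobs in a given state. The sketch below filters on the STAT column with awk. The `sample_bjobs` function and its rows (user, hosts, times) are hypothetical stand-ins so that the snippet runs without LSF, and `jobs_with_status` is an illustrative helper name.

```shell
# Hypothetical sample standing in for real `bjobs` output.
sample_bjobs() {
cat <<'EOF'
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
1001    alice   RUN   q_default  login01     node011     mytest     Jan  5 10:00
1002    alice   PEND  q_default  login01                 mytest2    Jan  5 10:05
1003    alice   RUN   q_default  login01     node039     mytest3    Jan  5 10:07
EOF
}

# Print the IDs of jobs whose STAT column (field 3) equals the argument.
jobs_with_status() {
    awk -v want="$1" 'NR > 1 && $3 == want { print $1 }'
}

sample_bjobs | jobs_with_status RUN
```

On a cluster, the equivalent call would be `bjobs | jobs_with_status PEND`, for instance.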
Terminate a job
bkill
To terminate a job that is no longer needed:
bkill
To terminate the job JOBID:
bkill JOBID
To remove the job JOBID from LSF directly, without waiting for the job's processes to end in the operating system:
bkill -r JOBID
Monitor job output
bpeek
While a job is running, bpeek displays its standard output so that you can monitor the run:
bpeek
To view the standard output of the job JOBID:
bpeek JOBID
Job history information
bhist
To show the history of your jobs:
bhist
To show the history of the job JOBID:
bhist JOBID