This shows you the differences between two versions of the page.
biac:cluster:submit [2020/02/11 17:45] cmp12 [Job restrictions] |
biac:cluster:submit [2023/02/23 18:43] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Job submission ====== | ||
- | Jobs are submitted to the cluster with **qsub**. The most basic usage of qsub has the job script as its only argument. To submit a job, | ||
- | > qsub script.sh | ||
- | |||
- | Submitted jobs are identified by a job id (a unique id assigned by the cluster) and job name (defaults to the name of the script). The job id cannot be changed by the user. Users can set the job name to help distinguish multiple instances of the same script. | ||
- | |||
- | > qsub -N run_script.1 script.sh | ||
- | > qsub -N run_script.2 script.sh | ||
- | |||
- | Scripts can also accept arguments from the command line during submission. They can be accessed within the script as $1,$2 ... $n (n - number of arguments). To submit a job with arguments | ||
- | |||
- | > qsub -N run_script_all | ||
- | |||
- | To parallelize processing of the above script, | ||
- | |||
- | > qsub -N run_script.1 script.sh run1 | ||
- | > qsub -N run_script.2 script.sh run2 | ||
- | > qsub -N run_script.3 script.sh run3 | ||
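The three submissions above follow one pattern, so they can be generated in a loop. The following is a minimal sketch and not part of the original page: the function name `submit_runs` and the `QSUB` variable are hypothetical, and `QSUB` defaults to `echo qsub` so that the commands are only printed (a dry run).

```shell
#!/bin/sh
# Hypothetical helper: submit a script once per run, naming jobs run_script.N.
# QSUB defaults to a dry run that only prints each command (an assumption,
# chosen so the sketch can be tried without a cluster).
QSUB=${QSUB:-"echo qsub"}

submit_runs() {
    script=$1
    shift
    i=1
    for run in "$@"; do
        # Job name mirrors the run_script.N convention used above
        $QSUB -N "run_script.$i" "$script" "$run"
        i=$((i + 1))
    done
}

submit_runs script.sh run1 run2 run3
```

Running the sketch with `QSUB=qsub` set in the environment would submit the jobs instead of printing them.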
- | |||
- | |||
- | ====== Job restrictions ====== | ||
- | Each node has a finite amount of memory installed and, because the nodes are diskless, the amount of RAM a job may use is restricted. Currently, each submitted job is assigned 10G of RAM by default. | ||
- | |||
- | > qsub -N run_script.1 -l h_vmem=12G, | ||
- | |||
- | The above example will request/ | ||
- | |||
- | You can also get it from previous jobs if you have the job number with qacct ( there will be a resulting entry for " | ||
- | > qacct -j JOBNUM | ||
- | maxvmem | ||
- | | ||
- | Also, you can request the info from currently running jobs with qstat ( look for the " | ||
- | > qstat -j JOBNUM | ||
- | usage 1: | ||
- | | ||
- | The maximum available is ~750GB on any node, so if you request more than that, the job will just sit in the queue waiting indefinitely. | ||
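When sizing a memory request from an earlier job, the `maxvmem` figure can be pulled out of the accounting output with a short filter. This is a minimal sketch under stated assumptions: the function name `max_vmem` is hypothetical, the input is assumed to contain a line whose first field is `maxvmem` followed by the value, and the sample text is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read qacct-style output on stdin and print the maxvmem value.
# Assumes a line whose first field is "maxvmem" and whose second field is the value.
max_vmem() {
    awk '$1 == "maxvmem" { print $2 }'
}

# Illustrative input only (made-up values, not real accounting output)
printf 'jobnumber    1234\nmaxvmem      3.2G\n' | max_vmem
```

On the cluster the same filter would be fed from the real command, for example `qacct -j JOBNUM | max_vmem`.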
- | |||
- | //Please do not request additional resources unless you absolutely need them. If additional resources are requested, they are deducted from the amount available to everyone else. If unneeded resources are requested, this reduces the capacity on a given node for other potential usage.// | ||
- | |||
- | There is a global limit on any single user of 60 slots and/or 1920G of ram. | ||
- | |||
- | There is a 6GB cumulative quota on all HOME directories | ||
- | ====== Job status ====== | ||
- | |||
- | The current status of a job can be checked with **qstat**. This will return the current list of jobs owned by the user. | ||
- | > qstat | ||
- | job-ID | ||
- | --------------------------------------------------------------------------------------------------------------- | ||
- | | ||
- | | ||
- | |||
- | Each job listing has the following relevant properties | ||
- | | job-id | Unique id assigned by the cluster. | | ||
- | | name | Name of the job. Default value is the name of the script submitted.| | ||
- | | user | Username of person who submitted the job. | | ||
- | | state | Current state of the job. This could be " | ||
- | | submit/ | ||
- | |queue | ||
- | |slots | Number of processors the job will use. | | ||
- | |||
- | When the job is completed, it will no longer appear in **qstat** listings. | ||
- | |||
- | The status of all jobs owned by users can be checked with **qstatall** | ||
- | > qstatall | ||
- | Running jobs: | ||
- | job-ID | ||
- | ----------------------------------------------------------------------------- | ||
- | 1294 1 script1.sh | ||
- | 1295 1 script2.sh | ||
- | |||
- | |||
- | |||
- | |||
- | |||
- | |||
- | ====== Job delete ====== | ||
- | A submitted job can be deleted with **qdel**. It takes the job-id (listed by **qstat**) as its argument. | ||
- | |||
- | > qdel 9999 | ||
- | |||
- | |||
- | All jobs for a particular user can be deleted with the following command. | ||
- | |||
- | > qdel -U username | ||
- | ====== Template Script ====== | ||
- | Jobs are usually written in bash. They are similar to local bash scripts in syntax and usage. In addition, they contain cluster-related directives, identified by lines starting with " #$ ". These are used to send job-related setup information to the cluster. Scripts also contain requests for access to experiment data. The BIAC template script is a good starting point for testing job submission and as a base script for all jobs. Begin by making a copy of the template script. | ||
- | |||
- | cp / | ||
- | |||
- | The template script requests access to an experiment folder and lists its contents. It needs a valid BIAC Experiment Name (case-sensitive) that is accessible by the user. Submit myscript.sh using qsub. | ||
- | |||
- | qsub -v EXPERIMENT=Dummy.01 myscript.sh | ||
- | |||
- | Run **qstat** to check job status. The job will initially be in " | ||
- | |||
- | |||
- | The script is divided into multiple sections. The user sections are [[biac: | ||
- | |||
- | ==== USER DIRECTIVE ==== | ||
- | If you want mail notifications when your job completes or fails, you need to set the correct email address. Replace the dummy email address (user@somewhere.edu) with the correct email address in the following line. | ||
- | |||
- | #$ -M user@somewhere.edu | ||
- | |||
- | |||
- | |||
- | ==== USER SCRIPT ==== | ||
- | * Add your script in this section. | ||
- | * Within this section you can access the requested experiment folder using <color navy> | ||
- | <code bash> | ||
- | # Correct - lists the contents of the experiment folder | ||
- | ls -l $EXPERIMENT | ||
- | |||
- | # Correct - lists the contents of the Analysis folder in your experiment directory | ||
- | ls -l $EXPERIMENT/ | ||
- | |||
- | # Incorrect - The output will be " My experiment name is / | ||
- | # instead of the desired " My experiment name is Dummy.01 " | ||
- | echo "My experiment name is $EXPERIMENT" | ||
- | </ | ||
- | |||
- | * All terminal output is routed to the " Analysis " folder under the Experiment directory i.e. <color navy> | ||
- | <code bash> | ||
- | OUTDIR=$EXPERIMENT/ | ||
- | </ | ||
- | * On successful completion the job will return 0. If you need to set another return code, set the <color navy> | ||
- | <code bash> | ||
- | RETURNCODE=110 | ||
- | </ | ||
- | * Arguments to the USER SCRIPT are accessible in the usual fashion eg: <color navy>$1 $2 $3</ | ||
- | |||
- | <code bash basic.sh> | ||
- | #!/bin/sh | ||
- | |||
- | # --- BEGIN GLOBAL DIRECTIVE -- | ||
- | #$ -S /bin/sh | ||
- | #$ -o $HOME/ | ||
- | #$ -e $HOME/ | ||
- | #$ -m ea | ||
- | # -- END GLOBAL DIRECTIVE -- | ||
- | |||
- | # -- BEGIN PRE-USER -- | ||
- | #Name of experiment whose data you want to access | ||
- | EXPERIMENT=${EXPERIMENT:?" | ||
- | |||
- | EXPERIMENT=`findexp $EXPERIMENT` | ||
- | EXPERIMENT=${EXPERIMENT:?" | ||
- | |||
- | if [ $EXPERIMENT = " | ||
- | then | ||
- | exit 32 | ||
- | else | ||
- | #Timestamp | ||
- | echo " | ||
- | # -- END PRE-USER -- | ||
- | # ********************************************************** | ||
- | |||
- | # -- BEGIN USER DIRECTIVE -- | ||
- | # Send notifications to the following address | ||
- | #$ -M user@school.edu | ||
- | |||
- | # -- END USER DIRECTIVE -- | ||
- | |||
- | # -- BEGIN USER SCRIPT -- | ||
- | # User script goes here | ||
- | |||
- | # List all files in the requested Experiment directory | ||
- | ls -l $EXPERIMENT | ||
- | |||
- | |||
- | |||
- | # -- END USER SCRIPT -- # | ||
- | |||
- | # ********************************************************** | ||
- | # -- BEGIN POST-USER -- | ||
- | echo " | ||
- | OUTDIR=${OUTDIR: | ||
- | mv $HOME/ | ||
- | RETURNCODE=${RETURNCODE: | ||
- | exit $RETURNCODE | ||
- | fi | ||
- | # -- END POST USER-- | ||
- | </ | ||
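As an illustration of the USER SCRIPT conventions described above (positional arguments and `RETURNCODE`), the following minimal sketch uses the first argument as a run name and sets a non-zero return code when the expected folder is missing. The function `check_run` and the `$EXPERIMENT/<run>` folder layout are hypothetical; only `$EXPERIMENT`, `RETURNCODE` and `$1` come from the template.

```shell
#!/bin/sh
# Sketch of a USER SCRIPT body: use the first argument as a run name and
# set RETURNCODE when the expected folder is missing.
# check_run and the $EXPERIMENT/<run> layout are hypothetical, for illustration only.
check_run() {
    run=$1
    if [ -d "$EXPERIMENT/$run" ]; then
        ls -l "$EXPERIMENT/$run"
    else
        echo "Missing folder for run: $run" >&2
        RETURNCODE=110   # non-zero code that the POST-USER section exits with
    fi
}

# Stand-in for the path the PRE-USER section would provide (assumption)
EXPERIMENT=${EXPERIMENT:-.}
check_run "${1:-run1}"
```

Inside the real template, only the function and the `check_run "$1"` call would go between the USER SCRIPT markers, since `EXPERIMENT` is already set by the PRE-USER section.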
- | |||
- | ==== Notes ==== | ||
- | * If you ever edit your scripts on a non-Unix machine, please run dos2unix on them before submitting. | ||
- | * sometimes there are hidden window' |