Transitioning Jobs to be Batched
In a Slurm HPC cluster, batched jobs improve efficiency by letting the scheduler queue, prioritize, and allocate resources for optimal utilization, reducing idle time and ensuring fair sharing. Unlike interactive jobs, they run without continuous user input, making them ideal for large or long computations while maintaining predictable and policy-compliant operations.
JUPYTER WORKFLOWS
Step 1: Convert the Jupyter Notebook to a Python script
To convert a Jupyter notebook (e.g., example_notebook.ipynb) to a Python script on Sol, you can request a short Jupyter session and click File -> Save and Export Notebook As... -> Executable Script. After the script is downloaded to your local machine, you can upload it back to Sol by going to sol.asu.edu and using the Files tab.
Alternatively, you can convert the notebook to a Python script using the terminal by following these steps:
- Open a command line interface on Sol by navigating to sol.asu.edu in your browser and selecting "Sol Shell Access" from the "System" menu, or by SSHing into Sol with the command ssh <asurite>@sol.asu.edu.
- Request a lightwork compute node by running the command aux-interactive.
- Load the Jupyter module by running the command module load jupyter/latest.
- Convert the Jupyter notebook to a Python script by running the command:
jupyter nbconvert --to script example_notebook.ipynb
This will create a Python script named example_notebook.py in the same directory. After this, you can exit the lightwork compute node by running the command exit.
Step 2: Prepare the Python script for sbatch submission
Review the generated example_notebook.py script and make any necessary adjustments to ensure compatibility with non-interactive execution. A few common adjustments include:
- Removing or modifying any interactive elements (e.g., widgets, plots, etc.).
- Ensuring any file paths, inputs, or outputs are explicitly defined and relative to the working directory where the script will run.
- If the script includes plotting or graphical outputs, consider saving them to files (e.g., PNG, PDF) instead of displaying them interactively.
- Removing any cells that are not necessary for the sbatch job.
- Adding necessary imports and environment setup at the beginning of the script.
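As an illustration of these adjustments, a converted notebook cell might be rewritten along the following lines. This is only a minimal sketch: the compute function and file names are placeholders for your own code, and only the SLURM_SUBMIT_DIR environment variable (set by Slurm inside batch jobs) is a real Slurm feature.

```python
# Minimal sketch of example_notebook.py after the Step 2 adjustments.
# compute() and the file names below are illustrative placeholders.
import os
from pathlib import Path

# Resolve paths against the directory the job was submitted from
# (Slurm sets SLURM_SUBMIT_DIR inside batch jobs) instead of relying
# on the notebook's interactive working directory.
workdir = Path(os.environ.get("SLURM_SUBMIT_DIR", "."))
output_path = workdir / "results.txt"

def compute(values):
    # Placeholder for the notebook's actual computation.
    return sum(values) / len(values)

result = compute([1.0, 2.0, 3.0])

# Write results to a file instead of displaying them in the notebook;
# plots would similarly be saved (e.g., fig.savefig("plot.png"))
# rather than shown with plt.show().
output_path.write_text(f"mean = {result}\n")
```

Everything the script produces should end up in a file, since there is no notebook front end to display output during a batch job.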
Step 3: Create an sbatch script
To submit the Python script as a batch job, you need to create an sbatch script that specifies the resources and the commands required to run the job. Below is an example sbatch script (submit_job.sbatch). Note that you must activate the appropriate Mamba environment before running the script. All Jupyter kernels, public and user-created, have an associated Mamba environment.
#!/bin/bash
#SBATCH -N 1 # number of nodes
#SBATCH -c 8 # number of cores
#SBATCH -t 0-01:00:00 # time in d-hh:mm:ss
#SBATCH -p general # partition
#SBATCH -q public # QOS
#SBATCH -o slurm.%j.out # file to save job's STDOUT (%j = JobId)
#SBATCH -e slurm.%j.err # file to save job's STDERR (%j = JobId)
#SBATCH --mail-type=ALL # Send an e-mail when a job starts, stops, or fails
#SBATCH --mail-user="%u@asu.edu"
#SBATCH --export=NONE # Purge the job-submitting shell environment
#Load Mamba
module load mamba/latest
#Activate environment
source activate myEnv
#Run the python script
python example_notebook.py
Step 4: Submit the sbatch job
- Use the sbatch command to submit the sbatch script:
sbatch submit_job.sbatch
- Monitor the job status using the squeue command:
squeue -u $USER
- Once the job completes, review the output and error logs, slurm.<jobid>.out and slurm.<jobid>.err, respectively.
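When many jobs run from the same directory, checking each error log by hand gets tedious. The helper below is a convenience sketch of our own, not a Slurm tool: it scans logs named with the slurm.<jobid>.err pattern from the script above and flags any that contain a Python traceback.

```python
# Hypothetical helper for triaging slurm.<jobid>.err logs after a batch of jobs.
# Slurm itself provides no such tool; this is only a convenience sketch.
from pathlib import Path

def failed_jobs(log_dir="."):
    """Return the names of .err logs under log_dir containing a Python traceback."""
    failed = []
    for err_log in sorted(Path(log_dir).glob("slurm.*.err")):
        if "Traceback (most recent call last)" in err_log.read_text():
            failed.append(err_log.name)
    return failed

if __name__ == "__main__":
    print(failed_jobs())
```

An empty list means no job's STDERR recorded a Python traceback; note that non-Python failures (e.g., a job killed for exceeding its time limit) would need a different check.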
Additional notes
- Make sure to save your work in the Jupyter Notebook before converting it to a Python script.
- Ensure that all necessary files and dependencies are available in the working directory where the sbatch job will run.
- You can test and troubleshoot the Python script using the debug QOS by running interactive -q debug -t 15.
R WORKFLOWS
This guide walks you through the process of running an R script as an sbatch job on Sol. This is useful when you have a long-running, computationally intensive R script.
Step 1: Convert an R Markdown file to an R script
If you already have an R script (.R) file, you can skip this step and proceed to Step 2. If you are working with an R Markdown file (.Rmd), you will need to convert it to an R script (.R) before you can run it as an sbatch job. You can do this by following these steps:
- Open an RStudio session on sol.asu.edu, or request a lightwork interactive session with aux-interactive, load R with module load r-4.4.0-gcc-12.1.0, and start an R session with R.
- Use the knitr package to convert the R Markdown file to an R script:
library(knitr)
purl("example.Rmd", output = "example.R")
Step 2: Prepare the R script for sbatch submission
Review the generated example.R script and make any necessary adjustments to ensure compatibility with non-interactive execution. A few common adjustments include:
- Removing or modifying any interactive elements (e.g., readline, shiny).
- Ensuring any file paths, inputs, or outputs are explicitly defined and relative to the working directory where the script will run.
- If the script includes plotting or graphical outputs, consider saving them to files (e.g., PNG, PDF) instead of displaying them interactively.
- Removing any code chunks that are not necessary for the sbatch job.
- Adding necessary library calls and environment setup at the beginning of the script.
Step 3: Create an sbatch script
To submit the R script as an sbatch job, you need to create an sbatch script that specifies the resources and the commands required to run the job. Below is an example sbatch script (submit_job.sbatch). Note that you must load the appropriate R module in the script; interactive sessions use the latest version by default.
#!/bin/bash
#SBATCH -N 1 # number of nodes
#SBATCH -c 8 # number of cores
#SBATCH -t 0-01:00:00 # time in d-hh:mm:ss
#SBATCH -p general # partition
#SBATCH -q public # QOS
#SBATCH -o slurm.%j.out # file to save job's STDOUT (%j = JobId)
#SBATCH -e slurm.%j.err # file to save job's STDERR (%j = JobId)
#SBATCH --mail-type=ALL # Send an e-mail when a job starts, stops, or fails
#SBATCH --mail-user="%u@asu.edu"
#SBATCH --export=NONE # Purge the job-submitting shell environment
#Load R module
module load r-4.4.0-gcc-12.1.0
#Run the R script
Rscript example.R
Step 4: Submit the sbatch job
- Use the sbatch command to submit the sbatch script:
sbatch submit_job.sbatch
- Monitor the job status using the squeue command:
squeue -u $USER
- Once the job completes, review the output and error logs, slurm.<jobid>.out and slurm.<jobid>.err, respectively.
Additional notes
- Make sure to save your work in the R Markdown file before converting it to an R script.
- Ensure that all necessary files and dependencies are available in the working directory where the sbatch job will run.
- You can test and troubleshoot the R script using the debug QOS by running interactive -q debug -t 15.