Transitioning Jobs to be Batched
In a Slurm HPC cluster, batched jobs improve efficiency by letting the scheduler queue, prioritize, and allocate resources for optimal utilization, reducing idle time and ensuring fair sharing. Unlike interactive jobs, they run without continuous user input, making them ideal for large or long computations while maintaining predictable and policy-compliant operations.
JUPYTER WORKFLOWS
Step 1: Convert the Jupyter Notebook to a Python script
To convert a Jupyter notebook (e.g., example_notebook.ipynb) to a Python script on Sol, you can request a short Jupyter session and click File -> Save and Export Notebook As... -> Executable Script. After the script is downloaded to your local machine, you can upload it back to Sol by going to sol.asu.edu and using the Files tab.
Alternatively, you can convert the notebook to a Python script using the terminal by following these steps:
- Open a command line interface on Sol by navigating to sol.asu.edu in your browser and selecting "Sol Shell Access" from the "System" menu, or by SSHing into Sol with the command ssh <asurite>@sol.asu.edu.
- Request a lightwork compute node by running the command aux-interactive.
- Load the Jupyter module by running the command module load jupyter/latest.
- Convert the Jupyter notebook to a Python script by running the command:
jupyter nbconvert --to script example_notebook.ipynb
This will create a Python script named example_notebook.py in the same directory. After this, you can exit the lightwork compute node by running the command exit.
Step 2: Prepare the Python script for sbatch submission
Review the generated example_notebook.py script and make any necessary adjustments to ensure compatibility with non-interactive execution. A few common adjustments include:
- Removing or modifying any interactive elements (e.g., widgets, plots, etc.).
- Ensuring any file paths, inputs, or outputs are explicitly defined and relative to the working directory where the script will run.
- If the script includes plotting or graphical outputs, consider saving them to files (e.g., PNG, PDF) instead of displaying them interactively.
- Removing any cells that are not necessary for the sbatch job.
- Adding necessary imports and environment setup at the beginning of the script.
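As an illustration of these adjustments, a converted notebook cell might be rewritten along the following lines. This is only a minimal sketch: the compute function and file names are placeholders for your own code, and only the SLURM_SUBMIT_DIR environment variable (set by Slurm inside batch jobs) is a real Slurm feature.

```python
# Minimal sketch of example_notebook.py after the Step 2 adjustments.
# compute() and the file names below are illustrative placeholders.
import os
from pathlib import Path

# Resolve paths against the directory the job was submitted from
# (Slurm sets SLURM_SUBMIT_DIR inside batch jobs) instead of relying
# on the notebook's interactive working directory.
workdir = Path(os.environ.get("SLURM_SUBMIT_DIR", "."))
output_path = workdir / "results.txt"

def compute(values):
    # Placeholder for the notebook's actual computation.
    return sum(values) / len(values)

result = compute([1.0, 2.0, 3.0])

# Write results to a file instead of displaying them in the notebook;
# plots would similarly be saved (e.g., fig.savefig("plot.png"))
# rather than shown with plt.show().
output_path.write_text(f"mean = {result}\n")
```

Everything the script produces should end up in a file, since there is no notebook front end to display output during a batch job.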
Step 3: Create an sbatch script
To submit the Python script as a batch job, you need to create an sbatch script that specifies the resources and the commands required to run the job. Below is an example sbatch script (submit_job.sbatch). Note that you must activate the appropriate Mamba environment before running the script. All Jupyter kernels, public and user-created, have an associated Mamba environment.
#!/bin/bash
#SBATCH -N 1 # number of nodes
#SBATCH -c 8 # number of cores
#SBATCH -t 0-01:00:00 # time in d-hh:mm:ss
#SBATCH -p general # partition
#SBATCH -q public # QOS
#SBATCH -o slurm.%j.out # file to save job's STDOUT (%j = JobId)
#SBATCH -e slurm.%j.err # file to save job's STDERR (%j = JobId)
#SBATCH --mail-type=ALL # Send an e-mail when a job starts, stops, or fails
#SBATCH --mail-user="%u@asu.edu"
#SBATCH --export=NONE # Purge the job-submitting shell environment
#Load Mamba
module load mamba/latest
#Activate environment
source activate myEnv
#Run the python script
python example_notebook.py
Step 4: Submit the sbatch job
- Use the sbatch command to submit the sbatch script:
sbatch submit_job.sbatch
- Monitor the job status using the squeue command:
squeue -u $USER
- Once the job completes, review the output and error logs, slurm.<jobid>.out and slurm.<jobid>.err, respectively.
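When many jobs run from the same directory, checking each error log by hand gets tedious. The helper below is a convenience sketch of our own, not a Slurm tool: it scans logs named with the slurm.<jobid>.err pattern from the script above and flags any that contain a Python traceback.

```python
# Hypothetical helper for triaging slurm.<jobid>.err logs after a batch of jobs.
# Slurm itself provides no such tool; this is only a convenience sketch.
from pathlib import Path

def failed_jobs(log_dir="."):
    """Return the names of .err logs under log_dir containing a Python traceback."""
    failed = []
    for err_log in sorted(Path(log_dir).glob("slurm.*.err")):
        if "Traceback (most recent call last)" in err_log.read_text():
            failed.append(err_log.name)
    return failed

if __name__ == "__main__":
    print(failed_jobs())
```

An empty list means no job's STDERR recorded a Python traceback; note that non-Python failures (e.g., a job killed for exceeding its time limit) would need a different check.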
Additional notes
- Make sure to save your work in the Jupyter Notebook before converting it to a Python script.
- Ensure that all necessary files and dependencies are available in the working directory where the sbatch job will run.
- You can test and troubleshoot the Python script using the debug QOS by running interactive -q debug -t 15.
R WORKFLOWS
This guide walks you through the process of running an R script as an sbatch job on Sol. This is useful when you have a long-running, computationally intensive R script.
Step 1: Convert an R Markdown file to an R script
If you already have an R script (.R) file, you can skip this step and proceed to Step 2. If you are working with an R Markdown file (.Rmd), you will need to convert it to an R script (.R) before you can run it as an sbatch job. You can do this by following these steps:
- Open an RStudio session on sol.asu.edu, or request a lightwork interactive session with aux-interactive, load R with module load r-4.4.0-gcc-12.1.0, and start an R session with R.
- Use the knitr package to convert the R Markdown file to an R script:
library(knitr)
purl("example.Rmd", output = "example.R")
Step 2: Prepare the R script for sbatch submission
Review the generated example.R script and make any necessary adjustments to ensure compatibility with non-interactive execution. A few common adjustments include:
- Removing or modifying any interactive elements (e.g., readline, shiny).
- Ensuring any file paths, inputs, or outputs are explicitly defined and relative to the working directory where the script will run.
- If the script includes plotting or graphical outputs, consider saving them to files (e.g., PNG, PDF) instead of displaying them interactively.
- Removing any code chunks that are not necessary for the sbatch job.
- Adding necessary library calls and environment setup at the beginning of the script.
Step 3: Create an sbatch script
To submit the R script as an sbatch job, you need to create an sbatch script that specifies the resources and the commands required to run the job. Below is an example sbatch script (submit_job.sbatch). Note that you must load the appropriate R module in the script; interactive sessions use the latest version by default.
#!/bin/bash
#SBATCH -N 1 # number of nodes
#SBATCH -c 8 # number of cores
#SBATCH -t 0-01:00:00 # time in d-hh:mm:ss
#SBATCH -p general # partition
#SBATCH -q public # QOS
#SBATCH -o slurm.%j.out # file to save job's STDOUT (%j = JobId)
#SBATCH -e slurm.%j.err # file to save job's STDERR (%j = JobId)
#SBATCH --mail-type=ALL # Send an e-mail when a job starts, stops, or fails
#SBATCH --mail-user="%u@asu.edu"
#SBATCH --export=NONE # Purge the job-submitting shell environment
#Load R module
module load r-4.4.0-gcc-12.1.0
#Run the R script
Rscript example.R
Step 4: Submit the sbatch job
- Use the sbatch command to submit the sbatch script:
sbatch submit_job.sbatch
- Monitor the job status using the squeue command:
squeue -u $USER
- Once the job completes, review the output and error logs, slurm.<jobid>.out and slurm.<jobid>.err, respectively.
Additional notes
- Make sure to save your work in the R Markdown file before converting it to an R script.
- Ensure that all necessary files and dependencies are available in the working directory where the sbatch job will run.
- You can test and troubleshoot the R script using the debug QOS by running interactive -q debug -t 15.