Nextflow Basics
Nextflow is a workflow engine that lets you write scalable, reproducible pipelines. On ASU Supercomputers, Nextflow submits each process to the scheduler as an sbatch job.
Run the main Nextflow job with minimal resources and a long walltime. It acts as a lightweight controller that submits and monitors child jobs. Child jobs should be short and right-sized, which improves your queue priority under fairshare scheduling. If your workload is purely CPU-based, consider using Phoenix instead of Sol: Phoenix has more public CPU nodes and has been optimized for Nextflow workloads.
Main Nextflow Job Example on Phoenix
#!/bin/bash
#SBATCH --job-name=nextflow_main
#SBATCH --cpus-per-task=1
#SBATCH --mem=20G
#SBATCH --time=7-00:00:00
#SBATCH --partition=public
module load nextflow-25.04.6-gcc-14.2.0-en
module load openjdk-17.0.3_7-4s
# if running nf-core pipelines, set these download locations for future re-use
WORKDIR=/scratch/$USER/nf-example
cd ${WORKDIR}
export NXF_APPTAINER_CACHEDIR=${WORKDIR}/apptainer
export NXF_HOME=${WORKDIR}/.nextflow
export NXF_OPTS="-Xms4g -Xmx16g" # JVM heap limits, in case Nextflow runs out of Java heap memory
# if running an nf-core pipeline, replace main.nf with the pipeline name
nextflow run main.nf \
    -profile apptainer \
    -work-dir ${WORKDIR} \
    -with-report ${WORKDIR}/final_results/report.html \
    -with-timeline ${WORKDIR}/final_results/timeline.html \
    -c ${WORKDIR}/phx.config
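The job script above assumes that the working directory and phx.config already exist. As a minimal sketch, a defensive preamble could create the expected directories and stop early when the config file is missing. The function name prepare_workdir is hypothetical and not part of Nextflow or the site setup; the directory names are taken from the script above.

```shell
#!/bin/bash
# Hypothetical helper: create the directory layout the job script expects
# and fail early if the Nextflow config file is missing.
prepare_workdir() {
    local workdir="$1"
    local config="$2"
    # Create the report and container cache directories up front.
    mkdir -p "${workdir}/final_results" "${workdir}/apptainer" || return 1
    if [ ! -f "${config}" ]; then
        echo "Missing Nextflow config: ${config}" >&2
        return 1
    fi
}

# Example use near the top of the job script
# (commented out so the sketch runs anywhere):
# prepare_workdir "${WORKDIR}" "${WORKDIR}/phx.config" || exit 1
# cd "${WORKDIR}" || exit 1
```

Failing before nextflow starts keeps a misconfigured job from sitting in the queue only to exit immediately.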
Nextflow Config
Define executor and resource defaults in your nextflow.config so each child job requests only what it needs for a short duration:
profiles {
    slurm {
        process.executor = 'slurm'
        process.queue = 'short'
        process.time = '2h'
        process.memory = '8 GB'
        process.cpus = 4
    }
}
Per-Process Overrides
Override resources for individual processes that need more or less:
process {
    executor = 'slurm'
    errorStrategy = {'retry'}
    maxRetries = 3
    withName: 'ALIGN' {
        cpus = 8
        memory = '16 GB'
        time = '4h'
    }
    withName: 'FASTQC' {
        cpus = 2
        memory = '4 GB'
        time = '1h'
    }
}
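Because the block above retries failed tasks up to three times, it can help to request more resources on each retry rather than repeating the same request. The sketch below writes a small extra config file from the shell, using Nextflow's task.attempt variable inside closures. The file name retry.config is hypothetical, and the 8 GB and 2 h base values are simply the defaults shown earlier, not tuned recommendations.

```shell
#!/bin/bash
# Write a hypothetical extra config file (the name retry.config is an assumption).
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > retry.config <<'EOF'
process {
    errorStrategy = 'retry'
    maxRetries    = 3
    // task.attempt is 1 on the first try, 2 on the first retry, and so on,
    // so each retry requests proportionally more memory and walltime.
    memory = { 8.GB * task.attempt }
    time   = { 2.h * task.attempt }
}
EOF

# Example use (commented out so the sketch runs anywhere):
# nextflow run main.nf -c retry.config
```

Starting small and growing only on failure keeps most child jobs right-sized while still letting the occasional heavy task finish.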
Tips
- Keep child jobs short. Shorter walltimes improve your queue priority under fairshare and get scheduled faster.
- Request only the resources you need. Over-requesting memory or CPUs eats into your fairshare allocation for no benefit.
- Use nextflow run -resume if your main job times out. Nextflow will pick up from cached results automatically.
- For advanced usage such as error handling, dynamic resource allocation, and more, see the Nextflow on HPC guide.
- Nextflow involves HPC-specific configuration. If you need help getting started, contact the Research Computing team.
- Please consider citing our paper if you employ Nextflow on Phoenix:
Mu, N. T., Dizon, W., Otero, G., & Battelle, T. (2025). Optimizing Nextflow-based software on shared HPC resources: A case study with make_lastz_chains [Conference paper]. US Research Software Engineering Conference 2025 (USRSE'25), Philadelphia, PA. Zenodo. https://doi.org/10.5281/zenodo.17118383
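To illustrate the resume tip above, here is a minimal sketch of a job script helper that adds -resume only when an earlier run left a history file in the launch directory. The function name resume_flag is hypothetical. The sketch relies on Nextflow recording past runs in .nextflow/history under the directory it was launched from.

```shell
#!/bin/bash
# Hypothetical helper: print "-resume" when a previous run left a non-empty
# history file in the launch directory given as $1 (default: current directory).
resume_flag() {
    local launch_dir="${1:-.}"
    if [ -s "${launch_dir}/.nextflow/history" ]; then
        echo "-resume"
    fi
}

# Example use inside the main job script
# (commented out so the sketch runs anywhere):
# nextflow run main.nf -profile apptainer $(resume_flag "${WORKDIR}") \
#     -c "${WORKDIR}/phx.config"
```

With this in place, the same script can be resubmitted unchanged after a timeout. Passing -resume unconditionally is a simpler alternative, since Nextflow runs everything when there are no cached results.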