Nextflow Basics
Nextflow is a workflow engine that lets you write scalable, reproducible pipelines. On ASU Supercomputers, Nextflow submits each process as an sbatch job to the scheduler.
Run the main Nextflow job with minimal resources and a long walltime. It acts as a lightweight controller that submits and monitors child jobs; keeping those child jobs short and right-sized works in your favor under fairshare scheduling.
Main Nextflow Job Example
#!/bin/bash
#SBATCH --job-name=nextflow_main
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=7-00:00:00
#SBATCH --partition=public
module load nextflow
nextflow run main.nf -profile slurm
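For reference, the controller job above runs a pipeline script such as main.nf. A minimal sketch in Nextflow DSL2 is shown below; the FASTQC process and the params.reads path are illustrative assumptions, not part of the site configuration:

```groovy
// Minimal illustrative pipeline (DSL2); process name and paths are hypothetical
nextflow.enable.dsl = 2

process FASTQC {
    input:
    path reads

    output:
    path "*.html"

    script:
    """
    fastqc ${reads}
    """
}

workflow {
    // Under the slurm profile, each FASTQC task becomes its own sbatch job
    reads_ch = Channel.fromPath(params.reads)
    FASTQC(reads_ch)
}
```

With the slurm profile from the next section, each task in this workflow is submitted to the scheduler as a separate short job, while the controller only waits and collects results.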
Nextflow Config
Define executor and resource defaults in your nextflow.config so each child job requests only what it needs for a short duration:
profiles {
    slurm {
        process.executor = 'slurm'
        process.queue = 'short'
        process.time = '2h'
        process.memory = '8 GB'
        process.cpus = 4
    }
}
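In addition to per-process defaults, Nextflow's executor scope can throttle how the controller talks to Slurm, which keeps the scheduler happy when a pipeline fans out into many tasks. A sketch using the executor.queueSize and executor.submitRateLimit settings (the values are illustrative, not site-mandated):

```groovy
// Add inside the slurm profile in nextflow.config
executor {
    queueSize       = 50        // no more than 50 child jobs queued or running at once
    submitRateLimit = '50/2min' // submit at most 50 jobs per 2 minutes
}
```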
Per-Process Overrides
Override resources for individual processes that need more or less:
process {
    withName: 'ALIGN' {
        cpus = 8
        memory = '16 GB'
        time = '4h'
    }
    withName: 'FASTQC' {
        cpus = 2
        memory = '4 GB'
        time = '1h'
    }
}
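Beyond fixed overrides, resource directives can be closures that scale with the retry count, so a task that runs out of memory or time is resubmitted with more. A sketch using Nextflow's errorStrategy, maxRetries, and dynamic directives (the specific values are illustrative):

```groovy
process {
    withName: 'ALIGN' {
        // On failure, retry up to twice, scaling memory and walltime per attempt
        errorStrategy = 'retry'
        maxRetries    = 2
        memory        = { 16.GB * task.attempt }
        time          = { 4.h * task.attempt }
    }
}
```

This pattern lets you keep the first-attempt request small, so most tasks stay cheap and fast under fairshare, while the occasional outlier still completes.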
Tips
- Keep child jobs short. Shorter walltimes improve your queue priority under fairshare and get scheduled faster.
- Request only the resources you need. Over-requesting memory or CPUs eats into your fairshare allocation for no benefit.
- Use nextflow run -resume if your main job times out. Nextflow will pick up from cached results automatically.
- For more advanced usage, such as error-handling techniques and dynamic resource allocation, see: https://nilablueshirt.github.io/WMS-on-HPC/
- Nextflow is considered an advanced use case; please contact the Research Computing team for support if needed.