Tutorial: HPC / Slurm Execution#

This tutorial shows the standard Slurm pathway for large receptor sets on a shared filesystem.

What you’ll learn#

  • how to scaffold a project

  • how to switch from local execution to backend: slurm

  • how stilt run maps onto chunk files and array tasks

  • how to monitor and rerun a project safely

Scaffold a project#

stilt init /path/to/slv_project

Then edit config.yaml and populate receptors.csv.
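For orientation, a receptors file might look like the sketch below. The column names are purely illustrative (hypothetical), so check the header that stilt init scaffolds for the schema your version actually expects:

id,lat,lon,time
r0001,40.77,-111.85,2024-01-01T00:00:00Z
r0002,40.65,-111.50,2024-01-01T00:00:00Z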

Add Slurm execution settings#

execution:
  backend: slurm          # switch from local execution to Slurm
  n_workers: 200          # size of the worker pool, one chunk per worker
  partition: notchpeak
  account: my_account
  time: "00:20:00"        # walltime per array task
  mem_per_cpu: 2G
  cpus_per_task: 1
  array_parallelism: 50   # cap on simultaneously running array tasks
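Assuming one array task per worker, these settings queue 200 tasks but let at most 50 run at once; the rest wait in the queue and start as slots free up.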

Submit#

stilt run /path/to/slv_project

The command normally returns as soon as sbatch accepts the array job; the work itself then runs asynchronously under Slurm's scheduler.

A blocking --wait flag is also available:

stilt run /path/to/slv_project --wait

but it is mainly a convenience for debugging or small demonstrations, not the usual HPC pattern.
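Where --wait does earn its keep is in one-off scripts that chain submission with a follow-up step. A minimal sketch, assuming --wait blocks until the array job finishes:

stilt run /path/to/slv_project --wait && stilt status /path/to/slv_project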

What happens under the hood#

Each array task consumes one immutable chunk file through:

stilt push-worker /path/to/slv_project --chunk /path/to/task.txt --cpus 1

That means:

  • no worker-side queue polling is required for Slurm

  • each array task has a fixed work assignment

  • reruns are driven by output status and skip_existing
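To make the mapping concrete, the submitted array job behaves roughly like the hand-written batch script below. This is a sketch, not the script stilt actually generates; the job name and chunk-file layout are assumptions:

#!/bin/bash
#SBATCH --job-name=stilt-slv
#SBATCH --partition=notchpeak
#SBATCH --account=my_account
#SBATCH --time=00:20:00
#SBATCH --mem-per-cpu=2G
#SBATCH --cpus-per-task=1
#SBATCH --array=0-199%50

# Each task reads its own pre-written, immutable chunk file; there is no
# queue polling and no coordination between tasks. (Chunk path is hypothetical.)
CHUNK="/path/to/slv_project/chunks/task_${SLURM_ARRAY_TASK_ID}.txt"
stilt push-worker /path/to/slv_project --chunk "$CHUNK" --cpus 1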

Monitor and rerun#

squeue -u "$USER"
stilt status /path/to/slv_project

To resume after an interruption, simply run the same command again. Outputs that already exist are skipped by default (this is what skip_existing controls).
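If you want a crude blocking check from a login shell, something like the following works; note that it waits on all of your queued jobs, not just this project's array:

while squeue -u "$USER" -h | grep -q .; do sleep 60; done
stilt status /path/to/slv_project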