# Slurm
The slurm backend is the HPC path for large receptor sets on shared
filesystems. It uses push dispatch: the coordinator writes immutable chunk
files and submits a Slurm array job whose tasks each call `stilt
push-worker`.
## How it works

Running `stilt run` with `backend: slurm`:

1. Registers pending simulations in the output index.
2. Writes immutable chunk files under `<output>/chunks/<batch_id>/`.
3. Renders a submission script under `<project>/slurm/`.
4. Submits a Slurm array job (one task per chunk) via `sbatch`.
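The chunking step above can be sketched as follows. This is an illustrative assumption, not stilt's actual chunk format: `write_chunks`, the round-robin sharding, and the JSON file layout are all hypothetical.

```python
import json
from pathlib import Path

def write_chunks(receptors, n_workers, chunk_dir):
    """Shard a receptor list into at most n_workers immutable chunk files.

    Hypothetical sketch: stilt's real chunk schema is not shown here.
    Round-robin assignment keeps shard sizes balanced.
    """
    chunk_dir = Path(chunk_dir)
    chunk_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(n_workers):
        shard = receptors[i::n_workers]  # every n_workers-th receptor
        if not shard:
            break  # fewer receptors than workers: stop early
        path = chunk_dir / f"chunk_{i:05d}.json"
        path.write_text(json.dumps(shard))
        paths.append(path)
    return paths
```

Each array task then reads exactly one of these files, which is why no inter-task communication is needed.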
Workers run `stilt push-worker` independently; no inter-task communication
is required after submission.
## Configuration

Minimal config:

```yaml
execution:
  backend: slurm
  n_workers: 200
  partition: mypartition
  account: myaccount
  time: "00:20:00"
```
Full example with common knobs:

```yaml
execution:
  backend: slurm
  n_workers: 200
  partition: mypartition
  account: myaccount
  time: "00:20:00"
  mem: 2G
  cpus-per-task: 2
  array_parallelism: 50
```
## Key options

- `n_workers`: the number of chunk shards to create, and therefore the
  maximum array-task count. Each worker processes its chunk sequentially;
  tune this alongside `array_parallelism` to control cluster load.
- `cpus-per-task`: passed through both to Slurm (`#SBATCH --cpus-per-task`)
  and to `stilt push-worker --cpus`, so each task uses a matching local
  process pool for within-chunk parallelism.
- `array_parallelism`: limits simultaneously active array tasks via Slurm's
  `%N` syntax (e.g. `--array=0-199%50`). Useful for staying within
  fair-share limits.
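Building the array specification is simple string assembly; a minimal sketch, assuming a hypothetical `array_spec` helper:

```python
def array_spec(n_chunks, array_parallelism=None):
    """Build a Slurm --array value: tasks 0..n_chunks-1, optionally
    capped to N simultaneously running tasks with the %N suffix.

    Hypothetical helper illustrating the documented %N behaviour.
    """
    spec = f"0-{n_chunks - 1}"
    if array_parallelism:
        spec += f"%{array_parallelism}"
    return spec
```

For example, `array_spec(200, 50)` yields `"0-199%50"`: 200 tasks total, at most 50 running at once.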
Any additional keys in the `execution` block are forwarded to `sbatch` as
`--key=value` flags, with underscores converted to dashes.
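The key-forwarding rule can be sketched like this. The `RESERVED` set is an assumption about which keys stilt consumes itself rather than forwarding; the real list may differ.

```python
# Assumed set of keys stilt handles internally (not forwarded to sbatch).
RESERVED = {"backend", "n_workers", "array_parallelism"}

def sbatch_flags(execution):
    """Turn extra execution keys into sbatch flags, underscores -> dashes.

    Illustrative sketch of the documented forwarding rule only.
    """
    flags = []
    for key, value in execution.items():
        if key in RESERVED:
            continue
        flags.append(f"--{key.replace('_', '-')}={value}")
    return flags
```

So `{"mem": "2G", "cpus_per_task": 2}` would become `["--mem=2G", "--cpus-per-task=2"]`, which is how a snake_case YAML key reaches sbatch's dash-separated flag.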
## Submitting from the CLI

Fire-and-forget (common for production runs):

```bash
stilt run /path/to/project
```
The CLI prints the submitted job ID and returns as soon as `sbatch` accepts
the array.
Block until the array finishes (useful for debugging or scripted workflows):

```bash
stilt run /path/to/project --wait
```
## Monitoring and reruns

Use the Slurm scheduler and the output index together:

```bash
squeue -u "$USER"
stilt status /path/to/project
```
Rerunning the same `stilt run` is safe: completed simulations are skipped
by default. Use `--no-skip` only when you want to force a full rerun.
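The skip behaviour that makes reruns idempotent amounts to filtering requested work against the output index. A minimal sketch, assuming a hypothetical status mapping (not stilt's actual index schema):

```python
def pending_simulations(requested, index, skip_completed=True):
    """Return the simulations that still need to run.

    `index` maps simulation id -> status string. With skip_completed
    (the default), anything already marked "completed" is dropped,
    so rerunning the same batch only resubmits unfinished work.
    Hypothetical sketch of the documented skip logic.
    """
    if not skip_completed:  # analogue of --no-skip: force a full rerun
        return list(requested)
    return [sim for sim in requested if index.get(sim) != "completed"]
```

Failed or never-registered simulations stay pending, so a rerun after a partial cluster outage picks up only what is missing.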
## Current constraint

The Slurm backend currently requires both the project root and the output
root to be local or shared-filesystem paths. Cloud URIs
(`s3://`, `gs://`, etc.) are not supported for this backend.
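A validator enforcing this constraint might look like the following. This is a hypothetical sketch, not stilt's actual check; the set of rejected schemes is an assumption.

```python
from urllib.parse import urlparse

def require_local_path(path):
    """Reject cloud URIs, mirroring the documented Slurm-backend constraint.

    Hypothetical validator: any path carrying a cloud/remote URL scheme
    (s3://, gs://, ...) raises; plain filesystem paths pass through.
    """
    scheme = urlparse(str(path)).scheme
    if scheme in {"s3", "gs", "az", "http", "https"}:
        raise ValueError(
            f"Slurm backend requires a local or shared filesystem path, "
            f"got {scheme}:// URI: {path}"
        )
    return path
```

A shared-filesystem path such as `/scratch/project/output` passes unchanged, while `s3://bucket/output` is rejected before any jobs are submitted.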