Slurm
=====

The ``slurm`` backend is the HPC path for large receptor sets on shared
filesystems. It uses push dispatch: the coordinator writes immutable chunk
files and submits a Slurm array job whose tasks each call ``stilt push-worker``.

How it works
------------

Running ``stilt run`` with ``backend: slurm``:

1. Registers pending simulations in the output index.
2. Writes immutable chunk files under ``/chunks//``.
3. Renders a submission script under ``/slurm/``.
4. Submits a Slurm array job (one task per chunk) via ``sbatch``.

Workers run ``stilt push-worker`` independently; no inter-task communication is
required after submission.

Configuration
-------------

Minimal config:

.. code-block:: yaml

   execution:
     backend: slurm
     n_workers: 200
     partition: mypartition
     account: myaccount
     time: "00:20:00"

Full example with common knobs:

.. code-block:: yaml

   execution:
     backend: slurm
     n_workers: 200
     partition: mypartition
     account: myaccount
     time: "00:20:00"
     mem: 2G
     cpus-per-task: 2
     array_parallelism: 50

Key options
-----------

``n_workers``
   Number of chunk shards to create, and therefore the maximum array-task
   count. Each worker processes its chunk sequentially; tune this alongside
   ``array_parallelism`` to control cluster load.

``cpus-per-task``
   Passed through both to Slurm (``#SBATCH --cpus-per-task``) and to
   ``stilt push-worker --cpus`` so that each task uses a matching local
   process pool for within-chunk parallelism.

``array_parallelism``
   Limits simultaneously active array tasks via the ``%N`` Slurm syntax
   (e.g. ``--array=0-199%50``). Useful for staying within fair-share limits.

Any additional keys in the ``execution`` block are forwarded to ``sbatch`` as
``--key=value`` flags, with underscores converted to dashes.

Submitting from the CLI
-----------------------

Fire-and-forget (common for production runs):

.. code-block:: bash

   stilt run /path/to/project

The CLI prints the submitted job ID and returns after ``sbatch`` accepts the
array.
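
For orientation, the array range carried by such a submission can be worked
out from the task count and ``array_parallelism``. The sketch below is only an
illustration: ``array_spec`` is a hypothetical helper, not part of ``stilt``,
and it assumes zero-based task indices (as in the ``--array=0-199%50`` example)
and that every one of the ``n_workers`` chunks is actually created.

.. code-block:: bash

   # Hypothetical helper (not part of stilt): build the value passed to
   # sbatch --array from a task count and an optional concurrency cap.
   array_spec() {
     local n_tasks="$1"
     local parallelism="${2:-}"
     local spec="0-$((n_tasks - 1))"    # zero-based indices, e.g. 0-199
     if [ -n "$parallelism" ]; then
       spec="${spec}%${parallelism}"    # %N caps simultaneously active tasks
     fi
     printf '%s\n' "$spec"
   }

   array_spec 200 50   # prints 0-199%50
   array_spec 200      # prints 0-199

With 200 tasks and a cap of 50, at most a quarter of the array is active at any
moment, which is the lever for staying within fair-share limits.
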

Block until the array finishes (useful for debugging or scripted workflows):

.. code-block:: bash

   stilt run /path/to/project --wait

Monitoring and reruns
---------------------

Use the Slurm scheduler and the output index together:

.. code-block:: bash

   squeue -u "$USER"
   stilt status /path/to/project

Rerunning the same ``stilt run`` is safe: completed simulations are skipped by
default. Use ``--no-skip`` only when you want to force a full rerun.

Current constraint
------------------

The Slurm backend currently requires both the project root and the output root
to be local or shared filesystem paths. Cloud URIs (``s3://``, ``gs://``, etc.)
are not supported for this backend.
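
A wrapper script can guard against this constraint before submitting. The
following is a minimal sketch, not something ``stilt`` provides:
``is_filesystem_path`` is a hypothetical helper, and it assumes that any
argument containing ``://`` should be treated as a URI, which also rejects
schemes such as ``file://``.

.. code-block:: bash

   # Hypothetical pre-flight check (not part of stilt): succeed only when the
   # argument does not look like a URI such as s3://bucket/run or gs://bucket/run.
   is_filesystem_path() {
     case "$1" in
       *://*) return 1 ;;   # assumption: any scheme separator marks a URI
       *)     return 0 ;;
     esac
   }

   project=/path/to/project
   if is_filesystem_path "$project"; then
     echo "ok: $project"
   else
     echo "unsupported for the slurm backend: $project" >&2
   fi

The same check can be applied to the output root before calling ``stilt run``.
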