Meteorology
STILT requires gridded meteorological fields in NOAA ARL format. PYSTILT supports two ways to provide them:
Archive mode — point PYSTILT at a directory of ARL files you already have. PYSTILT globs for the right files at runtime.
Source mode — let PYSTILT download from NOAA archives automatically via the arl-met package.
Both modes stage the required files into each simulation’s compute-local directory. Both support optional spatial subsetting.
MetConfig: Output configuration stored in config.yaml.
MetStream: The runtime object that resolves and stages required files for one named met stream.
Archive mode
Point directory at your ARL file tree and tell PYSTILT how filenames
encode time:
import stilt

hrrr = stilt.MetConfig(
    directory="/data/met/hrrr",
    file_format="%Y%m%d_%H",
    file_tres="1h",
    n_min=2,
)
You can register multiple streams in one ModelConfig:
config = stilt.ModelConfig(
    mets={
        "hrrr": hrrr,
        "gfs": stilt.MetConfig(
            directory="/data/met/gfs",
            file_format="gfs_%Y%m%d_%H",
            file_tres="3h",
        ),
    },
)
How file selection works
At runtime, MetStream.required_files() computes the simulation time window,
derives the required filename patterns from file_format and file_tres,
and globs for matching ARL files recursively under directory. Each expected
time step triggers one targeted search rather than a full directory scan, which
keeps I/O proportional to the number of time steps needed rather than the size
of the archive.
If too few files are found, the run fails clearly instead of silently starting with incomplete meteorology.
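The selection logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not PYSTILT's implementation: the helper names expected_names and find_files are hypothetical, file_tres is passed as a timedelta rather than parsed from a string such as "1h", files are assumed to be aligned to midnight, and the trailing "*" in the glob is an assumption about how suffixes might be tolerated.

```python
from datetime import datetime, timedelta
from pathlib import Path


def expected_names(start, end, file_format, file_tres):
    """Return one strftime-expanded name per file_tres step covering [start, end]."""
    # Floor the window start to the file time resolution so the file that
    # contains `start` is included (assumes files are aligned to midnight).
    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
    steps = (start - midnight) // file_tres
    t = midnight + steps * file_tres
    names = []
    while t <= end:
        names.append(t.strftime(file_format))
        t += file_tres
    return names


def find_files(directory, names):
    """Run one targeted recursive glob per expected name (hypothetical helper)."""
    found = []
    for name in names:
        # The trailing "*" tolerates a suffix after the time stamp.
        found.extend(sorted(Path(directory).rglob(name + "*")))
    return found


names = expected_names(
    datetime(2024, 1, 1, 22, 30),
    datetime(2024, 1, 2, 1, 0),
    "%Y%m%d_%H",
    timedelta(hours=1),
)
# names == ["20240101_22", "20240101_23", "20240102_00", "20240102_01"]
```

Because each expected name produces its own glob, the work grows with the length of the simulation window rather than with the number of files in the archive.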
Source mode (automatic download)
Set source to an arlmet source name to have PYSTILT fetch ARL files from
NOAA archives automatically:
hrrr = stilt.MetConfig(
    source="hrrr",
    directory="/data/met/hrrr",  # local cache directory
)
Available sources:
| Name | Product | Domain | Period |
|---|---|---|---|
| | HRRR 3 km analysis | CONUS | Jun 2019–present |
| | HRRR 3 km analysis v1 | CONUS | Jun 2015–2019 |
| | NAM 12 km analysis | North America | May 2007–present |
| nams | NAMS hybrid sigma-pressure | CONUS / AK / HI | Jan 2010–present |
| gdas1 | GDAS 1-degree global | Global | Dec 2004–present |
| | GDAS 0.5-degree global | Global | Sep 2007–mid 2019 |
| | GFS 0.25-degree global | Global | Jun 2019–present |
| | NCEP/NCAR Reanalysis 2.5-degree | Global | 1948–present |
| | NCEP North American Regional Reanalysis 32 km | North America | 1979–2019 |
Source-specific options are passed as inline fields. For example, nams
supports a domain parameter:
nams_ak = stilt.MetConfig(
    source="nams",
    domain="ak",  # passed to NAMSSource(domain="ak")
    directory="/data/met/nams_ak",
)
The backend field selects the transport used for downloads (default "s3"):
stilt.MetConfig(source="gdas1", directory="/data/met/gdas1", backend="ftp")
Downloaded files are cached in directory. Re-running with the same config
skips already-downloaded files.
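The skip-if-cached behaviour can be pictured with a small sketch. It is not the arl-met implementation: fetch_cached and the download callable are hypothetical names, and writing to a temporary ".part" file first is an assumed design choice that keeps an interrupted download from being mistaken for a cache hit.

```python
from pathlib import Path


def fetch_cached(name, cache_dir, download):
    """Return the cached path for `name`, downloading only if it is missing.

    `download` is a hypothetical callable taking (name, destination_path).
    """
    target = Path(cache_dir) / name
    if target.exists():
        return target  # cache hit: no network access
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download never
    # leaves a partial file that a later run would treat as complete.
    partial = target.with_name(target.name + ".part")
    download(name, partial)
    partial.replace(target)
    return target
```

Calling fetch_cached twice with the same name and cache directory invokes download only on the first call.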
Note
Source mode requires fsspec and s3fs (for the default S3
backend). Install with: pip install "pystilt[cloud]"
Subsetting
Setting subgrid_enable=True restricts the meteorology to a spatial
bounding box before staging. This is strongly recommended for global products
(GFS, GDAS, Reanalysis) and useful on HPC where compute nodes have limited
memory.
from stilt import Bounds, MetConfig

hrrr = MetConfig(
    source="hrrr",
    directory="/data/met/hrrr",
    subgrid_enable=True,
    subgrid_bounds=Bounds(xmin=-114, xmax=-110, ymin=39, ymax=42),
    subgrid_buffer=0.5,  # degrees added on each side (default 0.2)
)
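To make the effect of subgrid_buffer concrete, the sketch below expands a bounding box by a buffer on every side. It is a hedged illustration only: Box is a hypothetical stand-in for stilt.Bounds, with_buffer is not a PYSTILT function, and the latitude clamping is an assumption. How PYSTILT snaps the cropped region to the model grid is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for stilt.Bounds, in degrees."""

    xmin: float
    xmax: float
    ymin: float
    ymax: float


def with_buffer(box, buffer):
    """Expand the box by `buffer` degrees on every side, clamping latitude."""
    return Box(
        xmin=box.xmin - buffer,
        xmax=box.xmax + buffer,
        ymin=max(box.ymin - buffer, -90.0),  # assumed clamp at the poles
        ymax=min(box.ymax + buffer, 90.0),
    )


crop = with_buffer(Box(xmin=-114, xmax=-110, ymin=39, ymax=42), 0.5)
# crop == Box(xmin=-114.5, xmax=-109.5, ymin=38.5, ymax=42.5)
```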
In source mode, subsetting is applied during download — the cropped file is cached directly so subsequent runs reuse it.
In archive mode, subsetting is applied the first time a file is needed.
Cropped copies are cached in subgrid_dir (defaults to
<directory>/subgrid) and reused by all simulations that share the same
met stream. Set subgrid_dir explicitly to use a shared cache across
multiple projects:
MetConfig(
    directory="/data/met/hrrr",
    file_format="%Y%m%d_%H",
    file_tres="1h",
    subgrid_enable=True,
    subgrid_bounds=Bounds(xmin=-114, xmax=-110, ymin=39, ymax=42),
    subgrid_dir="/scratch/met_subgrid/hrrr_slv",
)
Use subgrid_levels to also reduce the number of vertical levels:
MetConfig(
    ...,
    subgrid_levels=20,  # keep the lowest 20 levels
)
Staging into compute-local space
Simulation.met_files stages the selected meteorology files into the
simulation’s compute-local met/ directory using link-or-copy semantics.
This matters when:
your archive is read-only
workers use a slower shared filesystem for output
HYSPLIT should read from a short local path in scratch space
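Link-or-copy staging can be sketched as follows. This is a minimal illustration, not PYSTILT's code: link_or_copy is a hypothetical helper, and the choice of a symbolic link (rather than a hard link) with a copy as fallback is an assumption about what "link-or-copy" means here.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src, dest_dir):
    """Stage `src` into `dest_dir`: symlink if possible, otherwise copy."""
    src = Path(src).resolve()
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    if dest.exists():
        return dest  # already staged
    if dest.is_symlink():
        dest.unlink()  # remove a dangling link left by an earlier run
    try:
        os.symlink(src, dest)
    except OSError:
        # Symlinks may be unsupported or not permitted; fall back to a copy.
        shutil.copy2(src, dest)
    return dest
```

Linking leaves a read-only archive untouched and costs almost nothing, while the copy fallback keeps staging working on filesystems that do not allow links.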