arlmet.concat_by_time

arlmet.concat_by_time(directory, output_directory, freq='1D', *, pattern='*', time_range=None, template='{time:%Y%m%d}_arl', sort=True)

Group every ARL file in a directory by valid time and concatenate each group.

Each input is assigned to a time bin by its first valid time, floored to freq. That time is read from the file’s index record, not parsed from the file name. All files in a bin are concatenated into one output file. This is the batch form of concat(): for example, it turns a directory of 6-hourly HRRR files into one file per day.
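
The binning rule can be illustrated with plain pandas. This is a minimal sketch of flooring timestamps to freq and grouping by the result, not arlmet's implementation; the file names and times in first_valid are hypothetical stand-ins for values that would come from each file's index record.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical inputs: file name -> first valid time.
# In concat_by_time these times come from each file's index record.
first_valid = {
    "hrrr_a": pd.Timestamp("2024-01-01 00:00"),
    "hrrr_b": pd.Timestamp("2024-01-01 06:00"),
    "hrrr_c": pd.Timestamp("2024-01-01 18:00"),
    "hrrr_d": pd.Timestamp("2024-01-02 00:00"),
}

freq = "1D"

# Floor each first valid time to freq; files sharing a floored
# time land in the same bin and would be concatenated together.
bins = defaultdict(list)
for name, first_time in first_valid.items():
    bins[first_time.floor(freq)].append(name)

for bin_start in sorted(bins):
    print(bin_start, bins[bin_start])
```

Here the three files that start on 2024-01-01 share one bin and the file that starts at midnight on 2024-01-02 opens the next.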

Parameters:
  • directory (path-like) – Directory to scan for input ARL files (non-recursive).

  • output_directory (path-like) – Directory to write the concatenated files into. It is created if missing and should differ from directory.

  • freq (str, default "1D") – Fixed-frequency pandas offset alias giving the size of each output chunk: "1D" = one file per day, "6h" = one per six hours, etc. Each input is binned by its first valid time floored to this frequency, so freq should be at least as long as any single input file’s span.

  • pattern (str, default "*") – Glob (relative to directory) selecting input files. Scope it to ARL files; every match must be a readable ARL file.

  • time_range (tuple of (start, end), optional) – Inclusive (start, end) filter on each file’s first valid time. Files whose first time falls outside the range are skipped.

  • template (str, default "{time:%Y%m%d}_arl") – str.format template for output filenames, given the bin start time as time (a pandas.Timestamp), e.g. "{time:%Y%m%d}_hrrr". It must encode enough resolution to keep bins distinct at freq.

  • sort (bool, default True) – Passed through to concat() for each group.
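
The interaction between template and freq can be checked before running a batch. The sketch below is an illustration only: it uses datetime.datetime as a stand-in for the pandas.Timestamp that concat_by_time passes as time (pandas.Timestamp subclasses datetime, so the same format specs apply), and the hourly template is a hypothetical alternative, not a library default.

```python
from datetime import datetime, timedelta

# Four hypothetical bin start times, as produced by freq="6h" over one day.
bin_starts = [datetime(2024, 1, 1) + timedelta(hours=6 * i) for i in range(4)]

# The default template only resolves to the day, so every 6-hourly
# bin on the same date maps to the same file name.
daily_template = "{time:%Y%m%d}_arl"
daily_names = [daily_template.format(time=t) for t in bin_starts]

# Adding the hour (assumed template, for illustration) keeps bins distinct.
hourly_template = "{time:%Y%m%d%H}_arl"
hourly_names = [hourly_template.format(time=t) for t in bin_starts]

print(len(set(daily_names)), "distinct name(s) with", daily_template)
print(len(set(hourly_names)), "distinct name(s) with", hourly_template)
```

A quick check like this, comparing the number of distinct names against the number of bins, shows whether a template has enough resolution for the chosen freq.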

Returns:

The written output paths, one per non-empty time bin, in time order.

Return type:

list[pathlib.Path]

Raises:

ValueError – If pattern matches no files, or a matched file cannot be read as ARL. concat()’s grid/axis and duplicate-time checks also apply within each group.

Examples

Turn a directory of 6-hourly HRRR files into one file per day:

>>> import arlmet
>>> arlmet.concat_by_time(
...     "hrrr/",
...     "daily/",
...     freq="1D",
...     pattern="*_hrrr",
...     template="{time:%Y%m%d}_hrrr",
... )