arlmet.concat_by_time
- arlmet.concat_by_time(directory, output_directory, freq='1D', *, pattern='*', time_range=None, template='{time:%Y%m%d}_arl', sort=True)
Group every ARL file in a directory by valid time and concatenate each group.
Each input is assigned to a time bin from its first valid time (read from the file’s index record, not parsed from its name), floored to freq. All files in a bin are concatenated into one output file. This is the batch form of concat(): for example, turning a directory of 6-hourly HRRR files into one file per day.
- Parameters:
directory (path-like) – Directory to scan for input ARL files (non-recursive).
output_directory (path-like) – Directory to write the concatenated files into. Created if missing. Should differ from directory.
freq (str, default "1D") – Fixed-frequency pandas offset alias giving the size of each output chunk: "1D" = one file per day, "6h" = one per six hours, etc. Each input is binned by its first valid time floored to this frequency, so freq should be at least as long as any single input file’s span.
pattern (str, default "*") – Glob (relative to directory) selecting input files. Scope it to ARL files; every match must be a readable ARL file.
time_range (tuple of (start, end), optional) – Inclusive (start, end) filter on each file’s first valid time. Files whose first time falls outside the range are skipped.
template (str, default "{time:%Y%m%d}_arl") – str.format template for output filenames, given the bin start time as time (a pandas.Timestamp), e.g. "{time:%Y%m%d}_hrrr". It must encode enough resolution to keep bins distinct at freq.
sort (bool, default True) – Passed through to concat() for each group.
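The interaction between freq and template can be sketched without the library. The block below is a standard-library approximation, not arlmet's implementation: floor_time and bin_names are hypothetical helpers, the first valid times are made up, and a timedelta stands in for the pandas offset alias. It shows why the default template is sufficient for daily bins but collapses four 6-hourly bins onto one filename.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def floor_time(t, step):
    """Floor t down to a whole multiple of step since the Unix epoch."""
    return EPOCH + ((t - EPOCH) // step) * step


def bin_names(first_times, step, template):
    """Map each output filename to the set of bin starts that would produce it."""
    names = defaultdict(set)
    for t in first_times:
        start = floor_time(t, step)
        names[template.format(time=start)].add(start)
    return dict(names)


# Hypothetical first valid times of four 6-hourly files on one day.
first_times = [datetime(2024, 1, 1, h) for h in (0, 6, 12, 18)]

# Daily bins: all four files share one bin and one filename.
daily = bin_names(first_times, timedelta(days=1), "{time:%Y%m%d}_arl")

# 6-hourly bins with a day-resolution template: four bins, one filename.
six_hourly_bad = bin_names(first_times, timedelta(hours=6), "{time:%Y%m%d}_arl")

# Adding the hour to the template keeps the four bins distinct.
six_hourly_ok = bin_names(first_times, timedelta(hours=6), "{time:%Y%m%d%H}_arl")

for name, starts in six_hourly_bad.items():
    if len(starts) > 1:
        print(f"{name} would be shared by {len(starts)} bins")
```

Any filename that maps to more than one bin start signals a template that is too coarse for the chosen freq. With pandas available, Timestamp.floor(freq) performs the equivalent flooring directly from an offset alias.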
- Returns:
The written output paths, one per non-empty time bin, in time order.
- Return type:
- Raises:
ValueError – If pattern matches no files, or a matched file cannot be read as ARL. concat()’s grid/axis and duplicate-time checks also apply within each group.
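Because an empty or over-broad pattern raises ValueError, it can help to preview the selection before running the batch. The following is a minimal sketch that assumes the library's matching behaves like a non-recursive pathlib.Path.glob, which the documentation does not confirm; preview_inputs and the file names are hypothetical.

```python
from pathlib import Path
import tempfile


def preview_inputs(directory, pattern="*"):
    """List the regular files a non-recursive glob would select, sorted by name."""
    return sorted(p for p in Path(directory).glob(pattern) if p.is_file())


# Demo against a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("20240101_00_hrrr", "20240101_06_hrrr", "README.txt"):
        (root / name).touch()
    (root / "nested").mkdir()
    (root / "nested" / "20240102_00_hrrr").touch()

    # A scoped pattern selects only the ARL-style names at the top level.
    matched = [p.name for p in preview_inputs(root, "*_hrrr")]

    # The default "*" would also pick up README.txt, which is not an ARL file.
    unscoped = [p.name for p in preview_inputs(root, "*")]

    # A pattern with no matches is the case that would raise ValueError.
    empty = preview_inputs(root, "*_gfs")

print(matched)
print(unscoped)
print(empty)
```

The preview only checks names. Whether each matched file is a readable ARL file is still decided by the library when it reads the index record.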
Examples
Turn a directory of 6-hourly HRRR files into one file per day:
>>> import arlmet
>>> arlmet.concat_by_time(
...     "hrrr/",
...     "daily/",
...     freq="1D",
...     pattern="*_hrrr",
...     template="{time:%Y%m%d}_hrrr",
... )