API Reference#
Core Data Structures#
Block#
- class fips.Block(data, name=None, index=None, dtype=None, copy=False)[source]#
Single data block with a named Series and consistent index.
A Block wraps a pandas Series and can be initialized from an existing Series or from raw values.
Blocks are the fundamental building units of fips. Inverse problems can be customized by creating Blocks with specific indices and names to represent different state or observation components relevant to the application.
- data#
The underlying Series containing the block data.
- Type:
pd.Series
- index#
Index for the block.
- Type:
pd.Index
- values#
The underlying data values as a NumPy array.
- Type:
np.ndarray
- xs(key, axis=0, level=None, drop_level=True)#
Cross-select data based on index values.
- reindex(new_index, fill_value=0.0)#
Reindex the block to new row indices, filling missing values with fill_value.
- round_index(decimals, axis=0)#
Round the index to a specified number of decimal places for alignment.
- copy()#
Return a copy of the Block.
- to_series(add_block_level=False)[source]#
Convert to a Series, optionally adding block levels to the index.
- to_numpy()#
Get the underlying data as a NumPy array.
- data: Series#
- __init__(data, name=None, index=None, dtype=None, copy=False)[source]#
Initialize a Block.
- Parameters:
data (pd.Series, Block, or array-like) – Data for the block. If Block, creates a copy.
name (str, optional) – Name for the block. If None, uses data.name.
index (pd.Index, optional) – Index for the block. If None, uses data.index.
dtype (dtype, optional) – Data type to force.
copy (bool, default False) – Whether to copy the underlying data.
- to_series(add_block_level=False)[source]#
Return the underlying Series data.
- Parameters:
add_block_level (bool, default False) – Whether to add a ‘block’ level to the index with the block name.
- Returns:
The underlying Series data, optionally with a ‘block’ level added to the index.
- Return type:
pd.Series
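As a rough illustration of what add_block_level=True describes, the sketch below prepends a 'block' level to a Series index using plain pandas. The helper name `add_block_level` and the sample data are hypothetical and are not part of fips; the library's own implementation may differ.

```python
import pandas as pd


def add_block_level(series: pd.Series, block: str) -> pd.Series:
    """Prepend a 'block' index level holding the block name (hypothetical helper)."""
    # pd.concat with a dict adds the dict key as a new outer index level,
    # and `names` labels that new level.
    return pd.concat({block: series}, names=["block"])


# Sample data: a block named "flux" indexed by grid cell.
flux = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.Index([0, 1, 2], name="cell"),
    name="flux",
)
with_block = add_block_level(flux, "flux")
print(with_block.index.names)
```

Entries can then be addressed by (block, cell) pairs, which is the kind of hierarchical layout a Vector built from several Blocks relies on.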
Vector#
- class fips.Vector(data, name=None, index=None, dtype=None, copy=False)[source]#
State or observation vector composed of one or more Block objects.
A Vector organizes one or more Blocks (prior, posterior, observations, etc.) into a single hierarchical structure.
Vectors represent the full state or observation space and are the 1D components of the inversion framework.
- data#
The underlying Series containing the vector data.
- Type:
pd.Series
- index#
Index for the vector.
- Type:
pd.Index
- values#
The underlying data values as a NumPy array.
- Type:
np.ndarray
- xs(key, axis=0, level=None, drop_level=True)#
Cross-select data based on index values.
- reindex(new_index, fill_value=0.0)#
Reindex the vector to new row indices, filling missing values with fill_value.
- round_index(decimals, axis=0)#
Round the index to a specified number of decimal places for alignment.
- copy()#
Return a copy of the Vector.
- to_series()#
Convert to a Series.
- to_numpy()#
Get the underlying data as a NumPy array.
- data: Series#
- __init__(data, name=None, index=None, dtype=None, copy=False)[source]#
Initialize a Vector.
- Parameters:
data (pd.Series, Vector, Block, Sequence[Block | pd.Series], or array-like) – Data for the vector.
name (str, optional) – Name for the Vector.
index (pd.Index, optional) – Index for the Vector. If None, uses data.index. Index must have a ‘block’ level if data is a Series.
dtype (dtype, optional) – Data type to force.
copy (bool, default False) – Whether to copy the underlying data.
- property blocks: _BlockAccessor#
Accessor for retrieving Block instances from the Vector.
MatrixBlock#
- class fips.MatrixBlock(data, row_block=None, col_block=None, name=None, index=None, columns=None, dtype=None, copy=False, sparse=False)[source]#
Single 2D data block with row and column block names.
A MatrixBlock wraps a pandas DataFrame and can be initialized from an existing DataFrame or from raw values.
MatrixBlocks are the fundamental 2D building units of fips, used to compose larger Matrix objects. MatrixBlocks represent the relationships between specific row and column blocks (e.g., state-to-observation mappings in forward operators, or covariance submatrices between specific state components). By organizing data into MatrixBlocks, users can create modular and interpretable representations of complex inverse problems, with clear semantics for how different components interact.
- data#
The underlying DataFrame containing the block’s data.
- Type:
pd.DataFrame
- index#
Index for the rows of the DataFrame.
- Type:
pd.Index
- columns#
Index for the columns of the DataFrame.
- Type:
pd.Index
- values#
The underlying data values, whether sparse or dense.
- Type:
np.ndarray
- xs(key, axis=0, level=None, drop_level=True)#
Cross-select data based on index/column values.
- reindex(new_index, new_columns, fill_value=0.0)#
Reindex the block to new row and column indices, filling missing values with fill_value.
- round_index(decimals, axis='both')#
Round the index and/or columns to a specified number of decimal places for alignment.
- copy()#
Return a copy of the MatrixBlock.
- to_frame(add_block_level=False)[source]#
Convert to a DataFrame, optionally adding block levels to the index and columns.
- to_dense()#
Return a copy of the block with dense internal storage.
- to_sparse(threshold=None)#
Return a copy of the block with sparse internal storage, zeroing values below the threshold.
- to_numpy()#
Get the underlying data as a NumPy array.
- data: DataFrame#
- __init__(data, row_block=None, col_block=None, name=None, index=None, columns=None, dtype=None, copy=False, sparse=False)[source]#
Initialize a MatrixBlock.
- Parameters:
data (np.ndarray or pd.DataFrame or MatrixBlock or scalar) – 2D data representing the block. If MatrixBlock, creates a copy.
row_block (str, optional) – Name of the row block (e.g., “state”, “obs”). Required if data is not a MatrixBlock.
col_block (str, optional) – Name of the column block (e.g., “state”, “obs”). Required if data is not a MatrixBlock.
name (str, optional) – Name for the block. If None, defaults to “{row_block}_{col_block}”.
index (pd.Index, optional) – Index for the rows of the DataFrame.
columns (pd.Index, optional) – Index for the columns of the DataFrame. If None, uses the same as index.
dtype (data type, optional) – Data type to force.
copy (bool, optional) – Whether to copy the data.
sparse (bool, default False) – If True, store the block in pandas sparse format. Sparsification is applied after initialization. Use threshold zeroing in your builder before passing data here for best results.
- to_frame(add_block_level=False)[source]#
Convert to DataFrame with optional block level.
- Parameters:
add_block_level (bool, default False) – Whether to add ‘block’ levels to the index and columns for the row and column blocks.
- Returns:
The DataFrame representation of the block, with optional block levels in the index and columns.
- Return type:
pd.DataFrame
Matrix#
- class fips.Matrix(data, name=None, index=None, columns=None, dtype=None, copy=None, sparse=False)[source]#
Base class for all matrix-like objects in the inversion framework.
Wraps a pandas DataFrame and ensures consistent index handling. Matrices represent 2D components of the inversion problem, such as forward operators and covariance matrices.
- data#
The underlying DataFrame containing the matrix data.
- Type:
pd.DataFrame
- index#
Index for the rows of the DataFrame.
- Type:
pd.MultiIndex
- columns#
Index for the columns of the DataFrame.
- Type:
pd.MultiIndex
- values#
The underlying data values, whether sparse or dense.
- Type:
np.ndarray
- xs(key, axis=0, level=None, drop_level=True)#
Cross-select data based on index/column values.
- reindex(new_index, new_columns, fill_value=0.0)#
Reindex the matrix to new row and column indices, filling missing values with fill_value.
- round_index(decimals, axis='both')#
Round the index and/or columns to a specified number of decimal places for alignment.
- copy()#
Return a copy of the Matrix.
- to_frame()#
Convert to a DataFrame.
- to_dense()#
Return a copy of the matrix with dense internal storage.
- to_sparse(threshold=None)#
Return a copy of the matrix with sparse internal storage, zeroing values below the threshold.
- to_numpy()#
Get the underlying data as a NumPy array.
- data: DataFrame#
- __init__(data, name=None, index=None, columns=None, dtype=None, copy=None, sparse=False)[source]#
Initialize a Matrix.
- Parameters:
data (np.ndarray or pd.DataFrame or Matrix or scalar) – 2D data representing the matrix.
name (str, optional) – Name for the Matrix.
index (pd.MultiIndex) – Index for the rows of the DataFrame.
columns (pd.MultiIndex, optional) – Index for the columns of the DataFrame. If None, uses the same as index.
dtype (data type, optional) – Data type to force.
copy (bool, optional) – Whether to copy the data.
sparse (bool, default False) – If True, store the assembled matrix in pandas sparse format. Sparsification is applied after block assembly; use threshold zeroing in your builder before passing data here.
- Returns:
Instance of Matrix wrapping the DataFrame.
- Return type:
Matrix
- __getitem__(block)[source]#
Get the submatrix DataFrame for the given (row_block, col_block) tuple.
- Return type:
DataFrame
- property blocks: _MatrixBlockAccessor#
Accessor for retrieving MatrixBlock instances from the Matrix.
CovarianceMatrix#
- class fips.CovarianceMatrix(data, name=None, index=None, columns=None, dtype=None, copy=None, sparse=False)[source]#
Represents a symmetric Covariance Matrix.
Covariance matrices are used to represent error covariances in the inversion framework. They can be constructed from variances and correlation matrices.
- data#
The underlying DataFrame containing the matrix data.
- Type:
pd.DataFrame
- index#
Index for the rows of the CovarianceMatrix.
- Type:
pd.MultiIndex
- columns#
Index for the columns of the CovarianceMatrix.
- Type:
pd.MultiIndex
- variances#
The variances (diagonal elements) of the covariance matrix, represented as a Vector.
- Type:
Vector
- values#
The underlying data values as a NumPy array.
- Type:
np.ndarray
- force_symmetry(keep='lower')[source]#
Force the matrix to be perfectly symmetric by copying one triangle to the other.
- xs(key, axis=0, level=None, drop_level=True)#
Cross-select data based on index/column values.
- reindex(new_index, new_columns, fill_value=0.0)#
Reindex the matrix to new row and column indices, filling missing values with fill_value.
- round_index(decimals, axis='both')#
Round the index and/or columns to a specified number of decimal places for alignment.
- copy()#
Return a copy of the CovarianceMatrix.
- to_frame(add_block_level=False)#
Convert to a DataFrame, optionally adding block levels to the index and columns.
- to_dense()#
Return a copy of the matrix with dense internal storage.
- to_sparse(threshold=None)#
Return a copy of the matrix with sparse internal storage, zeroing values below the threshold.
- to_numpy()#
Get the underlying data as a NumPy array.
- force_symmetry(keep='lower')[source]#
Force the matrix to be perfectly symmetric by copying one triangle to the other.
Useful for eliminating floating-point asymmetry.
- Parameters:
keep ({'lower', 'upper'}, default 'lower') – Which triangle of the matrix to preserve and copy.
- Returns:
A new, perfectly symmetric covariance matrix.
- Return type:
CovarianceMatrix
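The following NumPy sketch shows one way the two ideas above could look on raw arrays: building a covariance from standard deviations and a correlation matrix, then copying one triangle onto the other. The function `force_symmetry` here is a standalone illustration written for this example, not the fips method, and the sample values are made up.

```python
import numpy as np


def force_symmetry(S: np.ndarray, keep: str = "lower") -> np.ndarray:
    """Return a symmetric copy of S built from one triangle (illustrative only)."""
    if keep not in ("lower", "upper"):
        raise ValueError("keep must be 'lower' or 'upper'")
    tri = np.tril(S) if keep == "lower" else np.triu(S)
    # Mirror the kept triangle; subtract the diagonal so it is not counted twice.
    return tri + tri.T - np.diag(np.diag(S))


# Covariance from standard deviations and correlations: S = D R D.
sd = np.array([1.0, 2.0, 0.5])
R = np.array([
    [1.0, 0.3, 0.0],
    [0.3, 1.0, 0.2],
    [0.0, 0.2, 1.0],
])
S = np.diag(sd) @ R @ np.diag(sd)

# Simulate floating-point asymmetry in the upper triangle, then repair it.
S_noisy = S.copy()
S_noisy[0, 1] += 1e-12
S_sym = force_symmetry(S_noisy, keep="lower")
```

Keeping the lower triangle discards the perturbed upper entry, so the result is exactly equal to its own transpose.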
ForwardOperator#
- class fips.ForwardOperator(data, name=None, index=None, columns=None, dtype=None, copy=None, sparse=False)[source]#
Forward operator matrix mapping state vectors to observation space.
A ForwardOperator wraps a pandas DataFrame and provides methods to convolve state vectors through the operator to produce modeled observations.
The forward operator, or Jacobian matrix, is a key component of inverse problems. It defines how changes in the state vector affect the observations. The rows correspond to observations and the columns to state variables.
- data#
The underlying DataFrame containing the operator data.
- Type:
pd.DataFrame
- index#
Index for the rows of the ForwardOperator.
- Type:
pd.MultiIndex
- obs_index#
Alias for index, representing the observation space index.
- Type:
pd.MultiIndex
- columns#
Index for the columns of the ForwardOperator.
- Type:
pd.MultiIndex
- state_index#
Alias for columns, representing the state space index.
- Type:
pd.MultiIndex
- values#
The underlying data values as a NumPy array.
- Type:
np.ndarray
- convolve(state, round_index=None, verify_overlap=True)[source]#
Convolve a state vector through the forward operator.
- xs(key, axis=0, level=None, drop_level=True)#
Cross-select data based on index/column values.
- reindex(new_index, new_columns, fill_value=0.0)#
Reindex the matrix to new row and column indices, filling missing values with fill_value.
- round_index(decimals, axis='both')#
Round the index and/or columns to a specified number of decimal places for alignment.
- copy()#
Return a copy of the ForwardOperator.
- to_frame(add_block_level=False)#
Convert to a DataFrame, optionally adding block levels to the index and columns.
- to_dense()#
Return a copy of the matrix with dense internal storage.
- to_sparse(threshold=None)#
Return a copy of the matrix with sparse internal storage, zeroing values below the threshold.
- to_numpy()#
Get the underlying data as a NumPy array.
- property state_index: Index#
Return the state space index (columns).
- property obs_index: Index#
Return the observation space index (rows).
Inverse Problem#
- class fips.InverseProblem(obs, prior, forward_operator, modeldata_mismatch, prior_error, constant=None, round_index=6)[source]#
Inverse problem combining observations, priors, and forward model.
Organizes state vectors, observations, forward operators, and error covariances into a unified framework for solving inverse problems via different estimators.
- forward_operator#
Forward operator mapping state space to observation space.
- Type:
ForwardOperator
- modeldata_mismatch#
Covariance matrix representing model-data mismatch (observation error).
- Type:
CovarianceMatrix
- prior_error#
Covariance matrix representing prior error.
- Type:
CovarianceMatrix
- constant#
Optional constant term added to the forward model (e.g., background or bias).
- estimator#
The fitted estimator after solving the problem. Initially None until .solve() is called.
- Type:
Estimator, optional
- posterior_error#
Posterior error covariance after solving the problem.
- Type:
CovarianceMatrix
- get_block(component, block, crossblock=None)[source]#
Retrieve a specific block of data from a component (Vector or Matrix).
- solve(estimator, **kwargs)[source]#
Solve the inverse problem using the specified estimator and store the fitted estimator.
- __init__(obs, prior, forward_operator, modeldata_mismatch, prior_error, constant=None, round_index=6)[source]#
Initialize the inverse problem.
- Parameters:
obs (VectorLike) – Observation vector.
prior (VectorLike) – Prior state vector.
forward_operator (MatrixLike) – Forward operator mapping state space to observation space.
modeldata_mismatch (MatrixLike) – Covariance matrix representing model-data mismatch (observation error).
prior_error (MatrixLike) – Covariance matrix representing prior error.
constant (VectorLike or float, optional) – Optional constant term added to the forward model (e.g., background or bias).
round_index (int, optional) – Number of decimal places to round to. If None, no rounding is performed.
- property state_index: Index#
Return the state space index.
- property obs_index: Index#
Return the observation space index.
- get_block(component, block, crossblock=None)[source]#
Get block from a component (Vector or Matrix).
- Parameters:
component (str) – Name of the component (‘obs’, ‘prior’, ‘forward_operator’, ‘modeldata_mismatch’, ‘prior_error’, or ‘constant’).
block (str) – Name of the block to retrieve.
crossblock (str, optional) – For matrices, the name of the cross block (e.g., ‘state’ for forward_operator). If None, defaults to the same as ‘block’.
- Returns:
The requested block of data.
- Return type:
pd.Series or pd.DataFrame
- solve(estimator, **kwargs)[source]#
Solve the inverse problem using the specified estimator.
- Parameters:
- Returns:
The InverseProblem instance with the estimator fitted.
- Return type:
Self
- property posterior_error: CovarianceMatrix#
Posterior error covariance.
Estimators#
- class fips.Estimator(z, x_0, H, S_0, S_z, c=None)[source]#
Base inversion estimator class.
- z#
Observed data.
- Type:
np.ndarray
- x_0#
Prior model state estimate.
- Type:
np.ndarray
- H#
Forward operator.
- Type:
np.ndarray
- S_0#
Prior error covariance.
- Type:
np.ndarray
- S_z#
Model-data mismatch covariance.
- Type:
np.ndarray
- x_hat#
Posterior mean model state estimate (solution).
- Type:
np.ndarray
- S_hat#
Posterior error covariance.
- Type:
np.ndarray
- y_hat#
Posterior modeled observations.
- Type:
np.ndarray
- y_0#
Prior modeled observations.
- Type:
np.ndarray
- K#
Kalman gain.
- Type:
np.ndarray
- A#
Averaging kernel.
- Type:
np.ndarray
- U_red#
Reduced uncertainty.
- Type:
np.ndarray
- __init__(z, x_0, H, S_0, S_z, c=None)[source]#
Initialize the Estimator object.
- Parameters:
z (np.ndarray) – Observed data.
x_0 (np.ndarray) – Prior model state estimate.
H (np.ndarray) – Forward operator.
S_0 (np.ndarray) – Prior error covariance.
S_z (np.ndarray) – Model-data mismatch covariance.
c (np.ndarray or float, optional) – Constant data, defaults to 0.0.
- forward(x)[source]#
Forward model calculation.
\[y = Hx + c\]
- Parameters:
x (np.ndarray) – State vector.
- Returns:
Model output (Hx + c).
- Return type:
np.ndarray
- residual(x)[source]#
Forward model residual.
\[r = z - (Hx + c)\]
- Parameters:
x (np.ndarray) – State vector.
- Returns:
Residual (z - (Hx + c)).
- Return type:
np.ndarray
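The forward model and residual above reduce to a few array operations. This is a minimal NumPy sketch with made-up values; the free functions `forward` and `residual` mirror the formulas only and are not the Estimator methods themselves.

```python
import numpy as np

H = np.array([
    [1.0, 0.5],
    [0.0, 2.0],
    [1.0, 1.0],
])                               # 3 observations x 2 state elements
c = 0.1                          # constant term (a scalar here; a vector also broadcasts)
z = np.array([2.0, 4.1, 3.2])    # observations


def forward(x: np.ndarray) -> np.ndarray:
    """y = Hx + c"""
    return H @ x + c


def residual(x: np.ndarray) -> np.ndarray:
    """r = z - (Hx + c)"""
    return z - forward(x)


x = np.array([1.0, 2.0])
y = forward(x)
r = residual(x)
```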
- leverage(x)[source]#
Calculate the leverage matrix.
Indicates which observations are likely to have more impact on the solution.
\[L = Hx \left( (Hx)^T (H S_0 H^T + S_z)^{-1} Hx \right)^{-1} (Hx)^T (H S_0 H^T + S_z)^{-1}\]
- Parameters:
x (np.ndarray) – State vector.
- Returns:
Leverage matrix.
- Return type:
np.ndarray
- abstract cost(x)[source]#
Cost/loss/misfit function.
- Parameters:
x (np.ndarray) – State vector.
- Returns:
Cost value.
- Return type:
float
- abstract property x_hat: ndarray#
Posterior mean model state estimate (solution).
- Returns:
Posterior state estimate.
- Return type:
np.ndarray
- abstract property S_hat: ndarray#
Posterior error covariance matrix.
- Returns:
Posterior error covariance matrix.
- Return type:
np.ndarray
- property y_hat: ndarray#
Posterior mean observation estimate.
\[\hat{y} = H \hat{x} + c\]
- Returns:
Posterior observation estimate.
- Return type:
np.ndarray
- property y_0: ndarray#
Prior mean data estimate.
\[\hat{y}_0 = H x_0 + c\]
- Returns:
Prior data estimate.
- Return type:
np.ndarray
- property K#
Kalman gain matrix.
\[K = (H S_0)^T (H S_0 H^T + S_z)^{-1}\]
- Returns:
Kalman gain matrix.
- Return type:
np.ndarray
- property A#
Averaging kernel matrix.
\[A = KH = (H S_0)^T (H S_0 H^T + S_z)^{-1} H\]
- Returns:
Averaging kernel matrix.
- Return type:
np.ndarray
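The Kalman gain and averaging kernel formulas above, together with DOFS = Tr(A), can be checked on a small problem. This is a sketch on plain NumPy arrays with invented values; it uses a linear solve in place of an explicit inverse, which is a choice made for this example and not necessarily what fips does internally.

```python
import numpy as np

H = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
])                                  # 3 observations x 2 state elements
S_0 = np.diag([1.0, 4.0])           # prior error covariance
S_z = np.diag([0.5, 0.5, 1.0])      # model-data mismatch covariance

HS0 = H @ S_0
G = HS0 @ H.T + S_z                 # H S_0 H^T + S_z (symmetric)

# K = (H S_0)^T G^{-1}. Because G is symmetric, this equals (G^{-1} H S_0)^T.
K = np.linalg.solve(G, HS0).T
A = K @ H                           # averaging kernel
dofs = np.trace(A)                  # degrees of freedom for signal
print(K.shape, A.shape, dofs)
```

With positive definite covariances, DOFS falls between 0 and the number of state elements, so it can be read as the number of state components effectively constrained by the observations.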
- property DOFS: float#
Degrees Of Freedom for Signal (DOFS).
\[\mathrm{DOFS} = \mathrm{Tr}(A)\]
- Returns:
Degrees of Freedom value.
- Return type:
float
- property reduced_chi2: float#
Reduced Chi-squared statistic. Tarantola (1987).
\[\chi^2 = \frac{1}{n_z} \left( (z - H\hat{x})^T S_z^{-1} (z - H\hat{x}) + (\hat{x} - x_0)^T S_0^{-1} (\hat{x} - x_0) \right)\]
Note
I can’t find a copy of Tarantola (1987) to verify this equation, but it appears in Kunik et al. (2019) https://doi.org/10.1525/elementa.375
- Returns:
Reduced Chi-squared value.
- Return type:
float
- property R2: float#
Coefficient of determination (R-squared).
\[R^2 = \mathrm{corr}(z, H\hat{x})^2\]
- Returns:
R-squared value.
- Return type:
float
- property RMSE: float#
Root mean square error (RMSE).
\[\mathrm{RMSE} = \sqrt{\frac{(z - H\hat{x})^T (z - H\hat{x})}{n_z}}\]
- Returns:
RMSE value.
- Return type:
float
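The three diagnostics above (reduced chi-squared, R-squared, RMSE) can be sketched directly from their formulas. The example below uses made-up inputs, takes the constant term as zero so that the residual matches z - H x_hat as written in the formulas, and computes x_hat with the Kalman-gain expression documented for the Bayesian solver. It is an illustration, not the fips implementation.

```python
import numpy as np

H = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
])
S_0 = np.diag([1.0, 4.0])
S_z = np.diag([0.5, 0.5, 1.0])
x_0 = np.array([1.0, 1.0])
z = np.array([1.2, 2.9, 3.5])

# Posterior state via the Kalman gain (constant term assumed to be zero).
G = H @ S_0 @ H.T + S_z
K = np.linalg.solve(G, H @ S_0).T
x_hat = x_0 + K @ (z - H @ x_0)

y_hat = H @ x_hat
r = z - y_hat                       # posterior residual
dx = x_hat - x_0                    # posterior minus prior state
n_z = z.size

reduced_chi2 = (r @ np.linalg.solve(S_z, r) + dx @ np.linalg.solve(S_0, dx)) / n_z
r2 = np.corrcoef(z, y_hat)[0, 1] ** 2
rmse = np.sqrt(np.mean(r ** 2))
print(reduced_chi2, r2, rmse)
```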
Inversion estimators for solving inverse problems.
This module contains Bayesian and regularized estimators for state estimation in linear inverse problems, computing posterior distributions and diagnostics.
- class fips.estimators.BayesianSolver(z, x_0, H, S_0, S_z, c=None, rf=1.0)[source]#
Bayesian inversion estimator class.
This class implements a Bayesian inversion framework for solving inverse problems, also known as the batch method.
- __init__(z, x_0, H, S_0, S_z, c=None, rf=1.0)[source]#
Initialize inversion object.
- Parameters:
z (np.ndarray) – Observed data
x_0 (np.ndarray) – Prior model estimate
H (np.ndarray) – Forward operator
S_0 (np.ndarray) – Prior error covariance
S_z (np.ndarray) – Model-data mismatch covariance
c (np.ndarray | float, optional) – Constant data, defaults to 0.0
rf (float, optional) – Regularization factor, by default 1.0
- cost(x)[source]#
Cost function.
\[J(x) = \frac{1}{2}(x - x_0)^T S_0^{-1}(x - x_0) + \frac{1}{2}(z - Hx - c)^T S_z^{-1}(z - Hx - c)\]
- property x_hat#
Posterior Mean Model Estimate (solution).
\[\hat{x} = x_0 + K(z - Hx_0 - c)\]
- property S_hat#
Posterior Error Covariance Matrix.
\[\hat{S} = (H^T S_z^{-1} H + S_0^{-1})^{-1} = S_0 - (H S_0)^T(H S_0 H^T + S_z)^{-1}(H S_0)\]
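To make the batch Bayesian formulas concrete, the sketch below evaluates the posterior mean, both forms of the posterior covariance, and the cost function on a tiny made-up problem. It works on raw NumPy arrays and leaves out the rf (regularization factor) argument, whose exact role is not described here. All values are illustrative.

```python
import numpy as np

H = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
])
S_0 = np.diag([1.0, 4.0])
S_z = np.diag([0.5, 0.5, 1.0])
x_0 = np.array([1.0, 1.0])
z = np.array([1.2, 2.9, 3.5])
c = 0.0

G = H @ S_0 @ H.T + S_z
K = np.linalg.solve(G, H @ S_0).T          # Kalman gain (G is symmetric)

x_hat = x_0 + K @ (z - H @ x_0 - c)        # posterior mean

# Two algebraically equivalent forms of the posterior covariance.
S_hat_gain = S_0 - K @ H @ S_0
S_hat_info = np.linalg.inv(H.T @ np.linalg.inv(S_z) @ H + np.linalg.inv(S_0))


def cost(x: np.ndarray) -> float:
    """J(x): prior misfit plus observation misfit, each weighted by its covariance."""
    dx = x - x_0
    dz = z - H @ x - c
    return float(0.5 * dx @ np.linalg.solve(S_0, dx) + 0.5 * dz @ np.linalg.solve(S_z, dz))


print(x_hat, cost(x_0), cost(x_hat))
```

The posterior mean minimizes J(x), so cost(x_hat) is no larger than the cost at the prior or at any nearby state.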
Pipeline#
- class fips.pipeline.InversionPipeline(config, problem, estimator)[source]#
Blueprint for inversion.
This class defines the standard workflow for running an inversion, including methods for loading observations and priors, building the forward operator and covariance matrices, and executing the solve process.
Subclasses should implement the abstract methods to handle the specifics of data loading and covariance construction for their particular problem domain.
- aggregate_obs_space(obs, forward_operator, modeldata_mismatch, constant)[source]#
Aggregate the observation space.
- filter_state_space(obs, prior)[source]#
Align or trim the state space before building covariances.
Optionally filter the state space by removing intervals with insufficient observations or simulations.
- abstract get_modeldata_mismatch(obs)[source]#
Get model-data mismatch covariance matrix.
- Return type:
- aggregate_obs_space(obs, forward_operator, modeldata_mismatch, constant)[source]#
Aggregate the observation space.
Optionally aggregate the observation space (e.g. hourly → daily) by applying the same aggregation to the obs vector, forward operator, model-data mismatch covariance, and constant term.
- Return type:
tuple[Vector,ForwardOperator,CovarianceMatrix,Vector|None]
- run(**kwargs)[source]#
Execute the standard inversion workflow.
- Return type:
TypeVar(_Problem, bound=InverseProblem)
Flux Inversion#
- class fips.problems.flux.pipeline.FluxInversionPipeline(config)[source]#
Abstract pipeline for atmospheric flux inversions.
- config#
Configuration object containing pipeline settings.
- Type:
Any
- problem#
The solved flux inversion problem (available after calling run()).
- Type:
FluxProblem
- get_obs()#
Get observation vector (abstract, must be implemented by subclass).
- get_prior()#
Get prior state vector (abstract, must be implemented by subclass).
- filter_state_space(obs, prior)[source]#
Filter state space by removing intervals with insufficient observations.
- get_forward_operator(obs, prior)#
Get forward operator matrix (abstract, must be implemented by subclass).
- get_prior_error(prior)#
Get prior error covariance matrix (abstract, must be implemented by subclass).
- get_modeldata_mismatch(obs)#
Get model-data mismatch covariance matrix (abstract, must be implemented by subclass).
- get_constant(obs)#
Get optional constant offset vector.
- aggregate_obs_space(obs, forward_operator, modeldata_mismatch, constant)#
Aggregate the observation space.
- get_inputs()#
Gather all input components for the inverse problem.
- problem: FluxProblem#
- filter_state_space(obs, prior)[source]#
Filter state space by removing intervals with insufficient observations.
- class fips.problems.flux.visualization.FluxPlotter(inversion)[source]#
Plotting interface for FluxInversion results.
Provides methods for visualizing prior/posterior fluxes and concentration timeseries.
- inversion#
The flux inversion problem to visualize.
- Type:
FluxProblem
- fluxes(time='mean', truth=None, x_dim='lon', y_dim='lat', time_dim='time', sites=False, sites_kwargs=None, **kwargs)[source]#
Plot prior and posterior flux maps.
- concentrations(location=None, location_dim='obs_location', **kwargs)[source]#
Plot observed, prior, and posterior concentrations.
- __init__(inversion)[source]#
Initialize with a FluxInversion instance.
- Parameters:
inversion (FluxInversion) – The inverse problem to visualize.
- fluxes(time='mean', truth=None, x_dim='lon', y_dim='lat', time_dim='time', sites=False, sites_kwargs=None, **kwargs)[source]#
Plot prior and posterior flux maps.
- Parameters:
time (str, int, or pd.Timestamp, default 'mean') – Time to plot: ‘mean’, ‘std’, time index, or timestamp.
truth (pd.Series, optional) – Truth fluxes for comparison.
x_dim (str, default 'lon') – Name of the x-coordinate dimension.
y_dim (str, default 'lat') – Name of the y-coordinate dimension.
time_dim (str, default 'time') – Name of the time dimension.
sites (bool or dict, optional) – Site locations to overlay: dict mapping site IDs to (lat, lon).
sites_kwargs (dict, optional) – Additional plotting kwargs for site markers.
**kwargs – Additional arguments passed to xarray plotting.
- Returns:
fig, axes – The created figure and axes.
- Return type:
matplotlib Figure and Axes
Utilities#
convolve#
- fips.convolve(state, forward_operator, round_index=None, verify_overlap=True)[source]#
Convolve a state vector with a forward operator matrix.
- Parameters:
state (Vector, pd.Series, or np.ndarray) – State vector to convolve. Can be a Vector, Series, or 1D array.
forward_operator (ForwardOperator or pd.DataFrame) – Forward operator matrix to convolve with. Can be a ForwardOperator or a DataFrame.
round_index (int, optional) – Number of decimal places to round indices for alignment. If None, no rounding is done.
verify_overlap (bool, optional) – Whether to verify that the state index overlaps with the operator’s state index. Defaults to True.
- Returns:
The convolved observation vector.
- Return type:
pd.Series
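Conceptually, convolving a state through a forward operator is an index-aligned matrix-vector product. The pandas sketch below shows that idea for flat float indices only: it optionally rounds both indices, checks that the state covers the operator's columns, and multiplies. The function name `convolve_sketch` and its `decimals` argument are hypothetical; the real fips.convolve also handles Vectors, MultiIndexes, and NumPy input, and its internals may differ.

```python
import numpy as np
import pandas as pd


def convolve_sketch(state: pd.Series, H: pd.DataFrame, decimals=None) -> pd.Series:
    """Align a state Series to the operator's columns, then multiply (illustrative)."""
    if decimals is not None:
        # Round both float indices so near-identical coordinates match exactly.
        state = state.copy()
        H = H.copy()
        state.index = np.round(state.index.to_numpy(dtype=float), decimals)
        H.columns = np.round(H.columns.to_numpy(dtype=float), decimals)
    missing = H.columns.difference(state.index)
    if len(missing) > 0:
        raise ValueError(f"state is missing {len(missing)} operator columns")
    # Reorder the state to the operator's column order before the product.
    return H.dot(state.reindex(H.columns))


H = pd.DataFrame(
    [[1.0, 0.5], [0.0, 2.0]],
    index=pd.Index(["obs_a", "obs_b"], name="obs"),
    columns=pd.Index([10.1, 10.2], name="lon"),
)
# The state is stored in a different order and with a tiny coordinate error.
state = pd.Series([2.0, 1.0], index=pd.Index([10.2000000001, 10.1], name="lon"))
modeled = convolve_sketch(state, H, decimals=6)
print(modeled)
```

Without rounding, the coordinate 10.2000000001 would not match the operator column 10.2, which is the situation the round_index argument is meant to handle.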
Indexes#
Index validation and manipulation utilities.
This module provides utilities for checking index overlap, promoting indices, and sanitizing index types for consistent handling across data structures.
- fips.indexes.apply_to_index(func)[source]#
Apply a single-index function to each level of a MultiIndex.
- Parameters:
func (function) – A function that takes a single pd.Index and returns a modified pd.Index. This function will be applied to each level of a MultiIndex, or directly to a single Index.
- Returns:
A wrapper function that applies the given function to each level of a MultiIndex or to a single Index.
- Return type:
function
- fips.indexes.assign_block(index, block)[source]#
Assign or overwrite a ‘block’ level in the index.
- Parameters:
index (pd.Index) – The original index to which the block level will be assigned.
block (str) – The block name to assign to the index.
- Returns:
A new index with the ‘block’ level assigned to the specified block name.
- Return type:
pd.Index
- fips.indexes.outer_align_levels(dfs, axis=0, fill_value=nan)[source]#
Align MultiIndexes by performing an OUTER JOIN on level names.
Strictly preserves the order of appearance (First-Seen Priority).
- Parameters:
dfs (list of pd.DataFrame) – The DataFrames to align.
axis (int or 'both', default 0) – The axis along which to align the DataFrames. 0 or ‘index’ for row alignment, 1 or ‘columns’ for column alignment, ‘both’ for both axes.
fill_value (scalar, default np.nan) – The value to use for missing entries after alignment. By default, missing entries are filled with NaN.
- Returns:
A list of DataFrames with aligned MultiIndexes along the specified axis.
- Return type:
list of pd.DataFrame
- fips.indexes.overlaps(target_idx, available_idx)[source]#
Check if target index overlaps with available index.
Returns True if fully covered, ‘partial’ if partially covered, and False if no overlap.
- Parameters:
target_idx (pd.Index) – The index we want to check for coverage.
available_idx (pd.Index) – The index that represents available data.
- Returns:
True if target_idx is fully covered by available_idx, ‘partial’ if partially covered, and False if no overlap.
- Return type:
bool or ‘partial’
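The three-way result (True, 'partial', False) can be sketched with a membership test. The function `overlaps_sketch` below is a hypothetical stand-in written for this example; it follows the documented return values but is not the fips implementation.

```python
import pandas as pd


def overlaps_sketch(target_idx: pd.Index, available_idx: pd.Index):
    """Return True, 'partial', or False depending on how much of target is covered."""
    covered = target_idx.isin(available_idx)   # boolean array, one entry per target label
    if covered.all():
        return True
    if covered.any():
        return "partial"
    return False


available = pd.Index([1, 2, 3])
full = overlaps_sketch(pd.Index([1, 2]), available)
partial = overlaps_sketch(pd.Index([1, 4]), available)
none = overlaps_sketch(pd.Index([5, 6]), available)
print(full, partial, none)
```

Because 'partial' is a truthy string, callers that need to distinguish full from partial coverage should compare against True explicitly.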
- fips.indexes.round_index(index, decimals)[source]#
Round float indices to specified decimals.
- Parameters:
index (pd.Index) – The index to round.
decimals (int) – The number of decimal places to round to.
- Returns:
A new index with float values rounded to the specified number of decimals. Non-float indices are returned unchanged.
- Return type:
pd.Index
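A minimal sketch of the behavior described above, assuming a flat (single-level) index: float indices are rounded and anything else is returned unchanged. The name `round_index_sketch` is hypothetical, and applying it per level of a MultiIndex (as apply_to_index is described as doing) is not shown.

```python
import numpy as np
import pandas as pd


def round_index_sketch(index: pd.Index, decimals: int) -> pd.Index:
    """Round float index values; return non-float indices unchanged (illustrative)."""
    if pd.api.types.is_float_dtype(index.dtype):
        return pd.Index(np.round(index.to_numpy(), decimals), name=index.name)
    return index


lon = pd.Index([-111.8500001, -111.8499999], name="lon")
rounded = round_index_sketch(lon, 4)

sites = pd.Index(["a", "b"], name="site")
unchanged = round_index_sketch(sites, 4)
print(list(rounded), list(unchanged))
```

After rounding, the two longitudes collapse to the same label, which is what allows coordinates from different sources to align.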
Kernels#
Covariance kernel functions.
This module provides kernel functions for generating covariance matrices, such as exponential decay, constant correlation, and other spatial/temporal correlation structures.
- fips.kernels.RaggedTimeDecay(time_dim, scale, decay_func=<function _exponential_decay>)[source]#
Create a ragged temporal decay kernel.
Defaults to exponential decay, but can accept any math function.
- Parameters:
time_dim (str) – Name of the time dimension in the input DataFrame.
scale (str or pd.Timedelta) – Scale parameter for the decay function. If a string is provided, it will be converted to a pd.Timedelta.
decay_func (callable, optional) – A function that takes a distance matrix and a scale parameter and returns a decay matrix. Defaults to the exponential decay function.
- Returns:
A kernel function that can be applied to a DataFrame to compute the temporal decay matrix based on the specified time dimension and scale.
- Return type:
function
- fips.kernels.GridTimeDecay(scale, decay_func=<function _exponential_decay>)[source]#
Create a grid temporal decay kernel.
- Parameters:
scale (str or pd.Timedelta) – Scale parameter for the decay function. If a string is provided, it will be converted to a pd.Timedelta.
decay_func (callable, optional) – A function that takes a distance matrix and a scale parameter and returns a decay matrix. Defaults to the exponential decay function.
- Returns:
A kernel function that can be applied to a DataFrame to compute the temporal decay matrix based on the specified scale, treating all time points as part of a single grid (i.e., not grouped by any time dimension).
- Return type:
function
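The decay kernels take a distance matrix and a scale. As an illustration, the sketch below assumes the exponential form exp(-d / scale), which is a common choice but is an assumption here since the body of `_exponential_decay` is not shown. Timestamps and the 24-hour scale are made up.

```python
import numpy as np


def exponential_decay(dist: np.ndarray, scale: float) -> np.ndarray:
    """Assumed form of an exponential decay: exp(-distance / scale)."""
    return np.exp(-dist / scale)


times = np.array(
    ["2024-01-01T00", "2024-01-01T12", "2024-01-03T00"],
    dtype="datetime64[h]",
)
# Pairwise absolute time separation, converted to float hours.
dt_hours = np.abs(times[:, None] - times[None, :]) / np.timedelta64(1, "h")

scale_hours = 24.0
corr = exponential_decay(dt_hours, scale_hours)
print(np.round(corr, 3))
```

The result is a symmetric correlation matrix with ones on the diagonal; scaling it by variances would give a temporal covariance block.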
- fips.kernels.GridSpatialDecay(lat_dim, lon_dim, scale, decay_func=<function _exponential_decay>)[source]#
Create a grid spatial decay kernel (Haversine).
- Parameters:
lat_dim (str) – Name of the latitude dimension in the input DataFrame.
lon_dim (str) – Name of the longitude dimension in the input DataFrame.
scale (float) – Scale parameter for the decay function, in the same units as the distance matrix (e.g., kilometers if using Haversine distance).
decay_func (callable, optional) – A function that takes a distance matrix and a scale parameter and returns a decay matrix. Defaults to the exponential decay function.
- Returns:
A kernel function that can be applied to a DataFrame to compute the spatial decay matrix based on the specified latitude and longitude dimensions and scale, using Haversine distance for spatial separation.
- Return type:
function
Filters#
Data filtering and selection utilities.
This module provides functions for filtering observations and state vectors based on various criteria, such as data density, time intervals, and quality control thresholds.
- fips.filters.enough_obs_per_interval(index, intervals, threshold, level=None)[source]#
Determine which observations have enough data points per time interval.
- Parameters:
- Returns:
Boolean mask indicating which observations meet the threshold.
- Return type:
Metrics#
Distance and similarity metrics.
This module provides functions for calculating distances and similarities between data points, such as Haversine distance for geographic coordinates and time differences for temporal data.
- fips.metrics.haversine_matrix(lats, lons, earth_radius=None, deg=True)[source]#
Calculate the pairwise Haversine distance matrix between a set of coordinates.
- Parameters:
lats (array-like) – 1D array-like of latitude coordinates, in degrees if deg=True, otherwise in radians.
lons (array-like) – 1D array-like of longitude coordinates, in degrees if deg=True, otherwise in radians.
earth_radius (float, optional) – Radius of the Earth in kilometers. If None (the default), 6371.0 km is used.
deg (bool, optional) – If True, input coordinates are in degrees and are converted to radians. If False, input coordinates are assumed to be in radians. Default is True.
- Returns:
A 2D NumPy array (matrix) where the element at (i, j) is the Haversine distance between the i-th and j-th coordinate. The diagonal of the matrix will be zero.
- Return type:
np.ndarray
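For reference, the standard Haversine formula can be vectorized with NumPy broadcasting as in the sketch below. This is an independent illustration, not the fips source; the function name `haversine_matrix_sketch` is hypothetical and the default radius is taken from the parameter description above.

```python
import numpy as np


def haversine_matrix_sketch(lats, lons, earth_radius=6371.0, deg=True):
    """Pairwise great-circle distances, in the units of earth_radius."""
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    if deg:
        lats, lons = np.radians(lats), np.radians(lons)
    # Pairwise coordinate differences via broadcasting.
    dlat = lats[:, None] - lats[None, :]
    dlon = lons[:, None] - lons[None, :]
    a = (
        np.sin(dlat / 2) ** 2
        + np.cos(lats)[:, None] * np.cos(lats)[None, :] * np.sin(dlon / 2) ** 2
    )
    # Clip guards against tiny floating-point overshoot outside [0, 1].
    return 2 * earth_radius * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# Two points on the equator 90 degrees apart, plus the North Pole.
D = haversine_matrix_sketch([0.0, 0.0, 90.0], [0.0, 90.0, 0.0])
```

Each pair in this example is a quarter of a great circle apart, so every off-diagonal entry equals π/2 times the radius.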
Aggregators#
Data aggregation utilities for inverse problems.
This module provides functions for aggregating and integrating data over time and space, particularly useful for processing observations and state vectors into compatible resolutions.
- fips.aggregators.integrate_over_time_bins(data, time_bins, time_dim='time')[source]#
Integrate data over time bins.
- Parameters:
data (pd.DataFrame | pd.Series) – Data to integrate.
time_bins (pd.IntervalIndex) – Time bins for integration.
time_dim (str, optional) – Time dimension name, by default ‘time’
- Returns:
Integrated data. The bin labels are set to the left edge of the bin.
- Return type:
pd.DataFrame | pd.Series
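The binning behaviour described above can be sketched with plain pandas. The example assumes that "integrate" means summing the values that fall in each bin and that the data has a single-level datetime index; neither detail is confirmed here, and `integrate_over_bins_sketch` is a hypothetical name rather than the library function.

```python
import pandas as pd


def integrate_over_bins_sketch(series, time_bins):
    """Sum a time-indexed Series within each interval of time_bins."""
    # Position of the bin containing each timestamp; -1 means "no bin".
    pos = time_bins.get_indexer(series.index)
    inside = pos >= 0
    summed = series[inside].groupby(pos[inside]).sum()
    # Label each populated bin by its left edge.
    summed.index = time_bins.left[summed.index.to_numpy()]
    summed.index.name = series.index.name
    return summed


times = pd.date_range("2024-01-01", periods=6, freq="h")
series = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=times)
bins = pd.interval_range(
    start=pd.Timestamp("2024-01-01"), periods=2, freq="3h", closed="left"
)
result = integrate_over_bins_sketch(series, bins)
```

Here the six hourly values collapse into two 3-hour totals labelled 00:00 and 03:00. Timestamps outside every bin are dropped in this sketch.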
- class fips.aggregators.ObsAggregator(by=None, level=None, freq=None, func='mean', blocks=None)[source]#
Aggregates the observation space of an inverse problem.
Builds a sparse (n_agg x n_obs) weight matrix W and applies it to each component of the problem:
    z_agg = W @ z  # aggregated observations
    H_agg = W @ H  # aggregated forward operator
    S_z_agg = W @ S_z @ W.T  # covariance propagation
    c_agg = W @ c  # aggregated constant (if vector)
For func='mean' each non-zero entry in row i equals 1/nᵢ (the reciprocal of the group size), so W @ z yields group means and W @ S_z @ W.T scales variances by 1/nᵢ². For func='sum' every entry is 1. Only 'mean' and 'sum' are supported because other functions do not have a well-defined covariance propagation rule.
Grouping interface#
Exactly one of by or level must be provided:
- by: an index level name (str), a list of level names, or a callable that accepts the obs pd.Index and returns group labels.
- level + freq: resample a datetime index level at the given pandas offset alias (e.g. level='obs_time', freq='D'). All other index levels are preserved as exact-match grouping keys.
When the obs index has a 'block' level it is always prepended as a grouping key, ensuring observations from different blocks are never merged.
Partial aggregation#
blocks restricts aggregation to the named block(s). Observations belonging to other blocks are passed through unchanged via identity rows in W, so the returned arrays cover the full observation space.
- apply(obs, forward_operator, modeldata_mismatch, constant)[source]#
Apply the aggregation to the inverse problem components.
- __init__(by=None, level=None, freq=None, func='mean', blocks=None)[source]#
Initialize the ObsAggregator.
- Parameters:
by (str | list[str] | Callable, optional) – Explicit grouping specification. Mutually exclusive with level.
level (str, optional) – Index level to group / resample. Requires either a matching level name in the obs index or use alongside freq.
freq (str, optional) – Pandas offset alias for resampling level (e.g. 'D', 'h').
func ({'mean', 'sum'}) – Aggregation function. Default 'mean'.
blocks (str | list[str], optional) – Block name(s) to aggregate. Unlisted blocks pass through as-is.
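The weight matrix W from the class description can be illustrated with a small dense example. This is a sketch of the stated mathematics only, not the library's construction: `build_weight_matrix` is a hypothetical helper, and a dense NumPy array stands in for the sparse matrix the class is said to build.

```python
import numpy as np
import pandas as pd


def build_weight_matrix(labels, func="mean"):
    """Hypothetical helper: dense (n_agg x n_obs) aggregation matrix."""
    codes, groups = pd.Index(labels).factorize(sort=True)
    n_agg, n_obs = len(groups), len(codes)
    W = np.zeros((n_agg, n_obs))
    # func='sum': a 1 in row i for every observation in group i.
    W[codes, np.arange(n_obs)] = 1.0
    if func == "mean":
        # func='mean': divide each row by its group size, giving 1/n_i.
        W /= W.sum(axis=1, keepdims=True)
    elif func != "sum":
        raise ValueError("only 'mean' and 'sum' are supported")
    return W, groups


labels = ["a", "a", "b"]            # group label of each observation
z = np.array([1.0, 3.0, 10.0])      # observations
S_z = np.diag([4.0, 4.0, 9.0])      # independent error variances

W, groups = build_weight_matrix(labels, func="mean")
z_agg = W @ z                       # group means
S_z_agg = W @ S_z @ W.T             # propagated covariance
```

In this example the two observations in group "a", each with variance 4, average to a value with variance (4 + 4) / 2² = 2, while the single observation in group "b" keeps its variance of 9.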
- apply(obs, forward_operator, modeldata_mismatch, constant=None)[source]#
Apply W to the inverse problem components.
Inputs may be bare pandas objects or fips wrapper types (Vector, ForwardOperator, CovarianceMatrix); return types mirror the inputs. See the class docstring for the mathematical transforms.
The aggregator ensures all inputs are properly aligned to obs.index before building the weight matrix W.
- Parameters:
obs (pd.Series | Block | Vector) – Observation vector to be aggregated.
forward_operator (pd.DataFrame | MatrixBlock | Matrix) – Forward operator matrix to be aggregated.
modeldata_mismatch (pd.DataFrame | MatrixBlock | Matrix) – Model-data mismatch covariance matrix to be aggregated.
constant (float | pd.Series | Block | Vector | None, optional) – Optional constant offset vector to be aggregated. Scalars are invariant to aggregation. Default is None.
- Returns:
Aggregated (obs, forward_operator, modeldata_mismatch, constant) in the same types as the inputs.
- Return type: