Utilities

CalibrateEmulateSample.Utilities.CanonicalCorrelationType
struct CanonicalCorrelation{VV1, VV2, VV3, FT, VV4} <: CalibrateEmulateSample.Utilities.PairedDataContainerProcessor

Uses both input and output data to learn a subspace of maximal correlation between inputs and outputs. The subspace for a pair (X,Y) will be of size minimum(rank(X),rank(Y)), and is computed using an SVD-based method; see, e.g., https://numerical.recipes/whp/notes/CanonCorrBySVD.pdf

Preferred construction is with the canonical_correlation method.
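
To make the SVD-based construction concrete, here is a minimal sketch using only the Julia standard library. It is not the package's implementation: the function name cca_by_svd, the samples-as-columns layout, and the returned values are all assumptions made for illustration.

```julia
using LinearAlgebra, Statistics, Random

# Assumption: columns are samples, so X is (d_x by N) and Y is (d_y by N).
function cca_by_svd(X::AbstractMatrix, Y::AbstractMatrix)
    Xc = X .- mean(X, dims = 2)   # center each input dimension
    Yc = Y .- mean(Y, dims = 2)   # center each output dimension
    Fx, Fy = svd(Xc), svd(Yc)
    rx, ry = rank(Xc), rank(Yc)
    # SVD of the product of the right singular vectors: its singular values
    # are the canonical correlations, and there are min(rx, ry) of them.
    Fm = svd(Fx.Vt[1:rx, :] * Fy.Vt[1:ry, :]')
    enc_x = Fm.U' * Diagonal(1 ./ Fx.S[1:rx]) * Fx.U[:, 1:rx]'
    enc_y = Fm.V' * Diagonal(1 ./ Fy.S[1:ry]) * Fy.U[:, 1:ry]'
    return enc_x, enc_y, Fm.S, Xc, Yc
end

rng = MersenneTwister(42)
X = randn(rng, 3, 200)
Y = [X[1:1, :] .+ 0.1 .* randn(rng, 1, 200); randn(rng, 1, 200)]

enc_x, enc_y, rho, Xc, Yc = cca_by_svd(X, Y)
Zx = enc_x * Xc   # encoded inputs, (2 by 200)
Zy = enc_y * Yc   # encoded outputs, (2 by 200)
```

In this sketch the encoded variables satisfy Zx * Zx' ≈ I and Zx * Zy' ≈ Diagonal(rho), which is the defining property of canonical variables; the encoded dimension is min(rank(Xc), rank(Yc)).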

Fields

  • data_mean::Any: storage for the input or output data mean

  • encoder_mat::Any: the encoding matrix of input or output canonical correlations

  • decoder_mat::Any: the decoding matrix of input or output canonical correlations

  • retain_var::Any: the fraction of variance to be retained after truncating singular values (1 implies no truncation)

  • apply_to::Any: Stores whether this is an input or output encoder (vector with string "in" or "out")

source
CalibrateEmulateSample.Utilities.DecorrelatorType
struct Decorrelator{VV1, VV2, VV3, FT, NT<:NamedTuple, AS<:AbstractString} <: CalibrateEmulateSample.Utilities.DataContainerProcessor

Decorrelates the data by taking an SVD and projecting onto the singular vectors.

Preferred construction is with one of the following methods:

For decorrelate_structure_mat: The SVD is taken over a structure matrix (e.g., prior_cov for inputs, obs_noise_cov for outputs). The structure matrix will become exactly I after processing.

For decorrelate_sample_cov: The SVD is taken over the estimated covariance of the data. The data samples will have a Normal(0,I) distribution after processing.

For decorrelate(;decorrelate_with="combined") (default): The SVD is taken over the sum of the structure matrix and the estimated covariance. This may be more robust to misspecification of the structure matrix, or to poor estimation of the sample covariance.

Depending on the size of the matrix, a different SVD routine is used:

  • Small matrix (dim < 3000): use LinearAlgebra.svd(Matrix)
  • Large matrix (dim > 3000), retain_var = 1.0: use LowRankApprox.psvd(LinearMap; psvd_kwargs...)
  • Large matrix (dim > 3000), retain_var < 1.0: use TSVD.tsvd(LinearMap)
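
As an illustration of the small-matrix case, the following sketch builds encoder and decoder matrices from a dense SVD of the sample covariance. The helper svd_decorrelator, the samples-as-columns layout, and the rule used to pick the truncation rank from retain_var are assumptions for illustration, not the package's code.

```julia
using LinearAlgebra, Statistics, Random

# Build encoder/decoder matrices from the SVD of a symmetric positive-definite
# matrix C, keeping the leading singular vectors whose singular values sum to
# at least a fraction retain_var of the total (assumed truncation rule).
function svd_decorrelator(C::AbstractMatrix; retain_var = 1.0)
    F = svd(Matrix(C))
    frac = cumsum(F.S) ./ sum(F.S)
    k = retain_var >= 1.0 ? length(F.S) :
        something(findfirst(>=(retain_var), frac), length(F.S))
    sqrt_S = sqrt.(F.S[1:k])
    encoder = Diagonal(1 ./ sqrt_S) * F.U[:, 1:k]'
    decoder = F.U[:, 1:k] * Diagonal(sqrt_S)
    return encoder, decoder
end

rng = MersenneTwister(1)
data = [2.0 0.5; 0.5 1.0] * randn(rng, 2, 500)   # columns are samples
data_mean = mean(data, dims = 2)
C = cov(data, dims = 2)                          # estimated covariance

encoder, decoder = svd_decorrelator(C)
encoded = encoder * (data .- data_mean)          # sample covariance is now I
decoded = decoder * encoded .+ data_mean         # recovers the original data

# Truncated variant: keep only the leading direction.
encoder_1d, decoder_1d = svd_decorrelator(C; retain_var = 0.5)
```

Passing a structure matrix (or the sum of a structure matrix and the sample covariance) in place of C would mimic the other two modes under the same assumptions.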

Fields

  • data_mean::Any: storage for the data mean

  • encoder_mat::Any: the matrix used to perform encoding

  • decoder_mat::Any: the inverse of the matrix used to perform encoding

  • retain_var::Any: the fraction of variance to be retained after truncating singular values (1 implies no truncation)

  • n_totvar_samples::Int64: when retain_var < 1, number of samples to estimate the total variance. Larger values reduce the error in approximation at the cost of additional matrix-vector products.

  • max_rank::Int64: maximum dimension of the subspace when retain_var < 1. The search may become expensive at large ranks, and can therefore be cut off in this way

  • psvd_kwargs::NamedTuple: when retain_var = 1, the psvd algorithm from LowRankApprox.jl is used to decorrelate the space. Here, kwargs can be passed in as a NamedTuple

  • decorrelate_with::AbstractString: Switch to choose what form of matrix to use to decorrelate the data

  • structure_mat_name::Union{Nothing, Symbol}: When given, use the structure matrix by this name if decorrelate_with uses structure matrices. When nothing, try to use the only present structure matrix instead.

source
CalibrateEmulateSample.Utilities.ElementwiseScalerType
struct ElementwiseScaler{T, VV<:(AbstractVector), VV2<:(AbstractVector), VV3<:(AbstractVector), VV4<:(AbstractVector), VV5<:(AbstractVector)} <: CalibrateEmulateSample.Utilities.DataContainerProcessor

The ElementwiseScaler{T} will create an encoding of the data_container via elementwise affine transformations.

Different methods T will build different transformations (see the constructors zscore_scale, minmax_scale and quartile_scale), and are accessed with get_type.

source
CalibrateEmulateSample.Utilities.canonical_correlationMethod
canonical_correlation(
;
    retain_var
) -> CalibrateEmulateSample.Utilities.CanonicalCorrelation{Vector{Any}, Vector{Any}, Vector{Any}, Float64, Vector{AbstractString}}

Constructs the CanonicalCorrelation struct. Users can optionally provide the keyword:

  • retain_var[=1.0]: to project onto the leading singular vectors (of the input-output product) such that retain_var variance is retained.
source
CalibrateEmulateSample.Utilities.create_compact_linear_mapMethod
create_compact_linear_map(
    A;
    svd_dim_max,
    psvd_or_tsvd,
    tsvd_max_rank,
    psvd_kwargs
) -> LinearMaps.FunctionMap{Float64, CalibrateEmulateSample.Utilities.var"#6#12"{Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}}, CalibrateEmulateSample.Utilities.var"#8#14"{Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}}}

Produces a linear map of type LinearMap that can evaluate the stacked actions of the structure matrix in compact form, e.g., by calling linear_map.f(x) or linear_map.fc(x) to obtain Ax or A'x. This particular type can be used by packages like TSVD.jl or IterativeSolvers.jl for further computations.

This compact map constructs the linear map f as follows:

  1. get the compact svd-plus-d form "USVt + D" of the blocks
  2. create f by stacking A.U * A.S * A.Vt * xblock + A.D * xblock for (A,xblock) in (As, x)
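
The two steps above can be sketched with plain closures in place of a LinearMaps.FunctionMap. The NamedTuple block layout, the helper names, and the assumption that every block is square are all hypothetical choices for illustration.

```julia
using LinearAlgebra, Random

# Action of one block stored as "U*S*Vt + D", and of its adjoint.
apply_block(A, x) = A.U * (A.S * (A.Vt * x)) + A.D * x
apply_block_adjoint(A, x) = A.Vt' * (A.S * (A.U' * x)) + A.D * x

# Split x into consecutive pieces (one per block), apply each block, and stack.
function stacked_action(apply, As, x)
    out = Float64[]
    start = 1
    for A in As
        n = size(A.D, 1)                       # blocks are assumed square
        append!(out, apply(A, x[start:(start + n - 1)]))
        start += n
    end
    return out
end

rng = MersenneTwister(0)
make_block(n, r) = (
    U = randn(rng, n, r),
    S = Diagonal(rand(rng, r)),
    Vt = randn(rng, r, n),
    D = Diagonal(rand(rng, n)),
)
As = [make_block(4, 2), make_block(3, 1)]

f = x -> stacked_action(apply_block, As, x)            # computes A * x
fc = x -> stacked_action(apply_block_adjoint, As, x)   # computes A' * x
```

The dense matrix is never formed here: each product costs only a few thin matrix-vector multiplications per block, which is the point of keeping the compact form.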

kwargs:

When computing the svd internally from an abstract matrix

  • svd_dim_max=3000: this switches to an approximate svd approach when applying to covariance matrices above dimension 3000
  • psvd_or_tsvd="psvd": use psvd or tsvd for approximating svd for large matrices
  • tsvd_max_rank=50: when using tsvd, what max rank to use. high rank = higher accuracy
  • psvd_kwargs=(; rtol=1e-2): when using psvd, what kwargs to pass. lower rtol = higher accuracy

Recommended progression (from quick and inaccurate to slow and more accurate):

  • very large matrices - start with tsvd at very low rank, then increase the rank
  • mid-size matrices - start with psvd at very high rtol, then decrease rtol
source
CalibrateEmulateSample.Utilities.create_encoder_scheduleMethod
create_encoder_schedule(
    schedule_in::AbstractVector
) -> Vector{Any}

Creates a flattened encoder schedule from the user's proposed schedule of the form:

enc_schedule = [
    (DataProcessor1(...), "in"), 
    (DataProcessor2(...), "out"), 
    (PairedDataProcessor3(...), "in"), 
    (DataProcessor4(...), "in_and_out"), 
]

This function creates an equivalent encoder schedule that is also machine readable, e.g.,

enc_schedule = [
    (DataProcessor1(...), "in"), 
    (DataProcessor2(...), "out"), 
    (PairedDataProcessor3(...),"in"), 
    (DataProcessor4(...), "in"),
    (DataProcessor4(...), "out"), 
]

The decoder schedule is a copy of the encoder schedule, reversed (with the processors copied).
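
The flattening can be sketched as follows. Symbols stand in for processor objects, and the function name flatten_schedule is hypothetical; the real function may perform additional validation that is not shown here.

```julia
# Expand every "in_and_out" entry into one "in" entry and one "out" entry.
function flatten_schedule(schedule)
    flat = Tuple{Any, String}[]
    for (processor, apply_to) in schedule
        if apply_to == "in_and_out"
            push!(flat, (deepcopy(processor), "in"))
            push!(flat, (deepcopy(processor), "out"))
        else
            push!(flat, (processor, apply_to))
        end
    end
    return flat
end

# Symbols are placeholders for processor objects.
proposed = [(:p1, "in"), (:p2, "out"), (:p3, "in"), (:p4, "in_and_out")]
flat = flatten_schedule(proposed)

# The decoder schedule is a reversed copy of the encoder schedule.
decoder_schedule = reverse(deepcopy(flat))
```

Copying the processor for each direction matters because each copy is initialized on different data (inputs or outputs) and therefore ends up holding different state.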

source
CalibrateEmulateSample.Utilities.decode_structure_matrixMethod
decode_structure_matrix(
    cc::CalibrateEmulateSample.Utilities.CanonicalCorrelation,
    enc_structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Any

Apply the CanonicalCorrelation decoder to a provided structure matrix

source
CalibrateEmulateSample.Utilities.decode_structure_matrixMethod
decode_structure_matrix(
    dd::CalibrateEmulateSample.Utilities.Decorrelator,
    enc_structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Any

Apply the Decorrelator decoder to a provided structure matrix. If the structure matrix is a LinearMap, then the decoded structure matrix remains a LinearMap.

source
CalibrateEmulateSample.Utilities.decode_structure_matrixMethod
decode_structure_matrix(
    es::CalibrateEmulateSample.Utilities.ElementwiseScaler,
    enc_structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Any

Apply the ElementwiseScaler decoder to a provided structure matrix. If the structure matrix is a LinearMap, then the decoded structure matrix remains a LinearMap.

source
CalibrateEmulateSample.Utilities.decode_with_scheduleMethod
decode_with_schedule(
    encoder_schedule::AbstractVector,
    data_container::EnsembleKalmanProcesses.DataContainers.DataContainer,
    in_or_out::AbstractString
) -> EnsembleKalmanProcesses.DataContainers.DataContainer

Takes in an already initialized encoder schedule and decodes a DataContainer. The in_or_out string indicates whether the data is input ("in") or output ("out") data, and thus which processors are used to decode it.

source
CalibrateEmulateSample.Utilities.decode_with_scheduleMethod
decode_with_schedule(
    encoder_schedule::AbstractVector,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector},
    in_or_out::AbstractString
) -> Any

Takes in an already initialized encoder schedule and decodes a structure matrix. The in_or_out string indicates whether the structure matrix is for the input ("in") or output ("out") space, and thus which processors are used to decode it.

source
CalibrateEmulateSample.Utilities.decode_with_scheduleMethod
decode_with_schedule(
    encoder_schedule::AbstractVector,
    io_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer,
    input_structure_mat::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector},
    output_structure_mat::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Tuple{EnsembleKalmanProcesses.DataContainers.PairedDataContainer, Any, Any}

Takes in an already initialized encoder schedule and decodes a PairedDataContainer, together with the input and output structure matrices. The decoded paired data and the two decoded structure matrices are returned as a tuple.

source
CalibrateEmulateSample.Utilities.decorrelateMethod
decorrelate(
;
    retain_var,
    decorrelate_with,
    structure_mat_name,
    n_totvar_samples,
    max_rank,
    psvd_kwargs
) -> CalibrateEmulateSample.Utilities.Decorrelator{Vector{Any}, Vector{Any}, Vector{Any}, Float64, @NamedTuple{rtol::Float64}, String}

Constructs the Decorrelator struct. Users can add optional keyword arguments:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
  • decorrelate_with[="combined"]: the matrix from which the subspace directions are taken; the options are "structure_mat", "sample_cov", and "combined"
source
CalibrateEmulateSample.Utilities.decorrelate_sample_covMethod
decorrelate_sample_cov(
;
    retain_var,
    n_totvar_samples,
    max_rank,
    psvd_kwargs
) -> CalibrateEmulateSample.Utilities.Decorrelator{Vector{Any}, Vector{Any}, Vector{Any}, Float64, @NamedTuple{rtol::Float64}, String}

Constructs the Decorrelator struct, setting decorrelate_with = "sample_cov". Encoding data with this will ensure that the distribution of data samples after encoding will be Normal(0,I). One can additionally add keywords:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
source
CalibrateEmulateSample.Utilities.decorrelate_structure_matMethod
decorrelate_structure_mat(
;
    retain_var,
    structure_mat_name,
    n_totvar_samples,
    max_rank,
    psvd_kwargs
) -> CalibrateEmulateSample.Utilities.Decorrelator{Vector{Any}, Vector{Any}, Vector{Any}, Float64, @NamedTuple{rtol::Float64}, String}

Constructs the Decorrelator struct, setting decorrelate_with = "structure_mat". This encoding will transform a provided structure matrix into I. One can additionally add keywords:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
source
CalibrateEmulateSample.Utilities.encode_structure_matrixMethod
encode_structure_matrix(
    cc::CalibrateEmulateSample.Utilities.CanonicalCorrelation,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Any

Apply the CanonicalCorrelation encoder to a provided structure matrix

source
CalibrateEmulateSample.Utilities.encode_structure_matrixMethod
encode_structure_matrix(
    dd::CalibrateEmulateSample.Utilities.Decorrelator,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Any

Apply the Decorrelator encoder to a provided structure matrix. If the structure matrix is a LinearMap, then the encoded structure matrix remains a LinearMap.
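
A small worked example of what encoding a structure matrix can look like, assuming the encoder acts by congruence (encoder * Γ * encoder') and the decoder undoes it. This convention is an assumption that is consistent with the statement that the structure matrix becomes I after processing; the variable names are illustrative.

```julia
using LinearAlgebra

structure_mat = [4.0 1.0; 1.0 3.0]   # e.g. a prior or observation-noise covariance
F = svd(structure_mat)
encoder_mat = Diagonal(1 ./ sqrt.(F.S)) * F.U'
decoder_mat = F.U * Diagonal(sqrt.(F.S))

# Encoding whitens the structure matrix; decoding recovers it.
encoded_mat = encoder_mat * structure_mat * encoder_mat'
decoded_mat = decoder_mat * encoded_mat * decoder_mat'
```

Here encoded_mat is the identity up to round-off, and decoded_mat matches structure_mat.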

source
CalibrateEmulateSample.Utilities.encode_structure_matrixMethod
encode_structure_matrix(
    es::CalibrateEmulateSample.Utilities.ElementwiseScaler,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Any

Apply the ElementwiseScaler encoder to a provided structure matrix. If the structure matrix is a LinearMap, then the encoded structure matrix remains a LinearMap.

source
CalibrateEmulateSample.Utilities.encode_with_scheduleMethod
encode_with_schedule(
    encoder_schedule::AbstractVector,
    data_container::EnsembleKalmanProcesses.DataContainers.DataContainer,
    in_or_out::AbstractString
) -> EnsembleKalmanProcesses.DataContainers.DataContainer

Takes in an already initialized encoder schedule and encodes a DataContainer. The in_or_out string indicates whether the data is input ("in") or output ("out") data, and thus which processors are used to encode it.
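
The way a schedule is applied can be sketched with stand-in processors: encoding applies the matching entries in order, and decoding applies their inverses in reverse order. The affine helper and the encode_sketch / decode_sketch functions are hypothetical, and plain matrices are used in place of DataContainer objects.

```julia
# Hypothetical stand-in for an initialized processor: an elementwise affine map.
affine(shift, scale) = (
    encode = (x -> (x .- shift) ./ scale),
    decode = (z -> z .* scale .+ shift),
)

schedule = [
    (affine(1.0, 2.0), "in"),
    (affine(0.0, 10.0), "out"),
    (affine(-3.0, 0.5), "in"),
]

# Apply, in order, every processor whose tag matches in_or_out.
function encode_sketch(schedule, data, in_or_out)
    for (processor, apply_to) in schedule
        apply_to == in_or_out && (data = processor.encode(data))
    end
    return data
end

# Undo the encoding by applying the inverses in reverse order.
function decode_sketch(schedule, data, in_or_out)
    for (processor, apply_to) in reverse(schedule)
        apply_to == in_or_out && (data = processor.decode(data))
    end
    return data
end

inputs = [1.0 2.0 3.0; 4.0 5.0 6.0]
encoded_inputs = encode_sketch(schedule, inputs, "in")
decoded_inputs = decode_sketch(schedule, encoded_inputs, "in")
```

Only the two "in" entries touch the inputs in this sketch; the "out" entry is skipped, which is why the same schedule can serve both spaces.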

source
CalibrateEmulateSample.Utilities.encode_with_scheduleMethod
encode_with_schedule(
    encoder_schedule::AbstractVector,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector},
    in_or_out::AbstractString
) -> Any

Takes in an already initialized encoder schedule and encodes a structure matrix. The in_or_out string indicates whether the structure matrix is for the input ("in") or output ("out") space, and thus which processors are used to encode it.

source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    obs::EnsembleKalmanProcesses.Observation
) -> NamedTuple{(:obs_noise_cov, :observation), <:Tuple{Any, Any}}

Extracts the relevant encoder kwargs from the observation as a NamedTuple. Contains:

  • :obs_noise_cov as (unbuilt) noise covariance
  • :observation as obs vector
source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    os::EnsembleKalmanProcesses.ObservationSeries
) -> NamedTuple{(:obs_noise_cov, :observation), <:Tuple{Any, Any}}

Extracts the relevant encoder kwargs from the ObservationSeries as a NamedTuple. Assumes the same noise covariance for all observation vectors. Contains:

  • :obs_noise_cov as (unbuilt) noise covariance of FIRST observation
  • :observation as obs vector from all observations
source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution
) -> NamedTuple{(:prior_cov,), <:Tuple{Any}}

Extracts the relevant encoder kwargs from the ParameterDistribution prior. Contains:

  • :prior_cov as prior covariance
source
CalibrateEmulateSample.Utilities.get_training_pointsMethod
get_training_points(
    ekp::EnsembleKalmanProcesses.EnsembleKalmanProcess{FT, IT, P},
    train_iterations::Union{AbstractVector{IT}, IT} where IT
) -> EnsembleKalmanProcesses.DataContainers.PairedDataContainer

Extract the training points needed to train the Gaussian process regression.

  • ekp - EnsembleKalmanProcess holding the parameters and the data that were produced during the Ensemble Kalman (EK) process.
  • train_iterations - Number (or indices) of EK layers/iterations to train on.
source
CalibrateEmulateSample.Utilities.initialize_and_encode_with_schedule!Method
initialize_and_encode_with_schedule!(
    encoder_schedule::AbstractVector,
    io_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer;
    input_structure_mats,
    output_structure_mats,
    input_structure_vecs,
    output_structure_vecs,
    prior_cov,
    obs_noise_cov,
    observation,
    prior_samples_in,
    prior_samples_out
) -> Tuple{EnsembleKalmanProcesses.DataContainers.PairedDataContainer, Dict{Symbol, Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}}, Dict{Symbol, Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}}, Dict{Symbol, Union{AbstractMatrix, AbstractVector}}, Dict{Symbol, Union{AbstractMatrix, AbstractVector}}}

Takes in the created encoder schedule (see create_encoder_schedule), initializes it, and encodes the paired data container and structure matrices with it.

source
CalibrateEmulateSample.Utilities.initialize_processor!Method
initialize_processor!(
    cc::CalibrateEmulateSample.Utilities.CanonicalCorrelation,
    in_data::AbstractMatrix,
    out_data::AbstractMatrix,
    input_structure_matrices,
    output_structure_matrices,
    input_structure_vectors,
    output_structure_vectors,
    apply_to::AbstractString
) -> Any

Computes and populates the data_mean, encoder_mat, decoder_mat and apply_to fields for the CanonicalCorrelation

source
CalibrateEmulateSample.Utilities.initialize_processor!Method
initialize_processor!(
    dd::CalibrateEmulateSample.Utilities.Decorrelator,
    data::AbstractMatrix,
    structure_matrices::Dict{Symbol, SM<:Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}},
    _::Dict{Symbol, SV<:Union{AbstractMatrix, AbstractVector}}
) -> Any

Computes and populates the data_mean and encoder_mat and decoder_mat fields for the Decorrelator

source
CalibrateEmulateSample.Utilities.isequal_linearMethod
isequal_linear(
    A::LinearMaps.LinearMap,
    B::LinearMaps.LinearMap;
    tol,
    n_eval,
    rng,
    up_to_sign
) -> Bool

Tests equality for a LinearMap on a standard basis of the input space. Note that this operation requires a matrix multiply per input dimension so can be expensive.

Kwargs:

  • n_eval (=nothing): the number of basis vectors to compare against (randomly selected without replacement if n_eval < size(A,1))
  • tol (=2*eps()): the tolerance for equality on evaluation per entry
  • rng (=default_rng()): when provided, and n_eval < size(A,1), a random subset of the basis is compared, using this rng
  • up_to_sign (=false): only assess equality up to a sign error (sufficient for, e.g., encoder/decoder matrices)
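
A minimal sketch of this kind of basis-vector comparison, using closures in place of LinearMap objects. The function name isequal_on_basis, the explicit dimension argument n, and the per-basis-vector treatment of the sign are assumptions for illustration.

```julia
using LinearAlgebra, Random

# Compare two linear actions f and g on (a subset of) the standard basis of R^n.
function isequal_on_basis(f, g, n::Int;
        tol = 2 * eps(), n_eval = nothing,
        rng = Random.default_rng(), up_to_sign = false)
    use_all = n_eval === nothing || n_eval >= n
    idx = use_all ? collect(1:n) : randperm(rng, n)[1:n_eval]
    for i in idx
        e = zeros(n)
        e[i] = 1.0                      # i-th standard basis vector
        a, b = f(e), g(e)
        same = all(abs.(a .- b) .<= tol)
        flipped = up_to_sign && all(abs.(a .+ b) .<= tol)
        (same || flipped) || return false
    end
    return true
end

A = [1.0 2.0; 3.0 4.0]
Amap = x -> A * x
negAmap = x -> -(A * x)

strict = isequal_on_basis(Amap, negAmap, 2)
relaxed = isequal_on_basis(Amap, negAmap, 2; up_to_sign = true)
```

Each basis vector costs one application of both maps, so restricting n_eval trades certainty for speed.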
source
CalibrateEmulateSample.Utilities.minmax_scaleMethod
minmax_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.MinMaxScaling, Vector{Float64}, Vector, Vector, Vector, Vector}

Constructs ElementwiseScaler{MinMaxScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x - \min(x)}{\max(x) - \min(x)}$ to each data dimension.
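
The transform itself can be written in a few lines of base Julia. This sketch assumes rows are data dimensions and columns are samples, and uses plain matrices rather than the package's processor.

```julia
X = [1.0 2.0 4.0; 10.0 30.0 20.0]   # 2 dimensions, 3 samples

lo = minimum(X, dims = 2)           # per-dimension minimum
hi = maximum(X, dims = 2)           # per-dimension maximum

encoded = (X .- lo) ./ (hi .- lo)   # each row now spans [0, 1]
decoded = encoded .* (hi .- lo) .+ lo
```

Here the first row maps to (0, 1/3, 1) and the second to (0, 1, 0.5).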

source
CalibrateEmulateSample.Utilities.norm_linear_mapMethod
norm_linear_map(A::LinearMaps.LinearMap; ...) -> Any
norm_linear_map(
    A::LinearMaps.LinearMap,
    p::Real;
    n_eval,
    rng
) -> Any

Approximately computes the norm of a LinearMap object. For a LinearMap Amap associated with a matrix A, norm_linear_map(Amap,p) ≈ norm(A,p). Can be aliased as norm().

kwargs:

  • n_eval(=nothing): number of mat-vec products to apply in the approximation (larger is more accurate). The default performs size(map,2) products
  • rng(=Random.default_rng()): random number generator
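
One way such an estimate can be formed, shown only for p = 2 (for a matrix, Julia's norm(A, 2) is the Frobenius norm). This is a sketch under assumptions: the estimator used by the package is not described here, and approx_frobenius_norm is a hypothetical name. With all basis vectors the result is exact; with n_eval random Gaussian probes it is a stochastic estimate.

```julia
using LinearAlgebra, Random

function approx_frobenius_norm(apply, n::Int;
        n_eval = nothing, rng = Random.default_rng())
    total = 0.0
    if n_eval === nothing
        # One product per standard basis vector: sums the squared column norms.
        for i in 1:n
            e = zeros(n)
            e[i] = 1.0
            total += sum(abs2, apply(e))
        end
        return sqrt(total)
    end
    # Random probes z ~ N(0, I): the mean of |A z|^2 equals the squared Frobenius norm.
    for _ in 1:n_eval
        total += sum(abs2, apply(randn(rng, n)))
    end
    return sqrt(total / n_eval)
end

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]
Amap = x -> A * x

exact = approx_frobenius_norm(Amap, size(A, 2))
estimate = approx_frobenius_norm(Amap, size(A, 2); n_eval = 2000, rng = MersenneTwister(3))
```

The exact path needs size(A, 2) products, matching the documented default cost; the random path is useful when that many products are too expensive.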
source
CalibrateEmulateSample.Utilities.quartile_scaleMethod
quartile_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.QuartileScaling, Vector{Float64}, Vector, Vector, Vector, Vector}

Constructs ElementwiseScaler{QuartileScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x - Q2(x)}{Q3(x) - Q1(x)}$ to each data dimension, where $Q1$, $Q2$ and $Q3$ are the first, second (median) and third quartiles. Also known as "robust scaling".
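
A short sketch of the transform using Statistics.quantile, assuming rows are data dimensions and columns are samples. The quantile definition used by the package may differ from Julia's default.

```julia
using Statistics

X = [1.0 2.0 3.0 4.0 5.0; 10.0 20.0 30.0 40.0 100.0]   # 2 dimensions, 5 samples

q1 = [quantile(row, 0.25) for row in eachrow(X)]
q2 = [quantile(row, 0.50) for row in eachrow(X)]       # median
q3 = [quantile(row, 0.75) for row in eachrow(X)]

encoded = (X .- q2) ./ (q3 .- q1)
decoded = encoded .* (q3 .- q1) .+ q2
```

In the second row the outlier 100 maps to 3.5 while the other samples stay within [-1, 0.5]; the scaling itself is set by the quartiles and not by the extreme value, which is why this is called robust.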

source
CalibrateEmulateSample.Utilities.zscore_scaleMethod
zscore_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.ZScoreScaling, Vector{Float64}, Vector, Vector, Vector, Vector}

Constructs ElementwiseScaler{ZScoreScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x-\mu}{\sigma}$ (where $x\sim N(\mu,\sigma)$) to each data dimension. For multivariate standardization, see Decorrelator.
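
A short sketch of the transform, assuming rows are data dimensions and columns are samples. It uses Julia's default (n - 1 normalized) sample standard deviation, which may or may not match the package's choice.

```julia
using Statistics

X = [1.0 2.0 3.0; 10.0 20.0 30.0]   # 2 dimensions, 3 samples

mu = mean(X, dims = 2)              # per-dimension mean
sigma = std(X, dims = 2)            # per-dimension sample standard deviation

encoded = (X .- mu) ./ sigma
decoded = encoded .* sigma .+ mu
```

Both rows map to (-1, 0, 1): each dimension is standardized on its own, so correlations between dimensions are left untouched.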

source