Utilities

CalibrateEmulateSample.Utilities.CanonicalCorrelationType
struct CanonicalCorrelation{VV1, VV2, VV3, FT, VV4} <: CalibrateEmulateSample.Utilities.PairedDataContainerProcessor

Uses both input and output data to learn a subspace of maximal correlation between inputs and outputs. The subspace for a pair (X, Y) will be of size minimum(rank(X), rank(Y)), computed using an SVD-based method. See e.g., https://numerical.recipes/whp/notes/CanonCorrBySVD.pdf

Preferred construction is with the canonical_correlation method.

Fields

  • data_mean::Any: storage for the input or output data mean

  • encoder_mat::Any: the encoding matrix of input or output canonical correlations

  • decoder_mat::Any: the decoding matrix of input or output canonical correlations

  • retain_var::Any: the fraction of variance to be retained after truncating singular values (1 implies no truncation)

  • apply_to::Any: Stores whether this is an input or output encoder (vector with string "in" or "out")

source
CalibrateEmulateSample.Utilities.DecorrelatorType
struct Decorrelator{VV1, VV2, VV3, FT, NT<:NamedTuple, AS<:AbstractString} <: CalibrateEmulateSample.Utilities.DataContainerProcessor

Decorrelate the data via taking an SVD decomposition and projecting onto the singular-vectors.

Preferred construction is with one of the following methods:

For decorrelate_structure_mat: The SVD is taken over a structure matrix (e.g., prior_cov for inputs, obs_noise_cov for outputs). The structure matrix will become exactly I after processing.

For decorrelate_sample_cov: The SVD is taken over the estimated covariance of the data. The data samples will have a Normal(0,I) distribution after processing.

For decorrelate(;decorrelate_with="combined") (default): The SVD is taken to be the sum of structure matrix and estimated covariance. This may be more robust to ill-specification of structure matrix, or poor estimation of the sample covariance.

Depending on the size of the matrix, we use different SVD options:

  • Small matrix (dim < 3000): use LinearAlgebra.svd(Matrix)
  • Large matrix (dim > 3000): if retain_var = 1.0, use LowRankApprox.psvd(LinearMap; psvd_kwargs...); if retain_var < 1.0, use TSVD.tsvd(LinearMap)

Fields

  • data_mean::Any: storage for the data mean

  • encoder_mat::Any: the matrix used to perform encoding

  • decoder_mat::Any: the inverse of the matrix used to perform encoding

  • retain_var::Any: the fraction of variance to be retained after truncating singular values (1 implies no truncation)

  • n_totvar_samples::Int64: when retain_var < 1, number of samples to estimate the total variance. Larger values reduce the error in approximation at the cost of additional matrix-vector products.

  • max_rank::Int64: maximum dimension of subspace for retain_var < 1. The search may become expensive at large ranks, and therefore can be cut-off in this way

  • psvd_kwargs::NamedTuple: when retain_var = 1, the psvd algorithm from LowRankApprox.jl is used to decorrelate the space. here, kwargs can be passed in as a NamedTuple

  • decorrelate_with::AbstractString: Switch to choose what form of matrix to use to decorrelate the data

  • structure_mat_name::Union{Nothing, Symbol}: When given, use the structure matrix by this name if decorrelate_with uses structure matrices. When nothing, try to use the only present structure matrix instead.

source
CalibrateEmulateSample.Utilities.ElementwiseScalerType
struct ElementwiseScaler{T, VV<:(AbstractVector), VV2<:(AbstractVector), VV3<:(AbstractVector), VV4<:(AbstractVector), VV5<:(AbstractVector)} <: CalibrateEmulateSample.Utilities.DataContainerProcessor

The ElementwiseScaler{T} will create an encoding of the data_container via elementwise affine transformations.

Different methods T build different transformations (e.g., MinMaxScaling, QuartileScaling, ZScoreScaling), and the method is accessed with get_type.

source
CalibrateEmulateSample.Utilities.LikelihoodInformedType
mutable struct LikelihoodInformed{VV1<:(AbstractVector), VV2<:(AbstractVector), VV3<:(AbstractVector), VV4<:(AbstractVector), FT<:Real} <: CalibrateEmulateSample.Utilities.PairedDataContainerProcessor

Uses both input and output data to learn a subspace that allows for a reduced posterior which is close to the full posterior.

Preferred construction is with the likelihood_informed method.

Fields

  • encoder_mat::AbstractVector

  • decoder_mat::AbstractVector

  • data_mean::AbstractVector

  • retain_info::Real

  • apply_to::Union{Nothing, AbstractString}

  • iters::AbstractVector

  • grad_type::Symbol

source
CalibrateEmulateSample.Utilities.NoiseInjectorType
struct NoiseInjector{MM1<:(AbstractMatrix), MM2<:(AbstractMatrix), MM3<:(AbstractMatrix), VV<:(AbstractVector), NorMM<:Union{Nothing, AbstractMatrix}, FT<:Real}

Structure used to store precomputed quantities for decode_and_add_noise(...); built with create_noise_injector(...)

  • K::AbstractMatrix: Gain Matrix from encoded to decoded space

  • enc_m::AbstractMatrix: encoded prior mean

  • m::AbstractMatrix: prior mean

  • L::Union{Nothing, AbstractMatrix}: cholesky factor of encoded prior covariance

  • scaling::Real: scaling factor for the noise (a value < 1.0 may be needed for robustness if samples will be run in a physical model)

  • use_noise::Bool: whether to use the noise injection or not

  • encoder_schedule::AbstractVector: the encoding that was used to construct this object

source
CalibrateEmulateSample.Utilities.canonical_correlationMethod
canonical_correlation(
;
    retain_var
) -> CalibrateEmulateSample.Utilities.CanonicalCorrelation{Vector{Any}, Vector{Any}, Vector{Any}, Float64, Vector{AbstractString}}

Constructs the CanonicalCorrelation struct. Can optionally provide the keyword

  • retain_var[=1.0]: to project onto the leading singular vectors (of the input-output product) such that retain_var variance is retained.
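As an illustration of the SVD-based construction, a standalone sketch using only LinearAlgebra (not the package's internal implementation): canonical correlations of a sample pair (X, Y) can be obtained from the SVD of the product of orthonormal bases of the two sample spaces.

```julia
using LinearAlgebra, Random

rng = MersenneTwister(42)
n = 200
X = randn(rng, 3, n)                    # inputs; columns are samples
Y = vcat(X[1:2, :], randn(rng, 2, n))   # outputs sharing two directions with X

Xc = X .- sum(X, dims = 2) ./ n         # center each dimension
Yc = Y .- sum(Y, dims = 2) ./ n

Qx = Matrix(qr(Xc').Q)                  # orthonormal basis of the input sample space
Qy = Matrix(qr(Yc').Q)                  # orthonormal basis of the output sample space
canon_corrs = svd(Qx' * Qy).S           # canonical correlations, length min(rank(X), rank(Y))
```

Here the two shared directions yield canonical correlations of 1, and all correlations lie in [0, 1].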
source
CalibrateEmulateSample.Utilities.create_compact_linear_mapMethod
create_compact_linear_map(
    A;
    svd_dim_max,
    psvd_or_tsvd,
    tsvd_max_rank,
    psvd_kwargs
) -> LinearMaps.FunctionMap{Float64, CalibrateEmulateSample.Utilities.var"#14#20"{Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}}, CalibrateEmulateSample.Utilities.var"#16#22"{Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}}}

Produces a linear map of type LinearMap that evaluates the stacked actions of the structure matrix in compact form, by calling, say, linear_map.f(x) or linear_map.fc(x) to obtain Ax or A'x. This particular type can be used by packages like TSVD.jl or IterativeSolvers.jl for further computations.

This compact map constructs the following form of the Linear map f:

  1. get compact form svd-plus-d form "USVt + D" of the blocks
  2. create the f via stacking A.U * A.S * A.Vt * xblock + A.D * xblock for (A,xblock) in (As, x)

kwargs:

When computing the svd internally from an abstract matrix

  • svd_dim_max=3000: this switches to an approximate svd approach when applying to covariance matrices above dimension 3000
  • psvd_or_tsvd="psvd": use psvd or tsvd for approximating svd for large matrices
  • tsvd_max_rank=50: when using tsvd, what max rank to use. high rank = higher accuracy
  • psvd_kwargs=(; rtol=1e-2): when using psvd, what kwargs to pass. lower rtol = higher accuracy

Recommended (quick & inaccurate -> slow & more accurate):

  • very large matrices - start with tsvd with very low rank, and increase
  • mid-size matrices - psvd with very high rtol, and decrease
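The block-stacked action in step 2 can be sketched in plain Julia. This is a toy illustration with dense square blocks standing in for the compact SVD-plus-D blocks, not the package implementation:

```julia
# apply a block-diagonal action A = diag(A1, A2) to x without forming A;
# assumes square blocks, so input and output slices coincide
A1 = [1.0 0.0; 0.0 2.0]
A2 = fill(3.0, 1, 1)
blocks = (A1, A2)

function stacked_apply(blocks, x)
    out = similar(x)
    i = 1
    for A in blocks
        j = i + size(A, 2) - 1
        out[i:j] = A * x[i:j]   # each block acts on its own slice of x
        i = j + 1
    end
    return out
end

stacked_apply(blocks, [1.0, 1.0, 1.0])  # [1.0, 2.0, 3.0]
```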
source
CalibrateEmulateSample.Utilities.create_encoder_scheduleMethod
create_encoder_schedule(
    schedule_in::AbstractVector
) -> Vector{Any}

Create a flattened encoder schedule from the user's proposed schedule of the form:

enc_schedule = [
    (DataProcessor1(...), "in"), 
    (DataProcessor2(...), "out"), 
    (PairedDataProcessor3(...), "in"), 
    (DataProcessor4(...), "in_and_out"), 
]

This function creates an equivalent encoder schedule that is also machine readable. E.g.,

enc_schedule = [
    (DataProcessor1(...), "in"), 
    (DataProcessor2(...), "out"), 
    (PairedDataProcessor3(...),"in"), 
    (DataProcessor4(...), "in"),
    (DataProcessor4(...), "out"), 
]

The decoder schedule is a reversed copy of the encoder schedule (with processors copied).

source
CalibrateEmulateSample.Utilities.create_noise_injectorMethod
create_noise_injector(
    encoder_schedule::AbstractVector,
    prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution,
    noise_injector_threshold::Real,
    noise_injector_scaling::Real
) -> Union{Nothing, CalibrateEmulateSample.Utilities.NoiseInjector}

Returns either a NoiseInjector object that stores precomputed quantities used in decode_and_add_noise(...), or returns nothing. The conditions to return nothing:

  1. If the encoder is effectively lossless, as determined by its variance loss not exceeding the threshold noise_injector_threshold
  2. If the encoder_schedule is empty

One can additionally scale the injected samples with noise_injector_scaling

source
CalibrateEmulateSample.Utilities.decode_and_add_noiseMethod
decode_and_add_noise(
    encoder_schedule::AbstractVector,
    samples::AbstractMatrix,
    prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution,
    noise_injector_threshold::Real,
    noise_injector_scaling::Real
) -> Any

Lift back the encoded samples into the full space. Similar to using decode_data, except that this additionally injects noise from the prior when the encoding is determined to be sufficiently lossy (total lost variance < keyword noise_injector_threshold). This is done in a way that preserves any known correlations between reduced and null-space directions, which is important for posterior reconstruction.

The quantification of correlation depends on Gaussian assumptions, and therefore is approximate.

source
CalibrateEmulateSample.Utilities.decode_dataMethod
decode_data(
    encoder_schedule::AbstractVector,
    data::Union{EnsembleKalmanProcesses.DataContainers.DataContainer, AbstractMatrix, AbstractVector},
    in_or_out::AbstractString
) -> Any

Decode the new data (a DataContainer, or matrix where data are columns, or vector viewed as one column) representing inputs ("in") or outputs ("out"), with the stored and initialized encoder schedule. Always internally calls CES.Utilities.decode_with_schedule

source
CalibrateEmulateSample.Utilities.decode_structure_matrixMethod
decode_structure_matrix(
    encoder_schedule::AbstractVector,
    structure_mat,
    in_or_out::AbstractString
) -> Any

Decode a new structure matrix in the input space ("in") or output space ("out"), with the stored and initialized encoder schedule. Always internally calls CES.Utilities.decode_with_schedule. If the structure matrix is a LinearMap, then the decoded structure matrix remains a LinearMap

source
CalibrateEmulateSample.Utilities.decode_with_scheduleMethod
decode_with_schedule(
    encoder_schedule::AbstractVector,
    data_container::EnsembleKalmanProcesses.DataContainers.DataContainer,
    in_or_out::AbstractString
) -> EnsembleKalmanProcesses.DataContainers.DataContainer

Takes in an already initialized encoder schedule, and decodes a DataContainer; the in_or_out string indicates whether the data is input ("in") or output ("out") data (and is thus decoded differently)

source
CalibrateEmulateSample.Utilities.decode_with_scheduleMethod
decode_with_schedule(
    encoder_schedule::AbstractVector,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector},
    in_or_out::AbstractString
) -> Any

Takes in an already initialized encoder schedule, and decodes a structure matrix; the in_or_out string indicates whether the structure matrix is for input ("in") or output ("out") space (and is thus decoded differently)

source
CalibrateEmulateSample.Utilities.decode_with_scheduleMethod
decode_with_schedule(
    encoder_schedule::AbstractVector,
    io_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer,
    input_structure_mat::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector},
    output_structure_mat::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}
) -> Tuple{EnsembleKalmanProcesses.DataContainers.PairedDataContainer, Any, Any}

Takes in an already initialized encoder schedule, and decodes a PairedDataContainer along with its input and output structure matrices (inputs and outputs are decoded differently)

source
CalibrateEmulateSample.Utilities.decorrelateMethod
decorrelate(
;
    retain_var,
    decorrelate_with,
    structure_mat_name,
    n_totvar_samples,
    max_rank,
    psvd_kwargs
) -> CalibrateEmulateSample.Utilities.Decorrelator{Vector{Any}, Vector{Any}, Vector{Any}, Float64, @NamedTuple{rtol::Float64}, String}

Constructs the Decorrelator struct. Users can add optional keyword arguments:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
  • decorrelate_with [="combined"]: which matrix provides the subspace directions; options are "structure_mat", "sample_cov", or "combined" (see Decorrelator)
  • n_totvar_samples[=500]: when retain_var < 1, number of samples to estimate the total variance for performing truncation.
  • max_rank[=100]: for retain_var < 1, the maximum dimension of subspace when using the tsvd algorithm from TSVD.jl.
  • psvd_kwargs [= (; rtol = 1e-3)]: for retain_var = 1, the psvd algorithm from LowRankApprox.jl is used to decorrelate the space. kwargs can be passed in as a NamedTuple
source
CalibrateEmulateSample.Utilities.decorrelate_sample_covMethod
decorrelate_sample_cov(
;
    retain_var,
    n_totvar_samples,
    max_rank,
    psvd_kwargs
) -> CalibrateEmulateSample.Utilities.Decorrelator{Vector{Any}, Vector{Any}, Vector{Any}, Float64, @NamedTuple{rtol::Float64}, String}

Constructs the Decorrelator struct, setting decorrelate_with = "sample_cov". Encoding data with this will ensure that the distribution of data samples after encoding is Normal(0,I). One can additionally add keywords:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
  • n_totvar_samples[=500]: when retain_var < 1, number of samples to estimate the total variance for performing truncation.
  • max_rank[=100]: for retain_var < 1, the maximum dimension of subspace when using the tsvd algorithm from TSVD.jl.
  • psvd_kwargs [= (; rtol = 1e-3)]: for retain_var = 1, the psvd algorithm from LowRankApprox.jl is used to decorrelate the space. kwargs can be passed in as a NamedTuple
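The effect of sample-covariance decorrelation can be sketched with stdlib tools alone (an illustration of the whitening idea, not the package's truncation-aware implementation):

```julia
using LinearAlgebra, Statistics, Random

rng = MersenneTwister(0)
data = cholesky([2.0 0.5; 0.5 1.0]).L * randn(rng, 2, 2000)  # correlated samples (columns)

μ = mean(data, dims = 2)
F = svd(cov(data, dims = 2))                 # SVD of the sample covariance
encoder = Diagonal(1 ./ sqrt.(F.S)) * F.U'   # whitening matrix
decoder = F.U * Diagonal(sqrt.(F.S))         # its inverse

encoded = encoder * (data .- μ)              # sample covariance of encoded data ≈ I
```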
source
CalibrateEmulateSample.Utilities.decorrelate_structure_matMethod
decorrelate_structure_mat(
;
    retain_var,
    structure_mat_name,
    n_totvar_samples,
    max_rank,
    psvd_kwargs
) -> CalibrateEmulateSample.Utilities.Decorrelator{Vector{Any}, Vector{Any}, Vector{Any}, Float64, @NamedTuple{rtol::Float64}, String}

Constructs the Decorrelator struct, setting decorrelate_with = "structure_mat". This encoding will transform a provided structure matrix into I. One can additionally add keywords:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
  • n_totvar_samples[=500]: when retain_var < 1, number of samples to estimate the total variance for performing truncation.
  • max_rank[=100]: for retain_var < 1, the maximum dimension of subspace when using the tsvd algorithm from TSVD.jl.
  • psvd_kwargs [= (; rtol = 1e-3)]: for retain_var = 1, the psvd algorithm from LowRankApprox.jl is used to decorrelate the space. kwargs can be passed in as a NamedTuple
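A minimal plain-LinearAlgebra sketch (not the package implementation) of how an encoder built from a structure matrix maps that matrix to the identity:

```julia
using LinearAlgebra

Γ = [4.0 1.0; 1.0 2.0]                 # stand-in for, e.g., an obs_noise_cov structure matrix
F = svd(Γ)
E = Diagonal(1 ./ sqrt.(F.S)) * F.U'   # encoder built from the structure matrix
E * Γ * E'                             # ≈ I after encoding
```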
source
CalibrateEmulateSample.Utilities.encode_dataMethod
encode_data(
    encoder_schedule::AbstractVector,
    data::Union{EnsembleKalmanProcesses.DataContainers.DataContainer, AbstractMatrix, AbstractVector},
    in_or_out::AbstractString
) -> Any

Encode the new data (a DataContainer, or matrix where data are columns, or vector viewed as one column) representing inputs ("in") or outputs ("out"), with the stored and initialized encoder schedule. Always internally calls CES.Utilities.encode_with_schedule

source
CalibrateEmulateSample.Utilities.encode_structure_matrixMethod
encode_structure_matrix(
    encoder_schedule::AbstractVector,
    structure_mat,
    in_or_out::AbstractString
) -> Any

Encode a new structure matrix in the input space ("in") or output space ("out"), with the stored and initialized encoder schedule. Always internally calls CES.Utilities.encode_with_schedule. If the structure matrix is a LinearMap, then the encoded structure matrix remains a LinearMap

source
CalibrateEmulateSample.Utilities.encode_with_scheduleMethod
encode_with_schedule(
    encoder_schedule::AbstractVector,
    data_container::EnsembleKalmanProcesses.DataContainers.DataContainer,
    in_or_out::AbstractString
) -> EnsembleKalmanProcesses.DataContainers.DataContainer

Takes in an already initialized encoder schedule, and encodes a DataContainer; the in_or_out string indicates whether the data is input ("in") or output ("out") data (and is thus encoded differently)

source
CalibrateEmulateSample.Utilities.encode_with_scheduleMethod
encode_with_schedule(
    encoder_schedule::AbstractVector,
    structure_matrix::Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector},
    in_or_out::AbstractString
) -> Any

Takes in an already initialized encoder schedule, and encodes a structure matrix; the in_or_out string indicates whether the structure matrix is for input ("in") or output ("out") space (and is thus encoded differently)

source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    obs::EnsembleKalmanProcesses.Observation
) -> NamedTuple{(:obs_noise_cov, :observation), <:Tuple{Any, Any}}

Extracts the relevant encoder kwargs from the observation as a NamedTuple. Contains,

  • :obs_noise_cov as (unbuilt) noise covariance
  • :observation as obs vector

Commonly called from encoder_kwargs_from(ekp, prior)

source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    os::EnsembleKalmanProcesses.ObservationSeries
) -> NamedTuple{(:obs_noise_cov, :observation), <:Tuple{Any, Any}}

Extracts the relevant encoder kwargs from the ObservationSeries as a NamedTuple. Assumes the same noise covariance for all observation vectors. Contains,

  • :obs_noise_cov as (unbuilt) noise covariance of FIRST observation
  • :observation as obs vector from all observations

Commonly called from encoder_kwargs_from(ekp, prior)

source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution
) -> NamedTuple{(:prior_cov,), <:Tuple{Any}}

Extracts the relevant encoder kwargs from the ParameterDistribution prior. Contains,

  • :prior_cov as prior covariance

Commonly called from encoder_kwargs_from(ekp, prior)

source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    ekp::EnsembleKalmanProcesses.EnsembleKalmanProcess,
    prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution;
    observation_series,
    samples_in,
    samples_out,
    dt,
    final_samples_out
) -> NamedTuple

Extracts the relevant encoder kwargs from the ekp object and prior distribution, returned as a NamedTuple that is passed to an Emulator or ForwardMapWrapper in the keyword argument encoder_kwargs. One can override the constructed kwargs by providing keywords.

kwargs:

  • Common overriding kwarg: final_samples_out. As ekp stores one more input than output, by default we truncate to the penultimate ekp iteration (where input-output pairs exist). However, one can provide an additional final output, paired with g = forward_map_ensemble(get_ϕ_final(ekp)), via final_samples_out = g

  • Other overriding kwargs: observation_series, samples_in, samples_out, dt

source
CalibrateEmulateSample.Utilities.encoder_kwargs_fromMethod
encoder_kwargs_from(
    samples_in::AbstractVector,
    samples_out::AbstractVector,
    dt::AbstractVector
) -> NamedTuple{(:input_structure_vecs, :output_structure_vecs), <:Tuple{Dict, Dict}}

Extracts the relevant encoder kwargs from a vector triple (samples_in, samples_out, dt). Samples describe an ordered sequence of distributions in input and output space, each indexed with a temperature, or algorithm time, dt.

Contains

  • :input_structure_vecs: Dict with fields :dt (Vec{Float}), :samples_in (Vec{Matrix})
  • :output_structure_vecs: Dict with fields :dt (Vec{Float}), :samples_out (Vec{Matrix})

Commonly called from encoder_kwargs_from(ekp, prior)

source
CalibrateEmulateSample.Utilities.get_decoder_from_scheduleMethod
get_decoder_from_schedule(
    encoder_schedule::AbstractVector
) -> Dict

Affine decodings can be represented as Dx + b. This function returns D,b for the input and output encoders in a Dict indexed by "in" and "out". D will be represented as a LinearMap object (can apply D = Matrix(D) to rebuild).
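To illustrate the D, b representation with hypothetical 2-d values (plain Julia, not the package's LinearMap machinery): decoding inverts an affine encoding Ex + b.

```julia
using LinearAlgebra

E = [2.0 0.0; 1.0 1.0]      # hypothetical encoder matrix
b_enc = [0.5, -1.0]         # hypothetical encoder shift
D = inv(E)                  # decoder matrix
b_dec = -D * b_enc          # decoder shift

x = [3.0, 4.0]
y = E * x .+ b_enc          # encode: Ex + b
x_back = D * y .+ b_dec     # decode: Dy + b, recovers x
```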

source
CalibrateEmulateSample.Utilities.get_decoder_from_scheduleMethod
get_decoder_from_schedule(encoder_schedule::AbstractString)

Affine decodings can be represented as Dx + b. This function returns D,b. D will be represented as a LinearMap object (can apply D = Matrix(D) to rebuild).

  • in_or_out: should be either "in" or "out", to retrieve either the input or output encoder
source
CalibrateEmulateSample.Utilities.get_encoded_dimMethod
get_encoded_dim(encoder_schedule::AbstractVector) -> Dict

Gets the dimension of the encoded space, returned as a Dict with keys "in", "out". Provides nothing values if the encoder schedule is empty or uninitialized

source
CalibrateEmulateSample.Utilities.get_encoded_dimMethod
get_encoded_dim(encoder_schedule::AbstractString)

Gets the dimension of the encoded space, for input (providing "in") or output (providing "out"); provides nothing if the encoder schedule is empty or uninitialized

source
CalibrateEmulateSample.Utilities.get_encoder_from_scheduleMethod
get_encoder_from_schedule(
    encoder_schedule::AbstractVector
) -> Dict

Affine encodings can be represented as Ex + b. This function returns E,b for the input and output encoders in a Dict indexed by "in" and "out". E will be represented as a LinearMap object (can apply E = Matrix(E) to rebuild).

source
CalibrateEmulateSample.Utilities.get_encoder_from_scheduleMethod
get_encoder_from_schedule(encoder_schedule::AbstractString)

Affine encodings can be represented as Ex + b. This function returns E,b. E will be represented as a LinearMap object (can apply E = Matrix(E) to rebuild).

  • in_or_out: should be either "in" or "out", to retrieve either the input or output encoder
source
CalibrateEmulateSample.Utilities.get_training_pointsMethod
get_training_points(
    ekp::EnsembleKalmanProcesses.EnsembleKalmanProcess{FT, IT, P},
    train_iterations::Union{AbstractVector{IT}, IT} where IT;
    g_final
) -> EnsembleKalmanProcesses.DataContainers.PairedDataContainer

Extract and flatten the training points needed to train an Emulator, returned as a PairedDataContainer.

  • ekp - EnsembleKalmanProcess holding the parameters and the data that were produced during the Ensemble Kalman (EK) process.
  • train_iterations - a number of iterations (extracting 1:train_iterations), or indices (e.g. train_iterations = 3:2:9), for EKP iterations to extract.
  • g_final[=nothing] - EKP will typically store one extra input data iteration. If desired, the user can add output data for this final iteration directly with g_final. It should be of type <: AbstractMatrix, sized consistently with return values from get_g(ekp, 1).
source
CalibrateEmulateSample.Utilities.initialize_and_encode_with_schedule!Method
initialize_and_encode_with_schedule!(
    encoder_schedule::AbstractVector,
    io_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer;
    input_structure_mats,
    output_structure_mats,
    input_structure_vecs,
    output_structure_vecs,
    prior_cov,
    obs_noise_cov,
    observation,
    samples_in,
    samples_out
) -> Tuple{EnsembleKalmanProcesses.DataContainers.PairedDataContainer, Dict{Symbol, Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}}, Dict{Symbol, Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}}, Dict{Symbol, Union{AbstractMatrix, AbstractVector}}, Dict{Symbol, Union{AbstractMatrix, AbstractVector}}}

Takes in the created encoder schedule (see create_encoder_schedule), initializes it, and encodes the paired data container and structure matrices with it.

source
CalibrateEmulateSample.Utilities.initialize_processor!Method
initialize_processor!(
    cc::CalibrateEmulateSample.Utilities.CanonicalCorrelation,
    in_data::AbstractMatrix,
    out_data::AbstractMatrix,
    input_structure_matrices,
    output_structure_matrices,
    input_structure_vectors,
    output_structure_vectors,
    apply_to::AbstractString
) -> Any

Computes and populates the data_mean, encoder_mat, decoder_mat and apply_to fields for the CanonicalCorrelation

source
CalibrateEmulateSample.Utilities.initialize_processor!Method
initialize_processor!(
    dd::CalibrateEmulateSample.Utilities.Decorrelator,
    data::AbstractMatrix,
    structure_matrices::Dict{Symbol, SM<:Union{LinearAlgebra.UniformScaling, LinearMaps.LinearMap, AbstractMatrix, AbstractVector}},
    _::Dict{Symbol, SV<:Union{AbstractMatrix, AbstractVector}}
) -> Any

Computes and populates the data_mean and encoder_mat and decoder_mat fields for the Decorrelator

source
CalibrateEmulateSample.Utilities.isequal_linearMethod
isequal_linear(
    A::LinearMaps.LinearMap,
    B::LinearMaps.LinearMap;
    tol,
    n_eval,
    rng,
    up_to_sign
) -> Bool

Tests equality for a LinearMap on a standard basis of the input space. Note that this operation requires a matrix multiply per input dimension so can be expensive.

Kwargs:

  • n_eval (=nothing): the number of basis vectors to compare against (randomly selected without replacement if n_eval < size(A,1))
  • tol (=2*eps()): the tolerance for equality on evaluation per entry
  • rng (=Random.default_rng()): when provided, and n_eval < size(A,1), a random subset of the basis is compared, using this rng
  • up_to_sign (=false): only assess equality up to a sign-error (sufficient for e.g. encoder/decoder matrices)
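The basis-comparison idea can be sketched for two matrix-free operators in plain Julia (illustrative only; the package method operates on LinearMap objects):

```julia
using LinearAlgebra

apply_A = x -> [2x[1], 3x[2]]            # operator known only through its action
apply_B = x -> [2.0 0.0; 0.0 3.0] * x    # a second operator to compare against
n = 2
basis = Matrix(1.0I, n, n)               # standard basis of the input space
equal_on_basis = all(
    maximum(abs.(apply_A(basis[:, j]) .- apply_B(basis[:, j]))) <= 2eps() for j in 1:n
)
```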
source
CalibrateEmulateSample.Utilities.likelihood_informedMethod
likelihood_informed(
;
    retain_info,
    iters,
    grad_type
) -> CalibrateEmulateSample.Utilities.LikelihoodInformed{Vector{Any}, Vector{Any}, Vector{Any}, Vector{Int64}, Int64}

Constructs the LikelihoodInformed struct. Keywords:

  • retain_info: the method will attempt to limit the KL divergence of the true posterior from the reduced posterior to a value proportional to (1 - retain_info). Choose retain_info close to 1 to get a good approximation in a large subspace, and reduce it to get a worse approximation in a smaller subspace.
  • iters[= [1]]: the likelihood-informed data processor requires samples from the distribution ∝ π_prior(x) π_likelihood(y | x)^α with α ∈ [0, 1]. Here, iters indicates the structure vector iterations to use, as sampled from these distributions. For how to pass in these samples, see the use_data_as_samples parameter.
  • grad_type[= :linreg]: how the gradient of the forward model at the samples will be approximated. Choose from :linreg (global linear regression) and :localsl (localized statistical linearization; see [Wacker, 2025]).
source
CalibrateEmulateSample.Utilities.minmax_scaleMethod
minmax_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.MinMaxScaling, Vector{Float64}, Vector, Vector, Vector, Vector}

Constructs ElementwiseScaler{MinMaxScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x - \min(x)}{\max(x) - \min(x)}$ to each data dimension.
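A plain-Julia sketch of the per-dimension transform (illustrative toy data, not the processor internals):

```julia
data = [1.0 3.0 5.0; 10.0 20.0 30.0]   # rows are dimensions, columns are samples
lo = minimum(data, dims = 2)
hi = maximum(data, dims = 2)
scaled = (data .- lo) ./ (hi .- lo)    # each row now spans [0, 1]
```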

source
CalibrateEmulateSample.Utilities.norm_linear_mapMethod
norm_linear_map(A::LinearMaps.LinearMap; ...) -> Any
norm_linear_map(
    A::LinearMaps.LinearMap,
    p::Real;
    n_eval,
    rng
) -> Any

Approximately computes the norm of a LinearMap object. For Amap associated with matrix A, norm_linear_map(Amap,p)≈norm(A,p). Can be aliased as norm()

kwargs

  • n_eval(=nothing): number of mat-vec products to apply in the approximation (larger is more accurate). default performs size(map,2) products
  • rng(=Random.default_rng()): random number generator
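The idea of recovering a matrix norm from mat-vec products alone can be sketched as follows (an exact Frobenius norm from a full basis sweep; illustrative only, not the package's randomized estimator):

```julia
using LinearAlgebra, Random

rng = MersenneTwister(1)
A = randn(rng, 20, 20)
apply_A = x -> A * x    # the matrix is only available via mat-vec products

# ||A||_F^2 = Σ_j ||A e_j||^2, accumulated over the standard basis
n = 20
frob2 = sum(sum(abs2, apply_A(Matrix(1.0I, n, n)[:, j])) for j in 1:n)
sqrt(frob2)             # ≈ norm(A), the Frobenius norm, from n products
```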
source
CalibrateEmulateSample.Utilities.quartile_scaleMethod
quartile_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.QuartileScaling, Vector{Float64}, Vector, Vector, Vector, Vector}

Constructs ElementwiseScaler{QuartileScaling} processor. As part of an encoder schedule, it will apply the transform $\frac{x - Q2(x)}{Q3(x) - Q1(x)}$ to each data dimension. Also known as "robust scaling".
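A plain-Julia sketch of the quartile transform on one data dimension (illustrative toy data, not the processor internals):

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0, 100.0]             # one data dimension with an outlier
q1, q2, q3 = quantile(x, [0.25, 0.5, 0.75])
scaled = (x .- q2) ./ (q3 - q1)             # centering and spread are robust to the outlier
```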

source
CalibrateEmulateSample.Utilities.zscore_scaleMethod
zscore_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.ZScoreScaling, Vector{Float64}, Vector, Vector, Vector, Vector}

Constructs ElementwiseScaler{ZScoreScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x-\mu}{\sigma}$, (where $x\sim N(\mu,\sigma)$), to each data dimension. For multivariate standardization, see Decorrelator
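A plain-Julia sketch of the z-score transform on one data dimension (illustrative toy data, not the processor internals):

```julia
using Statistics

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = (x .- mean(x)) ./ std(x)    # zero mean, unit (sample) standard deviation
```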

source