Utilities

CalibrateEmulateSample.Utilities.CanonicalCorrelation - Type
struct CanonicalCorrelation{VV1, VV2, VV3, FT, VV4} <: CalibrateEmulateSample.Utilities.PairedDataContainerProcessor

Uses both input and output data to learn a subspace of maximal correlation between inputs and outputs. The subspace for a pair (X,Y) will be of size min(rank(X),rank(Y)), computed using an SVD-based method; see, e.g., https://numerical.recipes/whp/notes/CanonCorrBySVD.pdf

Preferred construction is with the canonical_correlation method.

Fields

  • data_mean::Any: storage for the input or output data mean

  • encoder_mat::Any: the encoding matrix of input or output canonical correlations

  • decoder_mat::Any: the decoding matrix of input or output canonical correlations

  • retain_var::Any: the fraction of variance to be retained after truncating singular values (1 implies no truncation)

  • apply_to::Any: Stores whether this is an input or output encoder (vector with string "in" or "out")
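
For intuition, the SVD-based computation of canonical correlations can be sketched with the LinearAlgebra and Statistics standard libraries alone. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name cca_by_svd and the returned field names are hypothetical, and the data are assumed to be columns-are-data matrices with more samples than dimensions.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical helper (not part of CalibrateEmulateSample): canonical
# correlations of X (p × n) and Y (q × n), where columns are samples.
function cca_by_svd(X::AbstractMatrix, Y::AbstractMatrix)
    Xc = X .- mean(X, dims = 2)           # center each input dimension
    Yc = Y .- mean(Y, dims = 2)           # center each output dimension
    rx, ry = rank(Xc), rank(Yc)
    Fx = svd(Matrix(Xc'))                 # thin SVD of the n × p matrix
    Fy = svd(Matrix(Yc'))                 # thin SVD of the n × q matrix
    Ux, Uy = Fx.U[:, 1:rx], Fy.U[:, 1:ry]
    F = svd(Ux' * Uy)                     # singular values = canonical correlations
    k = min(rx, ry)                       # size of the learned subspace
    in_weights = Fx.V[:, 1:rx] * Diagonal(1 ./ Fx.S[1:rx]) * F.U[:, 1:k]
    out_weights = Fy.V[:, 1:ry] * Diagonal(1 ./ Fy.S[1:ry]) * F.V[:, 1:k]
    return (corrs = F.S[1:k], in_weights = in_weights, out_weights = out_weights)
end

Random.seed!(1)
X = randn(3, 200)
Y = [1.0 0.5 0.0; 0.0 1.0 -1.0] * X .+ 0.1 .* randn(2, 200)
res = cca_by_svd(X, Y)
zx = res.in_weights' * (X .- mean(X, dims = 2))    # canonical input variates (k × n)
zy = res.out_weights' * (Y .- mean(Y, dims = 2))   # canonical output variates (k × n)
```

In this sketch the encoded inputs and outputs are each orthonormal across samples, and their cross product zx * zy' is diagonal with the canonical correlations on the diagonal.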

CalibrateEmulateSample.Utilities.Decorrelater - Type
struct Decorrelater{VV1, VV2, VV3, FT, AS<:AbstractString} <: CalibrateEmulateSample.Utilities.DataContainerProcessor

Decorrelates the data by taking an SVD and projecting onto the singular vectors.

Preferred construction is with one of the following methods:

For decorrelate_structure_mat: The SVD is taken over a structure matrix (e.g., prior_cov for inputs, obs_noise_cov for outputs). The structure matrix will become exactly I after processing.

For decorrelate_sample_cov: The SVD is taken over the estimated covariance of the data. The data samples will have a Normal(0,I) distribution after processing.

For decorrelate(;decorrelate_with="combined") (default): The SVD is taken over the sum of the structure matrix and the estimated covariance. This may be more robust to misspecification of the structure matrix or to poor estimation of the sample covariance.

Fields

  • data_mean::Any: storage for the data mean

  • encoder_mat::Any: the matrix used to perform encoding

  • decoder_mat::Any: the inverse of the matrix used to perform encoding

  • retain_var::Any: the fraction of variance to be retained after truncating singular values (1 implies no truncation)

  • decorrelate_with::AbstractString: Switch to choose what form of matrix to use to decorrelate the data
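
The three choices of decorrelating matrix can be illustrated with a small standalone sketch. It assumes a symmetric positive-definite matrix and builds the encoder as S^(-1/2) U' from its SVD; the helper name fit_decorrelater and the returned NamedTuple are hypothetical stand-ins that mirror the fields listed above, not the package's code.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical helper (not the package API). `data` is d × n with columns as
# samples; `structure_mat` is a d × d symmetric positive-definite matrix.
function fit_decorrelater(data::AbstractMatrix, structure_mat::AbstractMatrix;
                          decorrelate_with::AbstractString = "combined")
    decorrelating_mat = if decorrelate_with == "structure_mat"
        structure_mat
    elseif decorrelate_with == "sample_cov"
        cov(data, dims = 2)
    elseif decorrelate_with == "combined"
        structure_mat + cov(data, dims = 2)
    else
        throw(ArgumentError("unknown decorrelate_with: $decorrelate_with"))
    end
    F = svd(decorrelating_mat)                      # matrix = U * Diagonal(S) * U'
    encoder_mat = Diagonal(1 ./ sqrt.(F.S)) * F.U'
    decoder_mat = F.U * Diagonal(sqrt.(F.S))        # inverse of encoder_mat
    return (data_mean = mean(data, dims = 2), encoder_mat = encoder_mat, decoder_mat = decoder_mat)
end

Random.seed!(2)
structure_mat = [4.0 1.0; 1.0 2.0]
data = [2.0 0.0; 1.0 0.5] * randn(2, 500) .+ [1.0, -3.0]

# With "structure_mat", the encoded structure matrix E * Σ * E' is the identity.
p_struct = fit_decorrelater(data, structure_mat; decorrelate_with = "structure_mat")
encoded_structure_mat = p_struct.encoder_mat * structure_mat * p_struct.encoder_mat'

# With "sample_cov", the encoded samples have zero mean and identity covariance.
p_samp = fit_decorrelater(data, structure_mat; decorrelate_with = "sample_cov")
encoded_data = p_samp.encoder_mat * (data .- p_samp.data_mean)
```

With "combined", neither the structure matrix nor the sample covariance is mapped exactly to the identity; the encoder whitens their sum instead.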

CalibrateEmulateSample.Utilities.canonical_correlation - Method
canonical_correlation(
;
    retain_var
) -> CalibrateEmulateSample.Utilities.CanonicalCorrelation{Vector{Any}, Vector{Any}, Vector{Any}, Float64, Vector{AbstractString}}

Constructs the CanonicalCorrelation struct. One can optionally provide the keyword:

  • retain_var[=1.0]: to project onto the leading singular vectors (of the input-output product) such that retain_var variance is retained.
CalibrateEmulateSample.Utilities.create_encoder_schedule - Method
create_encoder_schedule(
    schedule_in::AbstractVector
) -> Vector{Any}

Creates a flattened encoder schedule from the user's proposed schedule of the form:

enc_schedule = [
    (DataProcessor1(...), "in"), 
    (DataProcessor2(...), "out"), 
    (PairedDataProcessor3(...), "in"), 
    (DataProcessor4(...), "in_and_out"), 
]

This function creates a machine-readable encoder schedule of the form:

enc_schedule = [
    (DataProcessor1(...), x -> get_inputs(x), "in"),
    (DataProcessor2(...), x -> get_outputs(x), "out"),
    (PairedDataProcessor3(...), x -> (get_inputs(x), get_outputs(x)), "in"),
    (DataProcessor4(...), x -> get_inputs(x), "in"),
    (DataProcessor4(...), x -> get_outputs(x), "out"),
]

The decoder schedule is a reversed copy of the encoder schedule (with the processors copied).
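
The flattening described above can be mimicked with toy types. The sketch below is only an illustration of the idea: ToyProcessor, ToyPairedProcessor, the toy_* getters, and flatten_schedule are all hypothetical names, and the real function may handle more cases.

```julia
# Hypothetical stand-ins for processors and getters (not the package's types).
struct ToyProcessor
    name::String
end
struct ToyPairedProcessor
    name::String
end
toy_inputs(pair) = pair.inputs
toy_outputs(pair) = pair.outputs
toy_both(pair) = (pair.inputs, pair.outputs)

function flatten_schedule(schedule::AbstractVector)
    flat = Any[]
    for (proc, apply_to) in schedule
        if proc isa ToyPairedProcessor
            # paired processors receive inputs and outputs together
            push!(flat, (proc, toy_both, apply_to))
        elseif apply_to == "in"
            push!(flat, (proc, toy_inputs, "in"))
        elseif apply_to == "out"
            push!(flat, (proc, toy_outputs, "out"))
        elseif apply_to == "in_and_out"
            # one independent copy of the processor per side
            push!(flat, (deepcopy(proc), toy_inputs, "in"))
            push!(flat, (deepcopy(proc), toy_outputs, "out"))
        else
            throw(ArgumentError("unknown apply_to: $apply_to"))
        end
    end
    return flat
end

schedule = [
    (ToyProcessor("p1"), "in"),
    (ToyProcessor("p2"), "out"),
    (ToyPairedProcessor("p3"), "in"),
    (ToyProcessor("p4"), "in_and_out"),
]
flat = flatten_schedule(schedule)
decoder = reverse(deepcopy(flat))   # reversed copy, processors copied
```

Four user entries become five flat entries here, because the "in_and_out" entry is split into one "in" and one "out" entry.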

CalibrateEmulateSample.Utilities.decode_data - Method
decode_data(
    dcp::CalibrateEmulateSample.Utilities.PairedDataContainerProcessor,
    data,
    apply_to::AbstractString
) -> Any

Decodes either the input or output data (pair of columns-are-data matrices) with the processor, based on apply_to.

CalibrateEmulateSample.Utilities.decode_with_schedule - Method
decode_with_schedule(
    encoder_schedule::AbstractVector,
    data_container::EnsembleKalmanProcesses.DataContainers.DataContainer,
    in_or_out::AbstractString
) -> EnsembleKalmanProcesses.DataContainers.DataContainer

Takes in an already initialized encoder schedule and decodes a DataContainer. The in_or_out string indicates whether the data is input ("in") or output ("out") data (and is thus decoded differently).

CalibrateEmulateSample.Utilities.decode_with_schedule - Method
decode_with_schedule(
    encoder_schedule::AbstractVector,
    structure_matrix::Union{LinearAlgebra.UniformScaling, AbstractMatrix},
    in_or_out::AbstractString
) -> Any

Takes in an already initialized encoder schedule and decodes a structure matrix. The in_or_out string indicates whether the structure matrix is for the input ("in") or output ("out") space (and is thus decoded differently).

CalibrateEmulateSample.Utilities.decode_with_schedule - Method
decode_with_schedule(
    encoder_schedule::AbstractVector,
    io_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer,
    input_structure_mat::Union{LinearAlgebra.UniformScaling, AbstractMatrix},
    output_structure_mat::Union{LinearAlgebra.UniformScaling, AbstractMatrix}
) -> Tuple{EnsembleKalmanProcesses.DataContainers.PairedDataContainer, Any, Any}

Takes in an already initialized encoder schedule and decodes a PairedDataContainer, together with the input and output structure matrices.

CalibrateEmulateSample.Utilities.decorrelate - Method
decorrelate(
;
    retain_var,
    decorrelate_with
) -> CalibrateEmulateSample.Utilities.Decorrelater{Vector{Any}, Vector{Any}, Vector{Any}, Float64, String}

Constructs the Decorrelater struct. Users can add optional keyword arguments:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
  • decorrelate_with[="combined"]: which matrix provides the subspace directions; the options are "structure_mat" (see decorrelate_structure_mat), "sample_cov" (see decorrelate_sample_cov), and "combined" (the sum of the two)
CalibrateEmulateSample.Utilities.decorrelate_sample_cov - Method
decorrelate_sample_cov(
;
    retain_var
) -> CalibrateEmulateSample.Utilities.Decorrelater{Vector{Any}, Vector{Any}, Vector{Any}, Float64, String}

Constructs the Decorrelater struct, setting decorrelate_with = "sample_cov". Encoding data with this will ensure that the distribution of data samples after encoding will be Normal(0,I). One can additionally add keywords:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
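
As a rough illustration of how a retain_var threshold can translate into a number of retained singular vectors, the sketch below keeps the smallest leading set whose share of the total reaches the threshold. The helper n_retained is hypothetical, and measuring "variance" by the singular values of the covariance-like matrix is an assumption rather than something stated above.

```julia
# Hypothetical helper: number of leading singular values whose cumulative
# share of the total first reaches retain_var (assumed definition).
function n_retained(S::AbstractVector{<:Real}, retain_var::Real)
    0 < retain_var <= 1 || throw(ArgumentError("retain_var must be in (0, 1]"))
    retain_var == 1 && return length(S)       # 1 implies no truncation
    frac = cumsum(S) ./ sum(S)                # cumulative fraction retained
    k = findfirst(>=(retain_var), frac)
    return k === nothing ? length(S) : k
end

S = [4.0, 3.0, 2.0, 1.0]    # singular values, sorted in decreasing order
n_retained(S, 1.0)          # 4, no truncation
n_retained(S, 0.75)         # 3, since (4 + 3 + 2) / 10 = 0.9 >= 0.75
n_retained(S, 0.5)          # 2, since (4 + 3) / 10 = 0.7 >= 0.5
```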
CalibrateEmulateSample.Utilities.decorrelate_structure_mat - Method
decorrelate_structure_mat(
;
    retain_var
) -> CalibrateEmulateSample.Utilities.Decorrelater{Vector{Any}, Vector{Any}, Vector{Any}, Float64, String}

Constructs the Decorrelater struct, setting decorrelate_with = "structure_mat". This encoding will transform a provided structure matrix into I. One can additionally add keywords:

  • retain_var[=1.0]: to project onto the leading singular vectors such that retain_var variance is retained
CalibrateEmulateSample.Utilities.encode_with_schedule - Method
encode_with_schedule(
    encoder_schedule::AbstractVector,
    data_container::EnsembleKalmanProcesses.DataContainers.DataContainer,
    in_or_out::AbstractString
) -> EnsembleKalmanProcesses.DataContainers.DataContainer

Takes in an already initialized encoder schedule and encodes a DataContainer. The in_or_out string indicates whether the data is input ("in") or output ("out") data (and is thus encoded differently).
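
The relationship between encoding and decoding with a schedule can be sketched with a toy example. Everything here is hypothetical: ToyAffine and the toy_* functions stand in for initialized processors, and the schedule uses simplified two-element entries. Only the entries labelled for the requested side are applied, and decoding undoes them in reverse order.

```julia
# Hypothetical toy processor: an invertible affine map on a data matrix.
struct ToyAffine
    scale::Float64
    shift::Float64
end
toy_encode(p::ToyAffine, data) = p.scale .* data .+ p.shift
toy_decode(p::ToyAffine, data) = (data .- p.shift) ./ p.scale

# Apply, in order, only the entries whose label matches in_or_out.
function toy_encode_with_schedule(schedule, data, in_or_out::AbstractString)
    for (proc, apply_to) in schedule
        apply_to == in_or_out && (data = toy_encode(proc, data))
    end
    return data
end

# Undo the same entries in reverse order.
function toy_decode_with_schedule(schedule, data, in_or_out::AbstractString)
    for (proc, apply_to) in reverse(schedule)
        apply_to == in_or_out && (data = toy_decode(proc, data))
    end
    return data
end

schedule = [(ToyAffine(2.0, 1.0), "in"), (ToyAffine(0.5, -3.0), "out"), (ToyAffine(10.0, 0.0), "in")]
inputs = [1.0 2.0; 3.0 4.0]
encoded = toy_encode_with_schedule(schedule, inputs, "in")    # 10 * (2 * inputs + 1)
decoded = toy_decode_with_schedule(schedule, encoded, "in")   # recovers inputs
```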

CalibrateEmulateSample.Utilities.encode_with_schedule - Method
encode_with_schedule(
    encoder_schedule::AbstractVector,
    structure_matrix::Union{LinearAlgebra.UniformScaling, AbstractMatrix},
    in_or_out::AbstractString
) -> Any

Takes in an already initialized encoder schedule and encodes a structure matrix. The in_or_out string indicates whether the structure matrix is for the input ("in") or output ("out") space (and is thus encoded differently).

CalibrateEmulateSample.Utilities.encode_with_schedule - Method
encode_with_schedule(
    encoder_schedule::AbstractVector,
    io_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer,
    input_structure_mat::Union{LinearAlgebra.UniformScaling, AbstractMatrix},
    output_structure_mat::Union{LinearAlgebra.UniformScaling, AbstractMatrix}
) -> Tuple{EnsembleKalmanProcesses.DataContainers.PairedDataContainer, Any, Any}

Takes in the created encoder schedule (see create_encoder_schedule), initializes it, and encodes the paired data container and structure matrices with it.

CalibrateEmulateSample.Utilities.get_training_points - Method
get_training_points(
    ekp::EnsembleKalmanProcesses.EnsembleKalmanProcess{FT, IT, P},
    train_iterations::Union{AbstractVector{IT}, IT} where IT
) -> EnsembleKalmanProcesses.DataContainers.PairedDataContainer

Extract the training points needed to train the Gaussian process regression.

  • ekp - EnsembleKalmanProcess holding the parameters and the data that were produced during the Ensemble Kalman (EK) process.
  • train_iterations - Number (or indices) of EK layers/iterations to train on.
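
The idea of collecting training pairs across iterations can be sketched without the EnsembleKalmanProcess type. The example below uses plain vectors of matrices as a stand-in for the stored ensembles; the helper stack_training_points is hypothetical, and interpreting an integer N as "the first N iterations" is an assumption.

```julia
# Hypothetical stand-in: u[i] is the p × J parameter ensemble and g[i] the
# d × J model-output ensemble at iteration i (columns are ensemble members).
function stack_training_points(u::Vector{<:AbstractMatrix}, g::Vector{<:AbstractMatrix}, train_iterations)
    # Assumption: an integer N selects iterations 1:N; a vector selects indices.
    iters = train_iterations isa Integer ? (1:train_iterations) : train_iterations
    inputs = reduce(hcat, [u[i] for i in iters])
    outputs = reduce(hcat, [g[i] for i in iters])
    return inputs, outputs
end

u = [fill(Float64(i), 2, 3) for i in 1:4]    # 4 iterations, 2 parameters, 3 members
g = [fill(10.0 * i, 1, 3) for i in 1:4]      # 4 iterations, 1 output, 3 members
inputs, outputs = stack_training_points(u, g, [2, 4])
```

Here inputs is 2 × 6 and outputs is 1 × 6, with matching columns forming the input-output training pairs.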
CalibrateEmulateSample.Utilities.initialize_and_encode_data! - Method
initialize_and_encode_data!(
    dcp::CalibrateEmulateSample.Utilities.DataContainerProcessor,
    data::AbstractMatrix,
    structure_mat::Union{LinearAlgebra.UniformScaling, AbstractMatrix},
    apply_to::AbstractString
) -> Any

Initializes the DataContainerProcessor encoder (often requires data, and structure matrices), then encodes the provided columns-are-data matrix

CalibrateEmulateSample.Utilities.initialize_and_encode_data! - Method
initialize_and_encode_data!(
    dcp::CalibrateEmulateSample.Utilities.PairedDataContainerProcessor,
    data,
    structure_mat::Union{LinearAlgebra.UniformScaling, AbstractMatrix},
    apply_to::AbstractString
) -> Any

Initializes the PairedDataContainerProcessor encoder (often requires input and output data, and structure matrices), then encodes either the input or output data (pair of columns-are-data matrices) based on apply_to.

CalibrateEmulateSample.Utilities.initialize_processor! - Method
initialize_processor!(
    cc::CalibrateEmulateSample.Utilities.CanonicalCorrelation,
    in_data::AbstractMatrix,
    out_data::AbstractMatrix,
    structure_matrix,
    apply_to::AbstractString
) -> Any

Computes and populates the data_mean, encoder_mat, decoder_mat and apply_to fields for the CanonicalCorrelation

CalibrateEmulateSample.Utilities.initialize_processor! - Method
initialize_processor!(
    dd::CalibrateEmulateSample.Utilities.Decorrelater,
    data::AbstractMatrix,
    structure_matrix::Union{LinearAlgebra.UniformScaling, AbstractMatrix}
) -> Any

Computes and populates the data_mean, encoder_mat, and decoder_mat fields for the Decorrelater.

CalibrateEmulateSample.Utilities.minmax_scale - Method
minmax_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.MinMaxScaling, Vector{Float64}}

Constructs an ElementwiseScaler{MinMaxScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x - \min(x)}{\max(x) - \min(x)}$ to each data dimension.
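
A minimal standalone sketch of this transform, assuming a d × n columns-are-data matrix so that each row is one data dimension. The helper minmax_rows is hypothetical and is not how the processor is invoked in practice.

```julia
# Hypothetical helper: min-max scale each row (data dimension) of a d × n matrix.
function minmax_rows(data::AbstractMatrix)
    lo = minimum(data, dims = 2)
    hi = maximum(data, dims = 2)
    return (data .- lo) ./ (hi .- lo)
end

data = [1.0 2.0 5.0; -4.0 0.0 4.0]
scaled = minmax_rows(data)    # [0.0 0.25 1.0; 0.0 0.5 1.0]
```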

CalibrateEmulateSample.Utilities.quartile_scale - Method
quartile_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.QuartileScaling, Vector{Float64}}

Constructs an ElementwiseScaler{QuartileScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x - Q2(x)}{Q3(x) - Q1(x)}$ to each data dimension, where $Q1$, $Q2$, and $Q3$ are the first quartile, median, and third quartile. Also known as "robust scaling".
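
The effect of quartile scaling on data with an outlier can be seen in a short sketch. It relies on quantile from the Statistics standard library with its default interpolation, which may differ from the quartile estimator the processor uses; the helper quartile_rows is hypothetical.

```julia
using Statistics

# Hypothetical helper: centre each row on its median and divide by its
# interquartile range (rows are data dimensions, columns are samples).
function quartile_rows(data::AbstractMatrix)
    scaled = similar(data, Float64)
    for (i, row) in enumerate(eachrow(data))
        q1, q2, q3 = quantile(row, [0.25, 0.5, 0.75])
        scaled[i, :] = (row .- q2) ./ (q3 - q1)
    end
    return scaled
end

# The outlier 100.0 in the first row does not affect the quartiles (2, 3, 4).
data = [1.0 2.0 3.0 4.0 100.0; 0.0 10.0 20.0 30.0 40.0]
scaled = quartile_rows(data)
```

The non-outlying entries of both rows land on the same scale, which is the sense in which this scaling is robust.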

CalibrateEmulateSample.Utilities.zscore_scale - Method
zscore_scale(

) -> CalibrateEmulateSample.Utilities.ElementwiseScaler{CalibrateEmulateSample.Utilities.ZScoreScaling, Vector{Float64}}

Constructs an ElementwiseScaler{ZScoreScaling} processor. As part of an encoder schedule, this will apply the transform $\frac{x-\mu}{\sigma}$ (where $x\sim N(\mu,\sigma)$) to each data dimension. For multivariate standardization, see Decorrelater.
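
A per-dimension z-score can be sketched as follows, estimating $\mu$ and $\sigma$ from the samples. The helper zscore_rows is hypothetical, and the use of the sample standard deviation (n - 1 normalization, the default of std) is an assumption.

```julia
using Statistics

# Hypothetical helper: standardize each row (data dimension) of a d × n matrix.
function zscore_rows(data::AbstractMatrix)
    mu = mean(data, dims = 2)
    sigma = std(data, dims = 2)    # sample standard deviation per row
    return (data .- mu) ./ sigma
end

data = [1.0 2.0 3.0; 10.0 20.0 60.0]
scaled = zscore_rows(data)         # each row now has mean 0 and unit std
```

Unlike the Decorrelater, this treats each dimension independently and leaves correlations between dimensions unchanged.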
