RandomFeatures

Kernel and Covariance structure

CalibrateEmulateSample.Emulators.LowRankFactor - Type
struct LowRankFactor{FT<:AbstractFloat} <: CalibrateEmulateSample.Emulators.CovarianceStructureType

Builds a covariance structure that deviates from the identity by a low-rank perturbation. This perturbation is diagonalized in the low-rank space.

source
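To make "identity plus a low-rank perturbation, diagonalized in the low-rank space" concrete, here is a minimal sketch using only the standard library. It assumes the parameterization C = I + U D Uᵀ with U having orthonormal columns and D diagonal; the helper name `lowrank_cov_sketch` is hypothetical and this is not the package's internal implementation.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not part of CalibrateEmulateSample):
# builds C = I + U * Diagonal(d) * U' for an n × r matrix U.
function lowrank_cov_sketch(U::AbstractMatrix, d::AbstractVector)
    n = size(U, 1)
    return Symmetric(Matrix{Float64}(I, n, n) + U * Diagonal(d) * U')
end

rng = MersenneTwister(42)
n, r = 6, 2
# Orthonormal basis of a random r-dimensional subspace (thin Q factor).
U = Matrix(qr(randn(rng, n, r)).Q)[:, 1:r]
d = [2.0, 0.5]   # perturbation strengths in the low-rank space
C = lowrank_cov_sketch(U, d)

# Along the columns of U the eigenvalues are 1 .+ d; elsewhere they stay at 1.
eigvals(C)
```

Under this assumption only n*r + r numbers describe the perturbation, instead of the n(n+1)/2 entries of a full covariance matrix.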
CalibrateEmulateSample.Emulators.SeparableKernel - Type
struct SeparableKernel{CST1<:CalibrateEmulateSample.Emulators.CovarianceStructureType, CST2<:CalibrateEmulateSample.Emulators.CovarianceStructureType} <: CalibrateEmulateSample.Emulators.KernelStructureType

Builds a separable kernel, i.e. one that accounts for the input and output covariance structures separately.

source
CalibrateEmulateSample.Emulators.NonseparableKernel - Type
struct NonseparableKernel{CST<:CalibrateEmulateSample.Emulators.CovarianceStructureType} <: CalibrateEmulateSample.Emulators.KernelStructureType

Builds a nonseparable kernel, i.e. one that accounts for a joint input and output covariance structure.

source
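The difference between the separable and nonseparable structures can be illustrated with a small sketch. It assumes that "separable" corresponds to a Kronecker product of an output and an input covariance, and that "nonseparable" corresponds to one joint covariance over all input-output pairs. The ordering of the Kronecker factors and all variable names are illustrative assumptions, not the package's internals.

```julia
using LinearAlgebra

d_in, d_out = 3, 2

# Illustrative symmetric positive definite covariances for inputs and outputs.
C_in = [2.0 0.5 0.0; 0.5 1.0 0.2; 0.0 0.2 1.5]
C_out = [1.0 0.3; 0.3 2.0]

# Separable structure: the joint covariance is fully determined by two factors.
C_sep = kron(C_out, C_in)   # (d_in * d_out) × (d_in * d_out)

# Free parameters of two symmetric factors versus one joint symmetric matrix.
n_params_sep = div(d_in * (d_in + 1), 2) + div(d_out * (d_out + 1), 2)
D = d_in * d_out
n_params_joint = div(D * (D + 1), 2)
```

In this toy case the separable form has 9 free parameters against 21 for the joint form, which is the usual trade-off: fewer hyperparameters to learn, at the cost of not representing input-output cross-structure.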
CalibrateEmulateSample.Emulators.build_default_prior - Function
build_default_prior(
    input_dim::Int64,
    output_dim::Int64,
    kernel_structure::CalibrateEmulateSample.Emulators.SeparableKernel
) -> Any

Builds a prior distribution for the kernel hyperparameters to initialize optimization. The parameter distributions built from these priors will be scaled such that the input and output range of the data is O(1).

source

Scalar interface

CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface - Type
struct ScalarRandomFeatureInterface{S<:AbstractString, RNG<:Random.AbstractRNG, KST<:CalibrateEmulateSample.Emulators.KernelStructureType} <: CalibrateEmulateSample.Emulators.RandomFeatureInterface

Structure holding the Scalar Random Feature models.

Fields

  • rfms::Vector{RandomFeatures.Methods.RandomFeatureMethod}: vector of RandomFeatureMethods, containing the feature structure, batch sizes, and regularization

  • fitted_features::Vector{RandomFeatures.Methods.Fit}: vector of Fits, containing the matrix decomposition and coefficients of RF when fitted to data

  • batch_sizes::Union{Nothing, Dict{S, Int64}} where S<:AbstractString: batch sizes

  • n_features::Union{Nothing, Int64}: number of features

  • input_dim::Int64: input dimension

  • rng::Random.AbstractRNG: choice of random number generator

  • regularization::Vector{Union{LinearAlgebra.Diagonal, LinearAlgebra.UniformScaling, Matrix}}: regularization

  • kernel_structure::CalibrateEmulateSample.Emulators.KernelStructureType: Kernel structure type (e.g. Separable or Nonseparable)

  • feature_decomposition::AbstractString: Random Feature decomposition, choose from "svd" or "cholesky" (default)

  • optimizer_options::Dict{S} where S<:AbstractString: dictionary of options for hyperparameter optimizer

  • optimizer::Vector: diagnostics from optimizer

source
CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface - Method
ScalarRandomFeatureInterface(
    n_features::Int64,
    input_dim::Int64;
    kernel_structure,
    batch_sizes,
    rng,
    feature_decomposition,
    optimizer_options
) -> CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface{String, Random.TaskLocalRNG, CalibrateEmulateSample.Emulators.SeparableKernel{CST1, CalibrateEmulateSample.Emulators.OneDimFactor}} where CST1<:CalibrateEmulateSample.Emulators.CovarianceStructureType

Constructs a ScalarRandomFeatureInterface <: MachineLearningTool interface for the RandomFeatures.jl package for multi-input and single- (or decorrelated-)output emulators.

  • n_features - the number of random features
  • input_dim - the dimension of the input space
  • kernel_structure - a prescribed form of kernel structure
  • batch_sizes = nothing - Dictionary of batch sizes passed to the RandomFeatures.jl object (see definition there)
  • rng = Random.GLOBAL_RNG - random number generator
  • feature_decomposition = "cholesky" - choice of how to store decompositions of random features; "cholesky" or "svd" available
  • optimizer_options = nothing - Dict of options to pass into the EKI optimization of hyperparameters (defaults created in the ScalarRandomFeatureInterface constructor):
    • "prior": the prior for the hyperparameter optimization
    • "n_ensemble": number of ensemble members
    • "n_iteration": number of EKI iterations
    • "cov_sample_multiplier": increase for more samples to estimate the covariance matrix in optimization (default 10.0, minimum 0.0)
    • "scheduler": learning rate scheduler (a.k.a. EKP timestepper); default: DataMisfitController
    • "inflation": additive inflation ∈ [0,1], with 0 being no inflation
    • "train_fraction": e.g. 0.8 (default) means an 80:20 train-test split
    • "n_features_opt": fix the number of features for optimization (default n_features, as used for prediction)
    • "multithread": how to multithread; "ensemble" (default) threads across ensemble members, "tullio" threads random feature matrix algebra
    • "accelerator": use EKP accelerators (default is no acceleration)
    • "verbose" => false: verbose optimizer statements
    • "cov_correction" => "nice": type of conditioning to improve the estimated covariance; "shrinkage" or "shrinkage_corr" (Ledoit and Wolf 2003), or "nice" (Vishny, Morzfeld et al. 2024)
    • "overfit" => 1.0: if > 1.0, forcibly overfit/under-regularize the optimizer cost (vice versa for < 1.0)
    • "n_cross_val_sets" => 2: "train_fraction" creates (default 5) train-test data subsets, then "n_cross_val_sets" of these are stacked in the loss function. If set to 0, train=test on the full data provided, ignoring "train_fraction".
source
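As a usage sketch, the options above are collected in a Dict keyed by strings and passed through the `optimizer_options` keyword. The values below are illustrative assumptions rather than recommendations, and the constructor call is left as a comment so the snippet runs without the package installed.

```julia
# Illustrative values only; the keys are taken from the option list above.
n_features = 100
input_dim = 3

optimizer_options = Dict(
    "n_ensemble" => 40,        # ensemble members in the hyperparameter optimization
    "n_iteration" => 5,        # number of EKI iterations
    "train_fraction" => 0.8,   # 80:20 train-test split
    "verbose" => false,        # no verbose optimizer statements
)

# With CalibrateEmulateSample loaded, these would be passed as, for example:
# srfi = ScalarRandomFeatureInterface(
#     n_features,
#     input_dim;
#     optimizer_options = optimizer_options,
# )
```

Per the description above, defaults are created in the constructor, so a Dict like this presumably only needs the entries you want to change.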
CalibrateEmulateSample.Emulators.build_models! - Method
build_models!(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface,
    input_output_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT<:AbstractFloat},
    input_structure_mats,
    output_structure_mats
) -> Union{Nothing, Vector}

Builds the random feature method from hyperparameters. We use cosine activation functions and a Multivariate Normal distribution (from Distributions.jl) with mean M=0, and input covariance U built with the CovarianceStructureType.

source
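The feature construction described above (cosine activation, frequencies drawn from a zero-mean multivariate normal with covariance U) can be sketched with the standard library alone. The uniform phase shift on [0, 2π), the sqrt(2) scaling, and the helper name `cosine_features_sketch` are assumptions chosen so that the features approximate a Gaussian kernel; they are not taken from the package source.

```julia
using LinearAlgebra, Random

# Hypothetical helper: X is d × n (inputs as columns), Xi is d × m (one
# frequency vector per feature), b holds m phase shifts. Returns m × n features.
function cosine_features_sketch(X::AbstractMatrix, Xi::AbstractMatrix, b::AbstractVector)
    return sqrt(2) .* cos.(Xi' * X .+ b)
end

rng = MersenneTwister(0)
d, m = 3, 20_000
U = Matrix{Float64}(I, d, d)        # input covariance (identity for this sketch)
L = cholesky(Symmetric(U)).L
Xi = L * randn(rng, d, m)           # columns are draws from N(0, U)
b = 2π .* rand(rng, m)              # phases, uniform on [0, 2π)

x = [0.1, -0.2, 0.3]
y = [0.0, 0.4, -0.1]
Phi = cosine_features_sketch(hcat(x, y), Xi, b)

# The feature inner product is a Monte Carlo estimate of the kernel
# k(x, y) = exp(-(x - y)' * U * (x - y) / 2).
k_rf = dot(Phi[:, 1], Phi[:, 2]) / m
k_exact = exp(-0.5 * dot(x - y, U * (x - y)))
```

Changing U changes the kernel being approximated, which is why the covariance structure types above act as the hyperparameters being learned.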
GaussianProcesses.predict - Method
predict(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface,
    new_inputs::AbstractMatrix;
    multithread
) -> Tuple{Any, Any}

Prediction of emulator mean at new inputs (passed in as columns in a matrix), and a prediction of the total covariance at new inputs equal to (emulator covariance + noise covariance).

source

Vector interface

CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface - Type
struct VectorRandomFeatureInterface{S<:AbstractString, RNG<:Random.AbstractRNG, KST<:CalibrateEmulateSample.Emulators.KernelStructureType} <: CalibrateEmulateSample.Emulators.RandomFeatureInterface

Structure holding the Vector Random Feature models.

Fields

  • rfms::Vector{RandomFeatures.Methods.RandomFeatureMethod}: vector of RandomFeatureMethods, containing the feature structure, batch sizes, and regularization

  • fitted_features::Vector{RandomFeatures.Methods.Fit}: vector of Fits, containing the matrix decomposition and coefficients of RF when fitted to data

  • batch_sizes::Union{Nothing, Dict{S, Int64}} where S<:AbstractString: batch sizes

  • n_features::Union{Nothing, Int64}: number of features

  • input_dim::Int64: input dimension

  • output_dim::Int64: output dimension

  • rng::Random.AbstractRNG: choice of random number generator

  • regularization::Vector{Union{LinearAlgebra.Diagonal, LinearAlgebra.UniformScaling, Matrix}}: regularization

  • kernel_structure::CalibrateEmulateSample.Emulators.KernelStructureType: Kernel structure type (e.g. Separable or Nonseparable)

  • feature_decomposition::AbstractString: Random Feature decomposition, choose from "svd" or "cholesky" (default)

  • optimizer_options::Dict: dictionary of options for hyperparameter optimizer

  • optimizer::Vector: diagnostics from optimizer

source
CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface - Method
VectorRandomFeatureInterface(
    n_features::Int64,
    input_dim::Int64,
    output_dim::Int64;
    kernel_structure,
    batch_sizes,
    rng,
    feature_decomposition,
    optimizer_options
) -> CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface{String, Random.TaskLocalRNG, CalibrateEmulateSample.Emulators.SeparableKernel{CST1, CST2}} where {CST1<:Union{CalibrateEmulateSample.Emulators.OneDimFactor, CalibrateEmulateSample.Emulators.CholeskyFactor{Float64}, CalibrateEmulateSample.Emulators.DiagonalFactor{Float64}, CalibrateEmulateSample.Emulators.HierarchicalLowRankFactor{Float64}, CalibrateEmulateSample.Emulators.LowRankFactor{Float64}}, CST2<:Union{CalibrateEmulateSample.Emulators.OneDimFactor, CalibrateEmulateSample.Emulators.CholeskyFactor{Float64}, CalibrateEmulateSample.Emulators.DiagonalFactor{Float64}, CalibrateEmulateSample.Emulators.HierarchicalLowRankFactor{Float64}, CalibrateEmulateSample.Emulators.LowRankFactor{Float64}}}

Constructs a VectorRandomFeatureInterface <: MachineLearningTool interface for the RandomFeatures.jl package for multi-input and multi-output emulators.

  • n_features - the number of random features
  • input_dim - the dimension of the input space
  • output_dim - the dimension of the output space
  • kernel_structure - a prescribed form of kernel structure
  • batch_sizes = nothing - Dictionary of batch sizes passed to the RandomFeatures.jl object (see definition there)
  • rng = Random.GLOBAL_RNG - random number generator
  • feature_decomposition = "cholesky" - choice of how to store decompositions of random features; "cholesky" or "svd" available
  • optimizer_options = nothing - Dict of options to pass into the EKI optimization of hyperparameters (defaults created in the VectorRandomFeatureInterface constructor):
    • "prior": the prior for the hyperparameter optimization
    • "prior_in_scale"/"prior_out_scale": use these to tune the input/output prior scale
    • "n_ensemble": number of ensemble members
    • "n_iteration": number of EKI iterations
    • "scheduler": learning rate scheduler (a.k.a. EKP timestepper); default: DataMisfitController
    • "cov_sample_multiplier": increase for more samples to estimate the covariance matrix in optimization (default 10.0, minimum 0.0)
    • "inflation": additive inflation ∈ [0,1], with 0 being no inflation
    • "train_fraction": e.g. 0.8 (default) means an 80:20 train-test split
    • "n_features_opt": fix the number of features for optimization (default n_features, as used for prediction)
    • "multithread": how to multithread; "ensemble" (default) threads across ensemble members, "tullio" threads random feature matrix algebra
    • "accelerator": use EKP accelerators (default is no acceleration)
    • "verbose" => false: verbose optimizer statements to check convergence, priors, and optimal parameters
    • "cov_correction" => "nice": type of conditioning to improve the estimated covariance; "shrinkage" or "shrinkage_corr" (Ledoit and Wolf 2003), or "nice" (Vishny, Morzfeld et al. 2024)
    • "overfit" => 1.0: if > 1.0, forcibly overfit/under-regularize the optimizer cost (vice versa for < 1.0)
    • "n_cross_val_sets" => 2: "train_fraction" creates (default 5) train-test data subsets, then "n_cross_val_sets" of these are stacked in the loss function. If set to 0, train=test on the full data provided, ignoring "train_fraction".
source
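The interaction between "train_fraction" and the cross-validation-set option in the list above can be pictured with the following sketch. It assumes the data indices are shuffled once and that each cross-validation set holds out a different block of test points; the function name `cross_val_split_sketch` and this exact partitioning scheme are hypothetical and may differ from what the package does internally.

```julia
using Random

# Hypothetical helper: returns vectors of train and test index sets.
# Assumes n_sets * n_test <= n_data, so the held-out blocks do not run out of data.
function cross_val_split_sketch(rng, n_data::Int, train_fraction::Real, n_sets::Int)
    # n_sets == 0: train and test on all data, ignoring train_fraction.
    n_sets == 0 && return [collect(1:n_data)], [collect(1:n_data)]
    n_test = n_data - floor(Int, train_fraction * n_data)
    shuffled = randperm(rng, n_data)
    # Each set holds out a different contiguous block of the shuffled indices.
    test_idx = [shuffled[((i - 1) * n_test + 1):(i * n_test)] for i in 1:n_sets]
    train_idx = [setdiff(shuffled, t) for t in test_idx]
    return train_idx, test_idx
end

# 50 data points, 80:20 split, 2 of the 5 possible held-out blocks are used.
train_idx, test_idx = cross_val_split_sketch(MersenneTwister(1), 50, 0.8, 2)
```

With a train fraction of 0.8 there are five disjoint test blocks available, which matches the "default 5" subsets mentioned in the list; using only two of them keeps the loss evaluation cheaper.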
CalibrateEmulateSample.Emulators.build_models! - Method
build_models!(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface,
    input_output_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT<:AbstractFloat},
    input_structure_mats,
    output_structure_mats
) -> Union{Nothing, Vector{Union{LinearAlgebra.Diagonal, LinearAlgebra.UniformScaling, Matrix}}}

Builds the Vector Random Feature model for the input-output pairs subject to regularization, and optimizes the hyperparameters with EKP.

source
GaussianProcesses.predict - Method
predict(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface,
    new_inputs::AbstractMatrix
) -> Tuple{Any, Any}

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source

Other utilities

CalibrateEmulateSample.Emulators.get_rfms - Function
get_rfms(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Vector{RandomFeatures.Methods.RandomFeatureMethod}

Gets the rfms field

source
get_rfms(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Vector{RandomFeatures.Methods.RandomFeatureMethod}

Gets the rfms field

source
CalibrateEmulateSample.Emulators.get_fitted_features - Function
get_fitted_features(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Vector{RandomFeatures.Methods.Fit}

Gets the fitted_features field

source
get_fitted_features(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Vector{RandomFeatures.Methods.Fit}

Gets the fitted_features field

source
CalibrateEmulateSample.Emulators.get_batch_sizes - Function
get_batch_sizes(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Union{Nothing, Dict{S, Int64}} where S<:AbstractString

Gets the batch_sizes field

source
get_batch_sizes(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Union{Nothing, Dict{S, Int64}} where S<:AbstractString

Gets the batch_sizes field

source
CalibrateEmulateSample.Emulators.get_n_features - Function
get_n_features(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Union{Nothing, Int64}

Gets the n_features field

source
get_n_features(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Union{Nothing, Int64}

Gets the n_features field

source
CalibrateEmulateSample.Emulators.get_input_dim - Function
get_input_dim(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Int64

Gets the input_dim field

source
get_input_dim(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Int64

Gets the input_dim field

source
EnsembleKalmanProcesses.get_rng - Function
get_rng(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Random.AbstractRNG

Gets the rng field

source
get_rng(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Random.AbstractRNG

Gets the rng field

source
CalibrateEmulateSample.Emulators.get_kernel_structure - Function
get_kernel_structure(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> CalibrateEmulateSample.Emulators.KernelStructureType

Gets the kernel_structure field

source
get_kernel_structure(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> CalibrateEmulateSample.Emulators.KernelStructureType

Gets the kernel_structure field

source
CalibrateEmulateSample.Emulators.get_feature_decomposition - Function
get_feature_decomposition(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> AbstractString

Gets the feature_decomposition field

source
get_feature_decomposition(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> AbstractString

Gets the feature_decomposition field

source
CalibrateEmulateSample.Emulators.get_optimizer_options - Function
get_optimizer_options(
    srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
) -> Dict{S} where S<:AbstractString

Gets the optimizer_options field

source
get_optimizer_options(
    vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
) -> Dict

Gets the optimizer_options field

source
CalibrateEmulateSample.Emulators.shrinkage_cov - Function
shrinkage_cov(
    sample_mat::AbstractMatrix;
    cov_or_corr,
    verbose
) -> Any

Calculates the empirical covariance, additionally applying a shrinkage operator (here the Ledoit and Wolf 2004 shrinkage operation). This estimator is known to have better stability properties than the plain Monte Carlo estimate for low sample sizes.

source
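For intuition about what a shrinkage operator does, the following sketch implements a linear shrinkage of the sample covariance towards a scaled identity, in the spirit of Ledoit and Wolf (2004). It assumes samples are stored as columns of the input matrix; the function name `lw_shrinkage_sketch` is hypothetical, and this is not the package's `shrinkage_cov` implementation, which may differ in conventions and options.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical helper: X is p × n, with each column one sample.
function lw_shrinkage_sketch(X::AbstractMatrix)
    p, n = size(X)
    Xc = X .- mean(X, dims = 2)              # center each variable
    S = (Xc * Xc') / n                       # empirical covariance
    m = tr(S) / p                            # scale of the identity target
    d2 = sum(abs2, S - m * I) / p            # distance of S from the target
    # Estimated sampling error of S, capped at d2 so the weight stays in [0, 1].
    b2_bar = sum(sum(abs2, Xc[:, k] * Xc[:, k]' - S) for k in 1:n) / (p * n^2)
    b2 = min(b2_bar, d2)
    rho = d2 > 0 ? b2 / d2 : 0.0             # shrinkage weight
    return rho * m * Matrix{Float64}(I, p, p) + (1 - rho) * S
end

# Fewer samples (5) than dimensions (10): the empirical covariance is singular,
# while the shrunk estimate is positive definite.
rng = MersenneTwister(1)
X = randn(rng, 10, 5)
C = lw_shrinkage_sketch(X)
```

The shrunk estimate keeps the trace of the empirical covariance while pulling the eigenvalues towards their mean, which is what improves conditioning when samples are scarce.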