The Emulate stage
Emulation is performed through the construction of an Emulator object, which has two components:
- A wrapper for any statistical emulator,
- Data-processing and dimensionality reduction functionality.
Typical construction from Lorenz_example.jl
First, obtain data in a PairedDataContainer; for example, get this from an EnsembleKalmanProcess ekpobj generated during the Calibrate stage, or see the constructor here.
using CalibrateEmulateSample.Utilities
input_output_pairs = Utilities.get_training_points(ekpobj, 5) # use first 5 iterations as data
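Conceptually, collecting training points amounts to stacking the parameter ensembles and the corresponding model outputs from the chosen iterations into paired input and output matrices. The sketch below illustrates this with plain matrices rather than the package's PairedDataContainer; the helper stack_training_points and the assumption that data points are stored as columns are ours, for illustration only, and this is not the implementation of get_training_points.

```julia
# Hypothetical stand-in: plain matrices are used in place of the package's
# PairedDataContainer, and stack_training_points is not a package function.
# Each iteration stores an ensemble of parameters (input_dim × N_ens) and the
# corresponding model outputs (output_dim × N_ens), with data points as columns.
function stack_training_points(
    params::Vector{<:AbstractMatrix},
    outputs::Vector{<:AbstractMatrix},
    n_iterations::Int,
)
    # Concatenate the ensembles of the first n_iterations column-wise.
    inputs = reduce(hcat, params[1:n_iterations])
    outs = reduce(hcat, outputs[1:n_iterations])
    return inputs, outs
end

input_dim, output_dim, N_ens, N_iter = 2, 3, 4, 6
params = [randn(input_dim, N_ens) for _ in 1:N_iter]
outputs = [randn(output_dim, N_ens) for _ in 1:N_iter]

# Use the first 5 iterations as data: 5 * 4 = 20 paired columns.
inputs, outs = stack_training_points(params, outputs, 5)
size(inputs), size(outs)  # ((2, 20), (3, 20))
```

Column j of the inputs is paired with column j of the outputs, which is the pairing the emulator is trained on.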
Wrapping a predefined machine learning tool, e.g. a Gaussian process gauss_proc, the Emulator can then be built:
emulator = Emulator(
    gauss_proc,
    input_output_pairs; # optional arguments after this
    output_structure_matrix = Γy,
    encoder_schedule = encoder_schedule,
)
The optional arguments above relate to the data processing, which is described here.
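As one illustration of what a structure matrix such as Γy makes possible, the outputs can be rotated and rescaled so that the noise covariance becomes the identity in the new coordinates. The sketch below uses only the standard library; the numerical values and the helper names to_decorrelated and to_original are assumptions for illustration. Whether and how the package applies such a step depends on the encoder schedule, so this should not be read as the package's implementation.

```julia
using LinearAlgebra

# Assumed noise covariance of the outputs (symmetric positive definite).
Γy = [2.0 0.6; 0.6 1.0]
# Assumed training outputs, stored as output_dim × N.
outputs = [1.0 2.0 3.0; 0.5 0.1 -0.4]

# Eigendecomposition Γy = V * Diagonal(λ) * V'.
λ, V = eigen(Symmetric(Γy))

# Hypothetical helpers: rotate into the eigenbasis and rescale by 1/sqrt(λ),
# so the noise covariance in the new coordinates is the identity.
to_decorrelated(y) = Diagonal(1 ./ sqrt.(λ)) * V' * y
# Inverse map, back to the original output coordinates.
to_original(z) = V * Diagonal(sqrt.(λ)) * z

z = to_decorrelated(outputs)
```

An emulator trained on z can then treat each output dimension independently, and its predictions are mapped back with to_original.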
Emulator Training
The emulator is trained when we combine the machine learning tool and the data into the Emulator above. For any machine learning tool, the hyperparameters are then optimized by calling
optimize_hyperparameters!(emulator)
For some machine learning packages, however, this optimization is completed automatically during construction; for others it is not. If it has already taken place, the optimize_hyperparameters! line performs no new work, so it is always safe to call. In the Lorenz example, this line learns the hyperparameters of the Gaussian process, which depend on the choice of kernel and on the choice of GP package. Predictions at new inputs can then be made using
y, cov = Emulators.predict(emulator, new_inputs)
This returns both a mean value and a covariance.
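To make the meaning of the returned mean and covariance concrete, here is a minimal, self-contained sketch of Gaussian process prediction for a scalar output, written with the standard library only. The function names sqexp_kernel and toy_gp_predict, the squared-exponential kernel, and the fixed hyperparameter values are all assumptions for illustration; the package delegates this work to the wrapped GP library and tunes the hyperparameters via optimize_hyperparameters!.

```julia
using LinearAlgebra

# Squared-exponential kernel between the columns of X1 and X2 (input_dim × N).
function sqexp_kernel(X1::AbstractMatrix, X2::AbstractMatrix; lengthscale = 1.0, variance = 1.0)
    K = zeros(size(X1, 2), size(X2, 2))
    for j in axes(X2, 2), i in axes(X1, 2)
        K[i, j] = variance * exp(-sum(abs2, X1[:, i] .- X2[:, j]) / (2 * lengthscale^2))
    end
    return K
end

# Posterior mean and pointwise variance of a scalar-output GP at new inputs.
# Hyperparameters are held fixed here rather than optimized.
function toy_gp_predict(X, y, X_new; noise = 1e-6)
    K = sqexp_kernel(X, X) + noise * I      # training covariance plus noise
    K_s = sqexp_kernel(X, X_new)            # train/new cross-covariance
    K_ss = sqexp_kernel(X_new, X_new)       # prior covariance at new inputs
    C = cholesky(Symmetric(K))
    μ = K_s' * (C \ y)
    Σ = K_ss - K_s' * (C \ K_s)
    # Return 1 × N_new matrices for the mean and the variance.
    return reshape(μ, 1, :), reshape(diag(Σ), 1, :)
end

X = reshape(collect(0.0:0.5:2.0), 1, :)     # 1 × 5 training inputs
y = vec(sin.(X))                            # training outputs
new_inputs = reshape([0.25, 1.0], 1, :)     # 1 × 2 new inputs
μ, σ2 = toy_gp_predict(X, y, new_inputs)
```

The second new input coincides with a training point, so its predicted variance is close to zero, while the first lies between training points and carries more uncertainty.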
Modular interface
Developers may contribute new tools by performing the following:
- Create MyMLToolName.jl, and include "MyMLToolName.jl" in Emulators.jl
- Create a struct MyMLTool <: MachineLearningTool, containing any arguments or optimizer options
- Create the following three methods to build, train, and predict with your tool (use GaussianProcess.jl as a guide)
build_models!(mlt::MyMLTool, iopairs::PairedDataContainer) -> Nothing
optimize_hyperparameters!(mlt::MyMLTool, args...; kwargs...) -> Nothing
predict(mlt::MyMLTool, new_inputs::Matrix; kwargs...) -> Matrix, Union{Matrix, Array{<:Real, 3}}
The predict method takes as input an input_dim-by-N_new matrix. It returns both a predicted mean and a predicted (co)variance at the new inputs: (i) for scalar-output methods relying on diagonalization, return output_dim-by-N_new matrices for mean and variance; (ii) for vector-output methods, return an output_dim-by-N_new matrix for the mean and an output_dim-by-output_dim-by-N_new array for the covariances.
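The three-method pattern and the expected output shapes can be sketched with a deliberately trivial tool that predicts the training mean of each output. Everything here is a hypothetical stand-in so that the example runs on its own: the abstract type is redefined locally, ConstantMeanTool and predict_full_covariance are invented names, and build_models! takes plain matrices where the package passes a PairedDataContainer.

```julia
using Statistics

# Local stand-in for the package's abstract type (assumption, so the sketch runs alone).
abstract type MachineLearningTool end

# Trivial tool: stores the mean and variance of each output dimension.
mutable struct ConstantMeanTool <: MachineLearningTool
    output_mean::Vector{Float64}
    output_var::Vector{Float64}
end
ConstantMeanTool() = ConstantMeanTool(Float64[], Float64[])

# Build: plain matrices (dim × N) replace the PairedDataContainer here.
function build_models!(mlt::ConstantMeanTool, inputs::AbstractMatrix, outputs::AbstractMatrix)
    mlt.output_mean = vec(mean(outputs, dims = 2))
    mlt.output_var = vec(var(outputs, dims = 2))
    return nothing
end

# Train: this tool has no hyperparameters, so the call is a safe no-op.
optimize_hyperparameters!(mlt::ConstantMeanTool, args...; kwargs...) = nothing

# Predict, case (i): output_dim × N_new matrices for mean and variance.
function predict(mlt::ConstantMeanTool, new_inputs::AbstractMatrix; kwargs...)
    N_new = size(new_inputs, 2)
    return repeat(mlt.output_mean, 1, N_new), repeat(mlt.output_var, 1, N_new)
end

# Case (ii) shape: output_dim × output_dim × N_new covariance array
# (diagonal here, since this toy tool models no cross-correlations).
function predict_full_covariance(mlt::ConstantMeanTool, new_inputs::AbstractMatrix)
    d, N_new = length(mlt.output_mean), size(new_inputs, 2)
    Σ = zeros(d, d, N_new)
    for n in 1:N_new, i in 1:d
        Σ[i, i, n] = mlt.output_var[i]
    end
    return repeat(mlt.output_mean, 1, N_new), Σ
end

tool = ConstantMeanTool()
build_models!(tool, rand(2, 10), rand(3, 10))
optimize_hyperparameters!(tool)
μ, σ2 = predict(tool, rand(2, 4))
μ_full, Σ = predict_full_covariance(tool, rand(2, 4))
size(μ), size(σ2), size(Σ)  # ((3, 4), (3, 4), (3, 3, 4))
```

A real tool would replace the bodies of these methods with calls into its own library while keeping the same input and output shapes.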
Please get in touch with our development team when contributing new statistical emulators, to help us ensure the smoothest interface with any new tools.