Gaussian Process Emulator
One type of MachineLearningTool we provide for emulation is a Gaussian process. Gaussian processes generalize the Gaussian probability distribution from finite collections of random variables to functions. They are well suited to statistical emulation, as they provide both a predictive mean and a predictive covariance. To build a Gaussian process, we first define a prior over all possible functions by choosing a covariance function, or kernel. The kernel describes how similar two outputs (y_i, y_j) are, given the similarity between their input values (x_i, x_j). Kernels encode the functional form of these relationships and are defined by hyperparameters, which are usually initially unknown to the user. To learn the posterior Gaussian process, we condition on data using Bayes' theorem and optimize the hyperparameters of the kernel. We can then predict a mean and covariance at new input points.
A useful resource to learn about Gaussian processes is Rasmussen and Williams (2006).
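Concretely, conditioning on training inputs X and outputs y gives the standard posterior predictive equations at a new point x_* (a sketch following Rasmussen and Williams (2006), with kernel matrix K = k(X, X) and observational noise variance \sigma_n^2):
\bar{f}_* = k(x_*, X) \left[ K + \sigma_n^2 I \right]^{-1} y
\mathrm{Var}(f_*) = k(x_*, x_*) - k(x_*, X) \left[ K + \sigma_n^2 I \right]^{-1} k(X, x_*)
Predicting the data rather than the latent function adds \sigma_n^2 to the predictive variance.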
User Interface
CalibrateEmulateSample.jl allows the Gaussian process emulator to be built using either GaussianProcesses.jl or ScikitLearn.jl. Different packages may be better suited to different settings; we recommend users try both, and check the individual package documentation, to make a choice for their problem setting.
To use GaussianProcesses.jl, define the package type as
gppackage = Emulators.GPJL()
To use ScikitLearn.jl, define the package type as
gppackage = Emulators.SKLJL()
Initialize a basic Gaussian Process with
gauss_proc = GaussianProcess(gppackage)
This initializes the prior Gaussian process. We train the Gaussian process by feeding gauss_proc, alongside the data, into the Emulator struct and optimizing the hyperparameters, as described here.
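A minimal sketch of this step (here input_output_pairs is a hypothetical placeholder for your container of training inputs and outputs; see the Emulator documentation for the full argument list):
emulator = Emulator(gauss_proc, input_output_pairs) # wrap the GP and the training data
optimize_hyperparameters!(emulator) # learn the kernel hyperparameters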
Prediction Type
You can specify the type of prediction when initializing the Gaussian process emulator. The default type of prediction is to predict data, YType(). You can create a latent function type prediction with
gauss_proc = GaussianProcess(
    gppackage;
    prediction_type = FType())
Kernels
The Gaussian process above assumes the default kernel: the Squared Exponential kernel, also called the Radial Basis Function (RBF) kernel. A different kernel can be specified when the Gaussian process is initialized. Read more about kernel options here.
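For reference, the Squared Exponential kernel takes the standard form below, where the signal variance \sigma^2 and the lengthscale \ell are the hyperparameters to be learned:
k(x, x') = \sigma^2 \exp\left( -\frac{\lVert x - x' \rVert^2}{2 \ell^2} \right)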
GPJL
For the GaussianProcesses.jl package, there is a range of kernels to choose from. For example,
using GaussianProcesses
my_kernel = GaussianProcesses.Mat32Iso(0., 0.) # Matern 3/2 kernel; the arguments are the log-lengthscale and log-standard deviation
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
You do not need to provide useful hyperparameter values when you define the kernel, as these are learned in optimize_hyperparameters!(emulator).
You can also combine kernels through linear operations, for example,
using GaussianProcesses
kernel_1 = GaussianProcesses.Mat32Iso(0., 0.) # Matern 3/2 kernel (log-lengthscale 0, log-sd 0)
kernel_2 = GaussianProcesses.Lin(0.) # Linear kernel (log-lengthscale 0)
my_kernel = kernel_1 + kernel_2 # Create a new additive kernel
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
SKLJL
Alternatively, if you are using the ScikitLearn.jl package, you can find the list of kernels here. These need the following preamble:
using PyCall
using ScikitLearn
const pykernels = PyNULL()
# Populate pykernels with the scikit-learn kernel module. Inside a module, __init__
# runs automatically when the module loads; in a plain script, call __init__() yourself.
function __init__()
    copy!(pykernels, pyimport("sklearn.gaussian_process.kernels"))
end
Then they are accessible, for example, as
my_kernel = pykernels.RBF(length_scale = 1)
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
You can also combine multiple ScikitLearn kernels via linear operations in the same way as above.
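For example, a minimal sketch, assuming the preamble above has been run so that pykernels is populated (WhiteKernel and its noise_level argument are scikit-learn names, not part of this package):
my_kernel = pykernels.RBF(length_scale = 1) + pykernels.WhiteKernel(noise_level = 1e-2) # additive scikit-learn kernel
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)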
Learning additional white noise
Often it is useful to learn the discrepancy between the Gaussian process prediction and the data by learning additional white noise. Though one often knows, and provides, the discrepancy between the true model and the data as an observational noise covariance, an additional white kernel can help account for approximation error between the selected Gaussian process kernel and the true model. This is controlled with the Boolean keyword noise_learn when initializing the Gaussian process. The default is true.
gauss_proc = GaussianProcess(
    gppackage;
    noise_learn = true)
When noise_learn is true, an additional white noise kernel is added to the kernel. This white noise is present across all parameter values, including the training data. The scale parameters of the white noise kernel are learned in optimize_hyperparameters!(emulator).
You may not need to learn the noise if you already have a good estimate of the noise from your training data, and if the Gaussian process kernel is well specified. When noise_learn is false, a small additional regularization is added for stability. The default value is 1e-3, but this can be chosen through the optional argument alg_reg_noise:
gauss_proc = GaussianProcess(
    gppackage;
    noise_learn = false,
    alg_reg_noise = 1e-3)
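Putting the pieces together, a minimal end-to-end sketch might look like the following (input_output_pairs and new_inputs are hypothetical placeholders for your training data and prediction points, and the exact form of the predict call should be checked against the Emulator documentation):
using CalibrateEmulateSample.Emulators

gppackage = Emulators.GPJL()
gauss_proc = GaussianProcess(gppackage; noise_learn = true)

# Build and train the emulator on the training data
emulator = Emulator(gauss_proc, input_output_pairs)
optimize_hyperparameters!(emulator)

# Predict a mean and covariance at new input points
gp_mean, gp_cov = Emulators.predict(emulator, new_inputs)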