The Sample stage

The "sample" part of CES refers to exact sampling from the emulated posterior, in our current framework this is achieved with a Markov chain Monte Carlo algorithm (MCMC). Within this paradigm, we want to provide the flexibility to use multiple sampling algorithms; the approach we take is to use the general-purpose AbstractMCMC.jl API, provided by the Turing.jl probabilistic programming framework.

User interface

We briefly outline how one sets up and uses MCMC within the CES package. The user first loads the MCMC module and provides one of the Protocols (i.e., how one wishes to generate sampling proposals):

using CalibrateEmulateSample.MarkovChainMonteCarlo
protocol = RWMHSampling() # Random-Walk algorithm
# protocol = pCNMHSampling() # preconditioned Crank-Nicolson algorithm

Then one builds the MCMC wrapper by providing the standard Bayesian ingredients (prior and data) from the calibrate stage, alongside the trained statistical emulator from the emulate stage:

mcmc = MCMCWrapper(
    protocol,
    truth_sample, 
    prior,
    emulator;
    init_params=mean_u_final,
    burnin=10_000,
)

The keyword argument init_params gives a starting point for the chain (often taken to be the mean of the final iteration of the calibrate stage), and burnin gives the number of initial steps to discard when computing statistics from the samples.
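
For example, if the calibrate stage was run with EnsembleKalmanProcesses.jl and produced an object ekiobj, one way to construct such a starting point is to average the final ensemble; this is only a sketch (get_u_final comes from EnsembleKalmanProcesses, and the details may differ in your setup):

using Statistics
using EnsembleKalmanProcesses # provides get_u_final

# mean over the ensemble members (columns) of the final iteration,
# in the unconstrained (computational) space used by the MCMC
mean_u_final = vec(mean(get_u_final(ekiobj), dims = 2))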

For good efficiency, one often needs to run MCMC with a problem-dependent step size. We provide a simple utility to help choose this: the optimizer runs short chains (of length N) and adjusts the step size until the MCMC acceptance rate falls within an acceptable range, returning this step size.

new_step = optimize_stepsize(
    mcmc;
    init_stepsize = 1,
    N = 2000,
)

To generate $10^5$ samples with a given step size (and optional random number generator rng), one calls

chain = sample(rng, mcmc, 100_000; stepsize = new_step)
display(chain) # gives diagnostics

The return value is stored in an MCMCChains.Chains object. To convert this back into a ParameterDistribution type (which contains, e.g., the transformation maps), one can call

posterior = get_posterior(mcmc, chain)
constrained_posterior = transform_unconstrained_to_constrained(prior, get_distribution(posterior))

One can quickly plot the marginals of the prior and posterior distributions with

using Plots
plot(prior)
plot!(posterior)

or extract statistics of the (unconstrained) distribution with

mean_posterior = mean(posterior)
cov_posterior = cov(posterior)

Further details on the implementation

This page provides a summary of AbstractMCMC which augments the existing documentation ([1], [2]) and highlights how it's used by the CES package in MarkovChainMonteCarlo. It's not meant to be a complete description of the AbstractMCMC package.

Use in MarkovChainMonteCarlo

At present, Turing has limited support for derivative-free optimization, so we only use this abstract API and not Turing itself. We also use two related dependencies, AdvancedMH and MCMCChains.

Julia's philosophy is to use small, composable packages rather than monoliths, but this can make it difficult to remember where methods are defined! Below we describe the relevant parts of

  • The AbstractMCMC API,
  • Extended to the case of Metropolis-Hastings (MH) sampling by AdvancedMH,
  • Further extended for the needs of CES in the MarkovChainMonteCarlo module.

Sampler

A Sampler is AbstractMCMC's term for an implementation of an MCMC sampling algorithm, along with all of its configuration parameters. All Samplers are subtypes of AbstractMCMC's AbstractSampler.

Currently CES only implements the Metropolis-Hastings (MH) algorithm. Because it is so straightforward, much of AbstractMCMC isn't needed. We implement two variants of MH with two different Samplers: RWMetropolisHastings and pCNMetropolisHastings, both of which are subtypes of AdvancedMH.MHSampler. The constructor for both Samplers is MetropolisHastingsSampler; the choice of Sampler is specified by passing an MCMCProtocol object to this constructor.
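
To make the type relationships concrete, here is a hedged sketch; the prior argument to the constructor is an assumption made for illustration (in normal use the MCMCWrapper described below constructs the Sampler for you):

using CalibrateEmulateSample.MarkovChainMonteCarlo
import AdvancedMH

# The MCMCProtocol passed to the constructor selects the concrete Sampler type
# (the `prior` argument is assumed here for illustration):
rw_sampler = MetropolisHastingsSampler(RWMHSampling(), prior)    # an RWMetropolisHastings
pcn_sampler = MetropolisHastingsSampler(pCNMHSampling(), prior)  # a pCNMetropolisHastings
rw_sampler isa AdvancedMH.MHSampler   # true
pcn_sampler isa AdvancedMH.MHSampler  # true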

The MHSampler has only one field, proposal: the distribution used to generate new MH proposals via additive stochastic perturbations to the current parameter values. This is done by AdvancedMH.propose(), which is called at each MCMC step(). The difference between Samplers comes from how the proposal is generated:

  • RWMHSampling does vanilla random-walk proposal generation with a constant, user-specified step size (this differs from the AdvancedMH implementation, which doesn't provide for a step size).

  • pCNMHSampling does preconditioned Crank-Nicolson proposal generation. Vanilla random-walk sampling doesn't have a well-defined limit in high-dimensional parameter spaces; pCN replaces the random walk with an Ornstein–Uhlenbeck (AR(1)) process so that the Metropolis acceptance probability remains non-zero in this limit. See Beskos et al. (2008) and Cotter et al. (2013).

Generated proposals are then either accepted or rejected according to the same MH criterion (in step(), below).
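
As a schematic illustration (a sketch assuming a zero-mean Gaussian prior $N(0, C)$ in the unconstrained space, with step size $\beta$): the random-walk proposal is $u^\ast = u + \beta \xi$ with $\xi \sim N(0, C)$, while the pCN proposal is $u^\ast = \sqrt{1 - \beta^2}\, u + \beta \xi$ with the same $\xi$. Because the pCN proposal is reversible with respect to the Gaussian prior, the prior terms cancel in the MH acceptance probability, which then involves only the (emulated) likelihood; this is what keeps the acceptance rate bounded away from zero as the dimension grows.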

Models

In Turing, the Model is the distribution one performs inference on, which may involve observed and hidden variables and parameters. For CES, we simply want to sample from the posterior, so our Model distribution is the emulated likelihood (see Emulators) together with the prior. This is constructed by EmulatorPosteriorModel.
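
To illustrate what this amounts to, here is a self-contained toy example that uses AdvancedMH directly; the Gaussian surrogate_loglik is a stand-in for the emulated log-likelihood (the names surrogate_loglik and logprior are placeholders, whereas EmulatorPosteriorModel builds the analogous density from the actual emulator and prior):

using AdvancedMH, Distributions, MCMCChains, LinearAlgebra

# Toy stand-ins: a Gaussian "surrogate" log-likelihood and a Gaussian prior,
# both defined in the unconstrained space.
surrogate_loglik(u) = logpdf(MvNormal([1.0, -1.0], Matrix(0.5I, 2, 2)), u)
logprior(u) = logpdf(MvNormal(zeros(2), Matrix(1.0I, 2, 2)), u)

# The sampled Model is just the (unnormalized) log-posterior: log-likelihood + log-prior.
model = DensityModel(u -> surrogate_loglik(u) + logprior(u))

# A random-walk MH sampler with a fixed proposal covariance, run via AbstractMCMC's sample.
sampler = RWMH(MvNormal(zeros(2), Matrix(0.1I, 2, 2)))
chain = sample(model, sampler, 5_000; chain_type = Chains, param_names = ["u1", "u2"])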

Sampling with the MCMC Wrapper object

At a high level, a Sampler and a Model are all that's needed to do MCMC sampling. This is done by the sample method provided by AbstractMCMC (which extends the method from StatsBase).

To be more user-friendly, in CES we wrap the Sampler, Model, and other necessary configuration into an MCMCWrapper object. The constructor for this object ensures that all its components are created consistently and performs necessary bookkeeping, such as converting coordinates to the decorrelated basis. We extend sample with methods that use this object (they simply unpack its fields and call the appropriate method from AbstractMCMC).

Chain

The MCMCChains package provides the Chains container to store the results of the MCMC sampling; it also provides methods for quick diagnostics and plotting of Chains objects. For example,

using MCMCChains
using StatsPlots

# ... from our MCMC example above ...
# chain = sample(rng, mcmc, 100_000; stepsize = new_step)

display(chain) # diagnostics
plot(chain) # plots samples over iteration and PDFs for each parameter

Internals: Transitions

Implementing MCMC involves defining states and transitions of a Markov process (whose stationary distribution is what we seek to sample from). AbstractMCMC's terminology is a bit confusing for the MH case; states of the chain are described by Transition objects, which contain the current sample (and other information like its log-probability).

AdvancedMH defines an AbstractTransition abstract type for use with its methods; we implement our own subtype, MCMCState, in order to record statistics on the MH acceptance ratio.
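
For orientation, the following is a purely hypothetical sketch of the kind of information such a state type carries; the field names are illustrative and not the actual MCMCState definition:

# Illustrative only: an MH chain state that also tracks acceptance
struct ToyMHState{T <: Real}
    params::Vector{T}   # the current sample, in the unconstrained space
    log_density::T      # cached log-probability of the current sample
    accepted::Bool      # whether the most recent proposal was accepted
end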

Internals: Markov steps

Markov transitions of the chain are defined by overloading AbstractMCMC's step method, which takes the Sampler and the current Transition, applies the Sampler's logic, and returns an updated Transition representing the chain's new state (actually, a pair of Transitions, for cases where the Sampler doesn't obey detailed balance; this isn't relevant for us).

For example, in Metropolis-Hastings sampling this is where we draw a proposal sample and accept or reject it according to the MH criterion. AdvancedMH implements this method; we re-implement it because 1) we need to record whether a proposal was accepted or rejected, and 2) our calls to propose() are step-size dependent.
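
As a reference point, the core accept/reject logic of a random-walk MH step looks like the following minimal sketch (illustrative only, not the package's step implementation):

using Random

# With a symmetric proposal, accept u_prop with probability min(1, exp(ℓ(u_prop) - ℓ(u))).
function mh_step(rng::AbstractRNG, logdensity, u::AbstractVector, stepsize::Real)
    u_prop = u .+ stepsize .* randn(rng, length(u))  # random-walk proposal
    log_alpha = logdensity(u_prop) - logdensity(u)   # log acceptance ratio (symmetric proposal)
    accepted = log(rand(rng)) < log_alpha
    return accepted ? (u_prop, true) : (u, false)    # new state, plus whether we accepted
end

# Usage: one step of a chain targeting a standard normal density
rng = Random.MersenneTwister(42)
u_new, accepted = mh_step(rng, u -> -0.5 * sum(abs2, u), [0.0, 0.0], 0.5)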