Observations

The Observations object facilitates convenient storing, grouping and minibatching over observations.

The key objects

  1. The Observation is a container for observed variables ("samples"), their noise covariances ("covariances"), and names ("names"). Observations are easily stackable, which helps build larger heterogeneous observations.
  2. The Minibatcher facilitates data streaming (minibatching): a user can submit a large group of observations, which are then batched up and looped over in epochs.
  3. The ObservationSeries contains the list of Observations and the Minibatcher, along with utilities to get the current batch, etc.

I usually just pass in a vector of data and a covariance to EKP

Users can indeed set up an experiment with just one data sample and a covariance matrix for the noise. However, internally these are still stored as an ObservationSeries with a special minibatcher that does nothing (created by no_minibatcher(size)).

How should I provide the noise covariance?

We provide utilities and an API for supplying the covariance in forms other than an AbstractMatrix. For example, in high-dimensional problems one may wish to provide compact low-rank representations. See the section below for more details.

Here the user has data for two independent variables: the five-dimensional y and the eight-dimensional z. The observational noise of y is uncorrelated in all components, while the observational noise of z has a known correlation between components.

We recommend users build an Observation using the Dict constructor and make use of the combine_observations() utility.

using EnsembleKalmanProcesses # for `Observation`
using LinearAlgebra # for `I`, `Tridiagonal`


# specify an observation of y with diagonal noise covariance
ydim = 5
y = ones(ydim)
cov_y = 0.01*I

# specify an observation of z with tridiagonal noise covariance
zdim = 8
z = zeros(zdim)
cov_z = Tridiagonal(0.1*ones(zdim-1), ones(zdim), 0.1*ones(zdim-1))

y_obs = Observation(
    Dict(
        "samples" => y,
        "covariances" => cov_y,
        "names" => "y",
    ),
)

z_obs = Observation(
    Dict(
        "samples" => z,
        "covariances" => cov_z,
        "names" => "z",
    ),
)

full_obs = combine_observations([y_obs,z_obs]) # conveniently stacks the observations
Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y", "z"], UnitRange{Int64}[1:5, 6:13])
# getting information out
get_obs(full_obs) # returns [y,z]
13-element Vector{Float64}:
 1.0
 1.0
 1.0
 1.0
 1.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
get_obs_noise_cov(full_obs) # returns block-diagonal matrix with blocks [cov_y 0; 0 cov_z]
13×13 Matrix{Float64}:
 0.01  0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.01  0.0   0.0   0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.01  0.0   0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.01  0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.01  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   1.0  0.1  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.1  1.0  0.1  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.1  1.0  0.1  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0  0.1  1.0  0.1  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.1  1.0  0.1  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.0  0.1  1.0  0.1  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.0  0.0  0.1  1.0  0.1
 0.0   0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.0  0.0  0.0  0.1  1.0

Getters of the form get_* can be used for all internals:

get_names(full_obs) # returns ["y", "z"]
2-element Vector{String}:
 "y"
 "z"

Some other fields are also stored, such as the indices of the y and z components:

get_indices(full_obs) # returns [1:ydim, ydim+1:ydim+zdim]
2-element Vector{UnitRange{Int64}}:
 1:5
 6:13
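
To make the stacking convention concrete, here is a minimal sketch that rebuilds the same stacked sample, index ranges and block-diagonal covariance by hand, using only LinearAlgebra. It is an illustration of the layout shown in the outputs above, not the package's internal implementation; the names stacked_sample and stacked_cov are chosen here for illustration.

```julia
using LinearAlgebra # for `I`, `Tridiagonal`, `issymmetric`

ydim, zdim = 5, 8
y = ones(ydim)
z = zeros(zdim)

# dense copies of the two noise covariances used in the example above
cov_y = Matrix(0.01 * I, ydim, ydim)
cov_z = Matrix(Tridiagonal(0.1 * ones(zdim - 1), ones(zdim), 0.1 * ones(zdim - 1)))

# index range occupied by each variable inside the stacked vector
indices = [1:ydim, (ydim + 1):(ydim + zdim)]

# stacked sample: y followed by z
stacked_sample = vcat(y, z)

# block-diagonal covariance: each block sits on its own index range,
# and the off-diagonal blocks stay zero because y and z are independent
stacked_cov = zeros(ydim + zdim, ydim + zdim)
stacked_cov[indices[1], indices[1]] = cov_y
stacked_cov[indices[2], indices[2]] = cov_z
```

The zero off-diagonal blocks encode the assumption that y and z are independent; only correlations within z appear off the diagonal.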

Imagine the user has 100 independent data samples for the two independent variables above. Rather than stacking all the data together at once (forming a full system of size 100*(8+5) to update at each step), the user wishes to stream the data and perform updates with random batches of 5 observations at each iteration.
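
One possible way to build such a collection of 100 observations is sketched below, reusing the Observation Dict constructor and combine_observations from the example above. The sampling scheme (Gaussian noise drawn with the stated covariances), the seed, and the names "y_i", "z_i" are assumptions made for illustration; real data would replace the random draws.

```julia
using EnsembleKalmanProcesses # for `Observation`, `combine_observations`
using LinearAlgebra # for `I`, `Tridiagonal`, `Symmetric`, `cholesky`
using Random # for `MersenneTwister`

rng = MersenneTwister(1234) # hypothetical seed, only for reproducibility

ydim, zdim = 5, 8
cov_y = 0.01 * I
cov_z = Tridiagonal(0.1 * ones(zdim - 1), ones(zdim), 0.1 * ones(zdim - 1))

# lower Cholesky factor of cov_z, used to draw correlated noise for z
sqrt_cov_z = cholesky(Symmetric(Matrix(cov_z))).L

hundred_full_obs = map(1:100) do i
    y_i = ones(ydim) .+ 0.1 .* randn(rng, ydim) # std 0.1 matches cov_y = 0.01*I
    z_i = sqrt_cov_z * randn(rng, zdim)         # zero mean, covariance cov_z
    y_obs = Observation(Dict("samples" => y_i, "covariances" => cov_y, "names" => "y_$i"))
    z_obs = Observation(Dict("samples" => z_i, "covariances" => cov_z, "names" => "z_$i"))
    combine_observations([y_obs, z_obs]) # one stacked 13-dimensional observation
end
```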

Why would I choose to minibatch?

The memory and time costs of many EKP methods scale worse than linearly in the observation dimension, so there is often a large computational benefit to minibatching EKP updates. This benefit must be weighed against the cost of the additional forward map evaluations needed to minibatch over one full epoch.
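
As a rough back-of-the-envelope illustration of this trade-off for the 100-sample example, the arithmetic below compares one full-size update with one epoch of minibatched updates. The cubic cost model is purely an assumption chosen for illustration; the actual exponent depends on the EKP method used.

```julia
n_samples = 100   # independent data samples
obs_dim = 5 + 8   # stacked dimension of one (y, z) observation
batch_size = 5

full_dim = n_samples * obs_dim             # 1300: everything stacked at once
batch_dim = batch_size * obs_dim           # 65: one minibatch update
updates_per_epoch = n_samples ÷ batch_size # 20 updates to see all the data once

# ASSUMPTION: the cost of one update grows like dim^3 (illustrative only)
cost(dim) = dim^3

# ratio of one full update to a whole epoch of minibatch updates
speedup = cost(full_dim) / (updates_per_epoch * cost(batch_dim)) # 400.0
```

Under this assumed model the update arithmetic is 400 times cheaper per epoch, but the epoch needs 20 rounds of forward map evaluations instead of one, which can dominate when the forward map is expensive.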

# given a vector of 100 `Observation`s called hundred_full_obs,
using EnsembleKalmanProcesses # for `RandomFixedSizeMinibatcher`, `ObservationSeries`, `Minibatcher`

minibatcher = RandomFixedSizeMinibatcher(5) # batches the epoch of size 100, into batches of size 5

observation_series = ObservationSeries(
    Dict(
        "observations" => hundred_full_obs,
        "minibatcher" => minibatcher,
        "metadata" => "optional metadata information in any format",
    ),
)
ObservationSeries{Vector{Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}}, RandomFixedSizeMinibatcher{String, Random.TaskLocalRNG, Vector{Vector{Int64}}}, Vector{String}, Vector{Vector{Vector{Int64}}}, String}(Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}[Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9572163878015774, 1.0826060541709634, 1.0494025339721254, 0.8014344526967311, 1.0274429761889114], [0.34754242408430114, 1.3669533875843585, 1.0572201000586834, 0.315948804122402, 0.5358624725721696, -0.1563445377585743, -0.5916721882952035, 1.6271884321444134]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_1", "z_1"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9368044137973714, 0.8413328647194948, 0.9781984074539763, 1.1060755304009238, 1.0286166056741242], [0.32300267859267684, 1.5962611163823932, 0.71531197433613, -1.171667930118803, -0.8771120182849365, -0.8943621154728998, 0.538123335348984, -0.8550511724102517]], 
AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_2", "z_2"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0021931868237497, 1.2597660545229354, 1.0868826022193845, 1.074618001417115, 0.9524233052781181], [1.233895658972388, -2.1571195406097714, 0.2538155093384191, 0.5901392178075241, -0.750476746567803, -1.9397009227647204, -0.7323521326226214, -1.6425238667443223]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_3", "z_3"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.130857721331914, 
0.8580874132437099, 1.0984868421372889, 1.0238879201492275, 0.9289348083109492], [0.8315360116923314, -0.666201583459936, 0.31216309194213265, 1.6332527756516897, -0.1233097345911878, -0.9593665682938097, 1.5210572327989975, -0.45034596496372636]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_4", "z_4"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0881724170473819, 0.7895397805194219, 1.075160411078153, 1.0321874576160535, 1.0856244226179066], [1.8878136186876435, 0.6887323916172075, -0.7381811567741609, -1.2709947222110072, 1.0899146764545686, 3.1129833524846005, -0.1215419050799873, 0.12513937571220535]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … 
-0.10205144336437919 1.010205144336438]], ["y_5", "z_5"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0967695343350652, 1.1078236957350243, 1.0682781908528611, 0.7747080818078257, 0.8809861395928562], [-0.398114145777934, 1.4765476164286089, -0.12267690698166128, -1.3450130321297724, 0.622471608496949, -0.8300933272512985, 0.8326352538795679, 0.27073452051891217]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_6", "z_6"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.7531994931289526, 1.2491640582840784, 1.0381670663431029, 1.0024347862581362, 1.0787777385552775], [-0.567970436331456, -0.7160987047983811, -0.35761585245947314, -0.7783804527495005, -0.8420505945433199, 0.5035664072961192, 1.0133900406287275, -0.24405729469195103]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 
-1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_7", "z_7"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0409756793752012, 0.9399205488297026, 0.9043723869283378, 1.1046506294238354, 0.781379919875765], [-0.46061562722618865, -0.06331915362556345, 0.759196624933249, 2.4913613980434617, 0.5697161729358413, 1.5992400327350293, 0.35602644652787685, -0.15141559308149377]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_8", "z_8"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9025099446158029, 1.0892966779801907, 0.8882937381033444, 0.9574733194180067, 1.0248578774139445], [1.3908107752300822, -0.12067338374990169, 1.2593275053078663, 0.5834714826813894, 1.1027008574729427, -0.06737107628963644, 0.19390427489966738, -1.4129941999822362]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 
0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_9", "z_9"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9859986074471804, 0.8553419371815361, 1.0746391970009663, 0.9643030542224047, 0.9193687827337313], [0.3235241372020836, 0.7528626087681446, 1.1334406272962998, -0.30128836984541, 0.8121145056048603, 0.4508815259336485, -1.0836295531498443, -1.4066107621971156]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_10", "z_10"], UnitRange{Int64}[1:5, 6:13])  …  Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.8040095534562728, 0.9746038658113724, 0.8514985044936985, 1.0630703386044618, 0.9872413231121026], [-0.8284103026252243, 0.6828359402192604, 
-2.2569988528104297, -0.9149446881047588, -0.451877197585612, -0.37842253658503416, -0.7054523013350741, 0.027534404054002634]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_91", "z_91"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.958446519780906, 0.9468002552998726, 1.1113861073039302, 1.0677492068113268, 0.8773100774369713], [0.5075781590036329, -0.7916735372771221, 1.0319154542194038, 1.8211232725285496, -0.08922007780470109, -0.4312850958299375, -1.4562086080590868, -0.24697676720827222]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_92", "z_92"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, 
Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9831064098379927, 0.9624917250746535, 1.0116933070267957, 0.9381569599970345, 1.1707944808798896], [0.502855992128088, 0.1162604789315533, -0.32203179525019615, 0.22142248020300215, 0.17299886671330103, 0.5946668538272681, 0.8323134131917502, 0.448034833118045]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_93", "z_93"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0024805120380706, 1.0288319330960114, 0.8587185774920694, 1.018346704835408, 1.1220030634787803], [0.16967989467650474, -0.5178634384043193, 0.9170787906536308, 0.5643851729012693, 0.16284850867486497, -0.06200752508668591, 0.6224955159020079, -0.015602041559657012]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 
1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_94", "z_94"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9508093351154179, 0.9996384834604001, 1.0389616719503503, 0.9991913750679003, 1.1137405923829375], [-0.286421976780856, 0.5475551471055643, 1.8662363003736775, -0.6020388280225553, -1.3178215376291684, -1.7868870807138826, -0.16283493993277004, 1.3777247073123338]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_95", "z_95"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[1.0751125013469705, 1.0053493255492978, 0.9562242463844997, 0.9963588101093321, 1.0340435162105903], [0.9768411273553772, 0.45380705709411795, -2.1742852509248074, -1.7268782525740607, -1.5027713562307037, -0.36490659360377337, -0.41129860015120323, -0.5212412653239702]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 
0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_96", "z_96"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.8352740016937954, 1.1139446936004458, 1.1400905236038912, 0.956894164259069, 0.9536121673176932], [-2.462348128089621, -1.1725527438685086, 0.14278448611138095, -0.6439333502006401, -0.9502058281978379, -1.4203673168380944, 0.9233169387256974, -0.24114350560148398]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_97", "z_97"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.9918168531858406, 1.0034402632652009, 1.0370674371155753, 0.9753257976063444, 1.1088545998785602], [-0.3382270150678709, -1.3743580106659032, -1.896506757596769, -0.2577282039632728, -0.981061020988935, -1.220685146910769, -2.3049181469002766, 
-2.1043950602955626]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_98", "z_98"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, Vector{UnitRange{Int64}}}([[0.8583909244230467, 0.9107249884052598, 0.9066624750469077, 0.7842252604259856, 0.9769847888817218], [0.04999635334846051, -0.015318688306279454, 0.6559250125184406, -0.8262564846340366, -0.16125240407643338, -0.3018273756872959, -0.12189767233846083, 1.5383396771206774]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_99", "z_99"], UnitRange{Int64}[1:5, 6:13]), Observation{Vector{Vector{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{AbstractMatrix{Float64}}, Vector{String}, 
Vector{UnitRange{Int64}}}([[1.1586884599775251, 0.9668871552058477, 1.1283936791489186, 1.1207367135019248, 0.8677203822044445], [-0.971762127418628, -0.6970771309401108, 1.1224705718118593, 0.02373322333366057, 1.2492819769168302, 0.31074805825995444, 0.35197562206343985, -1.2677416620627595]], AbstractMatrix{Float64}[[0.01 0.0 … 0.0 0.0; 0.0 0.01 … 0.0 0.0; … ; 0.0 0.0 … 0.01 0.0; 0.0 0.0 … 0.0 0.01], [1.0 0.1 … 0.0 0.0; 0.1 1.0 … 0.0 0.0; … ; 0.0 0.0 … 1.0 0.1; 0.0 0.0 … 0.1 1.0]], AbstractMatrix{Float64}[[100.0 0.0 … 0.0 0.0; 0.0 100.0 … 0.0 0.0; … ; 0.0 0.0 … 100.0 0.0; 0.0 0.0 … 0.0 100.0], [1.010205144336438 -0.1020514433643792 … 1.0735488188434786e-6 -1.0735488188434786e-7; -0.10205144336437919 1.020514433643792 … -1.0735488188434786e-5 1.0735488188434786e-6; … ; 1.0735488188434788e-6 -1.0735488188434786e-5 … 1.020514433643792 -0.10205144336437919; -1.073548818843479e-7 1.0735488188434788e-6 … -0.10205144336437919 1.010205144336438]], ["y_100", "z_100"], UnitRange{Int64}[1:5, 6:13])], RandomFixedSizeMinibatcher{String, Random.TaskLocalRNG, Vector{Vector{Int64}}}(5, "extend", Random.TaskLocalRNG(), [[17, 15, 49, 31, 63], [60, 2, 81, 92, 75], [58, 14, 95, 66, 76], [38, 79, 8, 48, 53], [36, 27, 94, 23, 25], [91, 70, 33, 52, 16], [98, 85, 6, 61, 86], [20, 82, 26, 68, 7], [44, 90, 69, 24, 74], [97, 59, 65, 34, 50], [45, 57, 37, 78, 22], [11, 77, 13, 56, 12], [10, 99, 1, 71, 32], [72, 51, 100, 41, 67], [87, 39, 43, 89, 83], [80, 62, 40, 29, 3], [96, 35, 46, 88, 19], [64, 30, 9, 84, 54], [5, 47, 4, 55, 93], [73, 18, 21, 28, 42]]), ["series_1", "series_2", "series_3", "series_4", "series_5", "series_6", "series_7", "series_8", "series_9", "series_10"  …  "series_91", "series_92", "series_93", "series_94", "series_95", "series_96", "series_97", "series_98", "series_99", "series_100"], Dict("minibatch" => 1, "epoch" => 1), [[[17, 15, 49, 31, 63], [60, 2, 81, 92, 75], [58, 14, 95, 66, 76], [38, 79, 8, 48, 53], [36, 27, 94, 23, 25], [91, 70, 33, 52, 16], [98, 85, 6, 
61, 86], [20, 82, 26, 68, 7], [44, 90, 69, 24, 74], [97, 59, 65, 34, 50], [45, 57, 37, 78, 22], [11, 77, 13, 56, 12], [10, 99, 1, 71, 32], [72, 51, 100, 41, 67], [87, 39, 43, 89, 83], [80, 62, 40, 29, 3], [96, 35, 46, 88, 19], [64, 30, 9, 84, 54], [5, 47, 4, 55, 93], [73, 18, 21, 28, 42]]], "optional metadata information in any format")
get_metadata(observation_series) # returns metadata
"optional metadata information in any format"
# some example methods to get information out at the current minibatch:
get_current_minibatch(observation_series) # returns [i₁, ..., i₅],  the current minibatch subset of indices 1:100
5-element Vector{Int64}:
 17
 15
 49
 31
 63
get_obs(observation_series) # returns [yi₁, zi₁, ..., yi₅, zi₅], the data sample for the current minibatch
65-element Vector{Float64}:
  0.8524719438380768
  1.048986861619304
  0.8767613802551497
  0.9772820587544845
  0.9522038288004747
  0.05304029399254148
  0.32779908598703433
 -0.6571334938344605
  0.18039505725520913
 -0.529277113038114
  ⋮
  0.9438007768801395
 -0.156001598793755
 -0.4316765200776378
  0.4764747726938942
 -0.08938293967076946
  1.4871130585661827
 -0.7119529273481919
 -0.4655632462122585
 -0.1900163263328813
get_obs_noise_cov(observation_series) # returns block-diagonal matrix with blocks [cov_yi₁  0 ... 0 ; 0 cov_zi₁ 0 ... 0; ... ; 0 ... 0 cov_yi₅ 0; 0 ... 0 cov_zi₅]
65×65 Matrix{Float64}:
 0.01  0.0   0.0   0.0   0.0   0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.01  0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.01  0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.01  0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.01  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   1.0  0.1  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.1  1.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.1     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮                             ⋮         ⋱            ⋮                   
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.1  0.0  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     1.0  0.1  0.0  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.1  1.0  0.1  0.0  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0  …  0.0  0.1  1.0  0.1  0.0  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.1  1.0  0.1  0.0  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.1  1.0  0.1  0.0
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.1  1.0  0.1
 0.0   0.0   0.0   0.0   0.0   0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.1  1.0
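
The block-diagonal layout printed above can be reproduced in plain Julia. The following is a minimal sketch using only the LinearAlgebra standard library; it does not call the package, and the variable names (`blocks`, `full_cov`) are chosen here for illustration. It assumes a minibatch of 5 observations, each stacking the 5-dimensional y block and the 8-dimensional z block.

```julia
using LinearAlgebra

# Dense versions of the two noise covariances used in the example
cov_y = Matrix(0.01I, 5, 5)
cov_z = Matrix(Tridiagonal(0.1 * ones(7), ones(8), 0.1 * ones(7)))

# One (cov_y, cov_z) pair per observation in the minibatch of size 5
blocks = [M for _ in 1:5 for M in (cov_y, cov_z)]

# Concatenating along both dimensions places the blocks on the diagonal
full_cov = cat(blocks...; dims = (1, 2))

size(full_cov) # (65, 65)
```

Each observation contributes 5 + 8 = 13 rows, so five observations give the 65×65 matrix shown.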

Minibatches are updated internally during the update_ensemble!(ekp,...) step via a call to

update_minibatch!(observation_series)
get_current_minibatch(observation_series)
5-element Vector{Int64}:
 60
  2
 81
 92
 75

Minibatchers

Minibatchers are modular and must be derived from the Minibatcher abstract type. Each provides a method create_new_epoch!(minibatcher,args...;kwargs) that creates a sampling of an epoch of data. For example, if we have 100 data observations, the epoch is 1:100, and one possible minibatching is a random partition of 1:100 into batches of a fixed size (e.g., 5), giving 20 minibatches.
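
The random partitioning described above can be sketched in a few lines of plain Julia. This is only an illustration of the idea, using the Random standard library; it is not the package's implementation of create_new_epoch!, and the seed and variable names are arbitrary choices made here.

```julia
using Random

n_obs = 100
batch_size = 5
rng = MersenneTwister(1234) # arbitrary seed, for reproducibility only

# Shuffle the epoch 1:100, then cut it into consecutive chunks of 5 indices
epoch = shuffle(rng, 1:n_obs)
minibatches = [collect(b) for b in Iterators.partition(epoch, batch_size)]

length(minibatches) # 20
```

Every index in 1:100 appears in exactly one minibatch, so looping over all 20 minibatches visits the full epoch once.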

Some of the implemented Minibatchers

  • FixedMinibatcher(given_batches, "order") (default method = "order"): minibatches are fixed and run through in order for all epochs.
  • FixedMinibatcher(given_batches, "random"): minibatches are fixed, but are run through in a randomly chosen order in each epoch.
  • no_minibatcher(size): creates a FixedMinibatcher with just one batch, which is the whole epoch (effectively no minibatching).
  • RandomFixedSizeMinibatcher(minibatch_size, "trim") (default method = "trim"): creates minibatches of size minibatch_size by randomly sampling the epoch. If the minibatch size does not divide the number of samples, the remainder is ignored (thus preserving a constant batch size).
  • RandomFixedSizeMinibatcher(minibatch_size, "extend"): creates minibatches of size minibatch_size by randomly sampling the epoch. If the minibatch size does not divide the number of samples, the remainder is included in the final batch (thus covering the entirety of the data, with a larger final batch).
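
To make the difference between "trim" and "extend" concrete, here is a small stand-alone sketch in plain Julia. The helper `random_batches` is hypothetical and written only for this illustration; it mimics the behaviour described in the list above and is not the package's RandomFixedSizeMinibatcher. It uses 103 samples with a batch size of 5, so there is a remainder of 3.

```julia
using Random

# Hypothetical helper (not part of the package): random batches of a fixed size.
# Any method other than "extend" is treated as "trim" in this sketch.
function random_batches(n_obs::Int, batch_size::Int; method::String = "trim", rng = Random.default_rng())
    idx = shuffle(rng, 1:n_obs)
    n_full = n_obs ÷ batch_size # number of complete batches
    batches = [idx[((i - 1) * batch_size + 1):(i * batch_size)] for i in 1:n_full]
    if method == "extend" && n_full > 0
        # put the leftover indices into the final batch
        append!(batches[end], idx[(n_full * batch_size + 1):end])
    end
    return batches
end

trimmed = random_batches(103, 5; method = "trim")     # 20 batches of 5, 3 samples unused
extended = random_batches(103, 5; method = "extend")  # 20 batches, the last has 8 samples
```

With "trim" every batch has the same size but three samples are skipped in that epoch; with "extend" all 103 samples are used at the cost of one larger batch.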

Identifiers

One can additionally provide a vector of names to name each Observation in the ObservationSeries by using the Dict entry "names" => names.

To think about the differences between the identifiers for Observation and ObservationSeries, consider an application that observes the average state of a dynamical system over 100 time windows. The time windows will be batched over during the calibration.

The compiled information is given in the object:

yz_observation_series::ObservationSeries

As this contains many time windows, setting the names of the ObservationSeries objects to index the time window is a sensible identifier, for example,

get_names(yz_observation_series)
> ["window_1", "window_2", ..., "window_100"]

The individual Observations should refer only to the state being measured, so suitable identifiers might be, for example,

obs = get_observations(yz_observation_series)[1] # get first observation in the series
get_names(obs)
> ["y_window_average", "z_window_average"]

Building the noise covariances

For most low-dimensional problems (e.g. dim < 5000), the user can simply provide a UniformScaling or an AbstractMatrix, whichever they are most familiar with, in conjunction with any EKP process.

For high-dimensional problems, the user must first select an output-scalable process, for example TransformInversion(...) (ETKI) or TransformUnscented(...) (UTKI).

Next, the user must select a scalable storage format for the noise covariance matrix in the observations, as the cost of storing and updating very large (non-diagonal) AbstractMatrix objects is prohibitive.

Therefore, in high dimensions we recommend the following scalable options:

  • For a diagonal covariance: Diagonal or UniformScaling
  • For a low-rank covariance: SVD
  • For a sum of low-rank and diagonal covariance: SVDplusD

The framework is extensible to new types as they arise, so long as one can define an efficient implementation of left multiplication and inverse for the struct.
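
As an illustration of what "efficient left multiplication and inverse" can mean for a low-rank plus diagonal covariance, the sketch below applies the Woodbury identity in plain Julia. The type `LowRankPlusDiag` and the functions `apply` and `apply_inverse` are hypothetical names introduced here; they are not the package's SVDplusD or its interface. The sketch assumes the covariance has the form Γ = D + U * Diagonal(s) * U' with positive entries in `s` and `d`.

```julia
using LinearAlgebra

# Hypothetical type for illustration only (not the package's SVDplusD):
# represents Γ = D + U * Diagonal(s) * U' without ever forming the n×n matrix.
struct LowRankPlusDiag
    U::Matrix{Float64} # n × r low-rank factor
    s::Vector{Float64} # r positive weights
    d::Vector{Float64} # n positive diagonal entries of D
end

# Γ * x using only O(n r) operations
apply(Γ::LowRankPlusDiag, x::AbstractVector) = Γ.d .* x .+ Γ.U * (Γ.s .* (Γ.U' * x))

# Γ⁻¹ * x via the Woodbury identity; only an r×r system is solved
function apply_inverse(Γ::LowRankPlusDiag, x::AbstractVector)
    Dinv_x = x ./ Γ.d
    Dinv_U = Γ.U ./ Γ.d                        # scales row i of U by 1/d[i]
    small = Diagonal(1 ./ Γ.s) + Γ.U' * Dinv_U # r × r matrix
    return Dinv_x .- Dinv_U * (small \ (Γ.U' * Dinv_x))
end

# Small check against the dense matrix
n, r = 50, 3
Γ = LowRankPlusDiag(randn(n, r), rand(r) .+ 0.5, rand(n) .+ 0.5)
dense = Diagonal(Γ.d) + Γ.U * Diagonal(Γ.s) * Γ.U'
x = randn(n)
apply(Γ, x) ≈ dense * x          # true
apply_inverse(Γ, x) ≈ dense \ x  # true
```

Both operations cost a number of operations linear in the output dimension n (for fixed rank r), which is the property that keeps such a representation usable when n is very large.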

Example of building scalable high-dimensional ObservationSeries

The following example demonstrates our utility tsvd_cov_from_samples, which quickly builds such forms from available samples.

Imagine a problem where the observation dimension is $10^6$, and we have 30 noisy samples of such data from repeated runs of an experiment. We also believe that there may be an additional 5% noise from model error when fitting our parameters to the data, as our model is not perfect. Let's build some observations!

using EnsembleKalmanProcesses 
using LinearAlgebra
using Statistics

# "data"
n_trials = 30
output_dim = 1_000_000
Y = randn(output_dim, n_trials);

# the noise estimated from the samples (will have rank n_trials-1)
internal_cov = tsvd_cov_from_samples(Y); # SVD object

# the "5%" model error (diagonal)
model_error_frac = 0.05
data_mean = vec(mean(Y,dims=2));
model_error_cov = Diagonal((model_error_frac*data_mean).^2);

# Combine these
covariance = SVDplusD(internal_cov, model_error_cov);

Y_obs_vec = [];
for k = 1:n_trials
    push!(Y_obs_vec, Observation(
            Dict(
                "samples" => Y[:,k],
                "covariances" => covariance,
                "names" => "experiment_$k",
            ),
        ),  
    )
end 
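
The rank of n_trials-1 noted in the comment above follows from centering the samples. The following small-scale sketch, in plain Julia with only standard libraries, builds a low-rank factorization of the sample covariance from a thin SVD of the centered data and compares it with the dense sample covariance. It illustrates the idea only and makes no claim about how tsvd_cov_from_samples is implemented; the dimensions are reduced so the dense matrix can be formed, and names such as `Yc` and `λ` are chosen here.

```julia
using LinearAlgebra
using Statistics

n_trials = 6
output_dim = 40
Y = randn(output_dim, n_trials)

# Center each row; the centered columns sum to zero, so one direction is lost
Yc = Y .- mean(Y, dims = 2)

F = svd(Yc)    # thin SVD: F.U is output_dim × n_trials
r = n_trials - 1
U = F.U[:, 1:r]                     # leading left singular vectors
λ = F.S[1:r] .^ 2 ./ (n_trials - 1) # corresponding covariance eigenvalues

# Agrees with the dense sample covariance, but stores only U and λ
U * Diagonal(λ) * U' ≈ cov(Y, dims = 2) # true
```

Storing `U` and `λ` requires output_dim × (n_trials - 1) numbers instead of output_dim², which is what makes the approach feasible at output_dim = 1_000_000.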

Let's assume that we then wish to update this with specific batches of size 2, in order. Let's build an ObservationSeries!

b_size = 2;
# n_trials ÷ b_size = 15 batches, so the indices stay within 1:n_trials
given_batches = [collect(((i - 1) * b_size + 1):(i * b_size)) for i in 1:(n_trials ÷ b_size)];

minibatcher = FixedMinibatcher(given_batches);
observation_series = ObservationSeries(Y_obs_vec, minibatcher);

This can be passed into a scalable EKI (with some prior):

using EnsembleKalmanProcesses.ParameterDistributions
prior = constrained_gaussian("example_params", 3, 2, 0, Inf, repeats=3);

utki = EnsembleKalmanProcess(observation_series, TransformUnscented(prior));
Warning

Always extract the current observational noise from ekp with get_obs_noise_cov(ekp, build=false). Using build=true (the default) will cause memory issues in this case, as it will try to build the full matrix from its compact representation.