CalibrationTools

This module contains utilities to help with setting up calibration experiments of the coupled model. These utilities are meant to be broadly useful and are not specific to one particular calibration experiment.

Other helpful resources

For calibration, other resources that you may find helpful are the documentation for EnsembleKalmanProcesses, ClimaCalibrate, ClimaAnalysis, and the ClimaCoupler calibration experiments.

Data loading

OutputVar

This section assumes you are familiar with ClimaAnalysis.OutputVar.

Currently, CalibrationTools provides a single data loader, ERA5DataLoader, for loading preprocessed ERA5 data. It automatically applies preprocessing so that the data is convenient to use for calibration. See the documentation for ERA5DataLoader for details on the preprocessing steps applied.

You can retrieve a variable with get and get a set of all available preprocessed variables with available_vars.

julia> import ClimaCoupler: CalibrationTools

julia> data_loader = CalibrationTools.ERA5DataLoader()
ERA5DataLoader: hfls, hfss, rlus, rsus

julia> CalibrationTools.available_vars(data_loader)
Set{String} with 4 elements:
  "hfls"
  "rlus"
  "hfss"
  "rsus"

julia> var = get(data_loader, "rsus");

In this example, we retrieve an OutputVar with the short name rsus, which represents the mean surface upward shortwave radiation flux.

Other data loaders

If you want a data loader for other data sources, then please open an issue for it!

I want to add a new variable to an existing data loader

To add a new variable, you must

  1. Define a mapping between the ERA5 name and CliMA name,
  2. Define a preprocess function for this variable.

To determine which variables are available in the data source, refer to the artifact's documentation. For example, with ERA5DataLoader we can load the mean evaporation rate, which has the ERA5 name mer, and give it the CliMA name er. For step 1, we add "mer" => "er" to the mappings that the data loader recognizes.

import ClimaCoupler: CalibrationTools
data_loader = CalibrationTools.ERA5DataLoader()
# ERA5_TO_CLIMA_NAMES defines the existing pairings for the data loader
era5_to_clima_names = [CalibrationTools.ERA5_TO_CLIMA_NAMES..., "mer" => "er"]
data_loader = CalibrationTools.ERA5DataLoader(; era5_to_clima_names)
# Displays: ERA5DataLoader: er, hfls, hfss, rlus, rsus

For the second step, we define a preprocessing function specific to the variable.

Preprocessing functions

See ClimaAnalysis documentation for available transformations on OutputVars.

In our example, no preprocessing is applied.

CalibrationTools.preprocess(::CalibrationTools.ERA5DataLoader, var, ::Val{:er}) = var
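The Val{:er} argument in the method above suggests that the variable-specific preprocessing is selected by dispatching on the short name. The following is a minimal, self-contained sketch of that pattern. MockLoader, mock_preprocess, and mock_get are hypothetical stand-ins that only illustrate the dispatch mechanism; they are not the CalibrationTools implementation, and the sign flip for hfls is invented for illustration.

```julia
# Hypothetical stand-ins; these are NOT the ClimaCoupler types or functions.
struct MockLoader end

# Fallback: no preprocessing rule has been defined for this short name.
mock_preprocess(::MockLoader, var, ::Val{name}) where {name} =
    error("no preprocess method defined for $name")

# Variable-specific methods, selected by the short name wrapped in Val.
mock_preprocess(::MockLoader, var, ::Val{:er}) = var          # identity, as in the text
mock_preprocess(::MockLoader, var, ::Val{:hfls}) = -1 .* var  # hypothetical sign flip

# Turn the short name string into a Val so that dispatch picks the method.
mock_get(loader::MockLoader, raw, short_name::AbstractString) =
    mock_preprocess(loader, raw, Val(Symbol(short_name)))

raw = [1.0, -2.0, 3.0]  # plain array standing in for an OutputVar
er = mock_get(MockLoader(), raw, "er")
hfls = mock_get(MockLoader(), raw, "hfls")
```

With this pattern, supporting a new variable only requires adding one method; the loader itself does not change.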

Now you can use get to retrieve the OutputVar with the short name "er".

data_loader = CalibrationTools.ERA5DataLoader(; era5_to_clima_names)
get(data_loader, "er")

CalibrationTools API

ClimaCoupler.CalibrationTools.CalibrateConfig (Type)
struct CalibrateConfig{SPINUP <: Dates.Period, EXTEND <: Dates.Period}

A configuration struct for keeping track of multiple fields that are of interest to a user running calibration, or that are needed in multiple places (e.g., for ensemble members and generating observations).

ClimaCoupler.CalibrationTools.CalibrateConfig (Method)
CalibrateConfig(;
    config_file,
    short_names::Vector{String},
    minibatch_size::Integer,
    n_iterations::Integer,
    sample_date_ranges,
    extend::Dates.Period,
    spinup::Dates.Period,
    output_dir,
    rng_seed = 42,
)

Initializes a CalibrateConfig which contains values needed in multiple places during calibration.

Keyword arguments

  • config_file: Configuration file to use for ClimaCoupler simulation.

  • short_names: Short names of the observations.

  • minibatch_size: The size of the minibatch for each iteration.

  • n_iterations: The number of iterations to run the calibration for.

  • sample_date_ranges: The date ranges for each sample. The dates should match those found in the time series data of the observations.

  • extend: The amount of time to run the simulation after the end date determined by sample_date_ranges. For seasonal averages, extend should be Dates.Month(3) and for monthly averages, extend should be Dates.Month(1).

  • spinup: The amount of time to run the simulation before the start date determined by sample_date_ranges.

  • output_dir: The directory in which to save the calibration output.

  • rng_seed: An integer seed for the random number generator, so that calibration runs with the same settings are reproducible.
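
As a worked example of how spinup and extend widen the simulated period, the sketch below computes the simulation start and stop dates for one sample. The dates and the (start, stop) tuple layout of the sample are assumptions made for illustration; only the Dates standard library is used, and this is not the code that CalibrateConfig runs internally.

```julia
using Dates

# Hypothetical sample: a single monthly average dated September 2018.
# The (start, stop) tuple layout is an assumption made for this sketch.
sample_date_range = (DateTime(2018, 9, 1), DateTime(2018, 9, 1))

spinup = Month(3)  # time simulated before the first sample date
extend = Month(1)  # one month, as recommended above for monthly averages

# The simulation starts `spinup` before the first date and ends `extend`
# after the last date of the sample.
sim_start = first(sample_date_range) - spinup
sim_stop = last(sample_date_range) + extend

println("simulate from ", sim_start, " to ", sim_stop)
```

Here the simulation would run from 1 June 2018 to 1 October 2018, so that the whole of September is covered after three months of spinup.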

ClimaCoupler.CalibrationTools.ERA5DataLoader (Method)
ERA5DataLoader(; era5_to_clima_names = ERA5_TO_CLIMA_NAMES)

Construct a data loader that loads preprocessed ERA5 monthly time-averaged data as OutputVars, where

  • the short name, sign of the data, and units match CliMA conventions,
  • the longitudes are shifted to be -180 to 180 degrees,
  • the times are at the start of the time period (e.g. the time average of January is on the first of January instead of January 15th),
  • units match the variables in the output of the CliMA diagnostics.
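
To make the coordinate conventions above concrete, here is a small sketch of a shift to the -180 to 180 degree range (which applies to the longitude dimension) and of moving a timestamp to the start of its month. The helpers wrap_longitude and start_of_month are hypothetical and written only for illustration, and mapping 180 to -180 is an assumption of this sketch; the data loader performs its own preprocessing, which may differ in detail.

```julia
using Dates

# Wrap longitudes from a 0 to 360 degree convention into [-180, 180).
wrap_longitude(lon) = mod(lon + 180, 360) - 180

# Move a timestamp to the first instant of its month.
start_of_month(t::DateTime) = DateTime(year(t), month(t), 1)

lons = [0.0, 90.0, 180.0, 270.0, 359.0]
wrapped = wrap_longitude.(lons)

# A January average stamped mid-month is moved to the first of January.
t = start_of_month(DateTime(2010, 1, 15))
```

After wrapping, the longitudes are no longer monotonically increasing, so in practice the data would also need to be reordered along that dimension.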

The ERA5 data comes from the era5_monthly_averages_surface_single_level_1979_2024 artifact. See ClimaArtifacts for more information about this artifact.

The keyword argument era5_to_clima_names is a vector of pairs, each mapping an ERA5 name to its CliMA name.

Base.get (Method)
get(loader::ERA5DataLoader, short_name)

Get the preprocessed OutputVar with the name short_name from the ERA5 dataset.
