CalibrationTools

This module contains utilities to help with setting up calibration experiments of the coupled model. These utilities are meant to be broadly useful and are not specific to one particular calibration experiment.

Other helpful resources

For calibration, other resources that you may find helpful are the documentation for EnsembleKalmanProcesses, ClimaCalibrate, ClimaAnalysis, and the ClimaCoupler calibration experiments.

Data loading

OutputVar

This section assumes you are familiar with ClimaAnalysis.OutputVar.

CalibrationTools provides data loaders for several calibration datasets stored as artifacts in ClimaArtifacts. For example, ERA5DataLoader loads preprocessed ERA5 data and automatically applies the preprocessing needed to make the data convenient for calibration. See the documentation for ERA5DataLoader for details on the preprocessing steps applied.

You can retrieve a variable with get and obtain the set of all available preprocessed variables with available_vars.

julia> import ClimaCoupler: CalibrationTools
julia> data_loader = CalibrationTools.ERA5DataLoader()
ERA5DataLoader: hfls, hfss, rlus, rsus
julia> CalibrationTools.available_vars(data_loader)
Set{String} with 4 elements:
  "hfls"
  "rlus"
  "hfss"
  "rsus"
julia> var = get(data_loader, "rsus");

In this example, we retrieve an OutputVar with the short name rsus, which represents the mean surface upward shortwave radiation flux.

Other data loaders

If you want a data loader for other data sources, then please open an issue for it!

I want to add a new variable to an existing data loader

To add a new variable, you must

  1. Define a mapping between the ERA5 name and CliMA name,
  2. Define a preprocess function for this variable.

To determine which variables are available in the data source, refer to the artifact's documentation. For example, the ERA5 data source contains the mean evaporation rate under the name mer, and we want to expose it under the CliMA name er. For step 1, we add "mer" => "er" to the mappings that the data loader recognizes.

import ClimaCoupler: CalibrationTools
data_loader = CalibrationTools.ERA5DataLoader()
# ERA5_TO_CLIMA_NAMES defines the existing pairings for the data loader
era5_to_clima_names = [CalibrationTools.ERA5_TO_CLIMA_NAMES..., "mer" => "er"]
data_loader = CalibrationTools.ERA5DataLoader(; era5_to_clima_names)
# Displays: ERA5DataLoader: er, hfls, hfss, rlus, rsus

For the second step, we define a preprocessing function specific to the variable.

Preprocessing functions

See ClimaAnalysis documentation for available transformations on OutputVars.

In our example, no preprocessing is applied.

CalibrationTools.preprocess(::CalibrationTools.ERA5DataLoader, var, ::Val{:er}) = var

Now, you can use get to retrieve the OutputVar with the short name "er".

data_loader = CalibrationTools.ERA5DataLoader(; era5_to_clima_names)
get(data_loader, "er")
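
The preprocess method above is selected by dispatching on a Val type built from the short name. The sketch below illustrates that pattern with toy stand-ins only: ToyDispatchLoader, toy_preprocess, and toy_get are hypothetical names, and the way the real data loader calls preprocess internally is an assumption.

```julia
# Toy stand-ins; none of these names come from ClimaCoupler.
struct ToyDispatchLoader end

# Fallback: variables without a specialised method are returned unchanged.
toy_preprocess(::ToyDispatchLoader, var, ::Val) = var

# Specialised method for the short name "er": flip the sign of the data.
toy_preprocess(::ToyDispatchLoader, var, ::Val{:er}) = -var

# Turn the runtime string into a Val so that dispatch picks the right method.
toy_get(loader::ToyDispatchLoader, short_name::String, raw) =
    toy_preprocess(loader, raw, Val(Symbol(short_name)))

flipped = toy_get(ToyDispatchLoader(), "er", [1.0, 2.0])      # uses the Val{:er} method
untouched = toy_get(ToyDispatchLoader(), "hfls", [1.0, 2.0])  # uses the fallback
```

Adding support for another variable then only requires one more method, without modifying the code that retrieves the data.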

CalibrationTools API

ClimaCoupler.CalibrationTools.CalibrateConfig - Type
struct CalibrateConfig{SPINUP <: Dates.Period, EXTEND <: Dates.Period}

A configuration struct for keeping track of multiple fields that are of interest to a user running calibration, or that are needed in multiple places (e.g., for ensemble members and generating observations).

ClimaCoupler.CalibrationTools.CalibrateConfig - Method
CalibrateConfig(;
    config_file,
    short_names::Vector{String},
    minibatch_size::Integer,
    n_iterations::Integer,
    sample_date_ranges,
    extend::Dates.Period,
    spinup::Dates.Period,
    output_dir,
    rng_seed = 42,
)

Initializes a CalibrateConfig which contains values needed in multiple places during calibration.

Keyword arguments

  • config_file: Configuration file to use for ClimaCoupler simulation.

  • short_names: Short names of the observations.

  • minibatch_size: The size of the minibatch for each iteration.

  • n_iterations: The number of iterations to run the calibration for.

  • sample_date_ranges: The date ranges for each sample. The dates should be the same as found in the time series data of the observations.

  • extend: The amount of time to run the simulation after the end date determined by sample_date_ranges. For seasonal averages, extend should be Dates.Month(3) and for monthly averages, extend should be Dates.Month(1).

  • spinup: The amount of time to run the simulation before the start date determined by sample_date_ranges.

  • output_dir: The directory in which to save the calibration output.

  • rng_seed: An integer seed for the random number generator, so that calibration runs with the same settings produce the same results.
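
As a rough illustration of how spinup and extend widen the simulated period around a sample, consider the sketch below. It is not the package's implementation: simulation_window is a hypothetical helper, and representing a sample date range as a pair of DateTimes is an assumption.

```julia
import Dates

# Hypothetical helper: the simulation starts `spinup` before the first sample
# date and ends `extend` after the last sample date.
simulation_window(first_date, last_date; spinup, extend) =
    (first_date - spinup, last_date + extend)

# One seasonal sample whose time stamp marks the start of the season (December).
sim_start, sim_end = simulation_window(
    Dates.DateTime(2010, 12, 1),
    Dates.DateTime(2010, 12, 1);
    spinup = Dates.Month(3),
    extend = Dates.Month(3),
)
println("simulate from ", sim_start, " to ", sim_end)
```

With time stamps placed at the start of each averaging period, extending by the length of the period is what lets the simulation cover the final season or month in full.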

ClimaCoupler.CalibrationTools.ERA5DataLoader - Method
ERA5DataLoader(; era5_to_clima_names = ERA5_TO_CLIMA_NAMES)

Construct a data loader which you can use to load preprocessed ERA5 monthly time-averaged data as an OutputVar, where

  • the short name, sign of the data, and units match CliMA conventions,
  • the longitudes are shifted to be -180 to 180 degrees,
  • the times are at the start of the time period (e.g. the time average of January is on the first of January instead of January 15th),
  • units match the variables in the output of the CliMA diagnostics.
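
The coordinate and time adjustments listed above can be illustrated with plain Julia. This is a minimal sketch rather than the loader's implementation: it assumes the raw coordinate runs from 0 to 360 degrees and that raw time stamps sit mid-month, and shift_longitude and start_of_month are hypothetical helpers.

```julia
import Dates

# Map a coordinate in degrees onto the interval [-180, 180).
shift_longitude(lon) = mod(lon + 180, 360) - 180

# Move a time stamp to midnight on the first day of its month.
start_of_month(t::Dates.DateTime) = Dates.firstdayofmonth(t)

shifted = shift_longitude.([0.0, 90.0, 190.0, 359.0])
january = start_of_month(Dates.DateTime(2010, 1, 15))
```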

The ERA5 data comes from the era5_monthly_averages_surface_single_level_1979_2024 artifact. See ClimaArtifacts for more information about this artifact.

The keyword argument era5_to_clima_names is a vector of pairs mapping ERA5 name to CliMA name.

ClimaCoupler.CalibrationTools.CERESDataLoader - Method
CERESDataLoader(; ceres_to_clima_names = CERES_TO_CLIMA_NAMES)

Construct a data loader which you can use to load preprocessed CERES monthly time-averaged TOA radiation data as an OutputVar, where

  • the short name and units match CliMA conventions,
  • the longitudes are shifted to be -180 to 180 degrees,
  • the times are at the start of the time period (e.g. the time average of January is on the first of January instead of January 15th).

In addition to the variables in CERES_TO_CLIMA_NAMES, the following derived cloud radiative effect (CRE) variables are available:

  • swcre = rsutcs - rsut (shortwave cloud radiative effect)
  • lwcre = rlutcs - rlut (longwave cloud radiative effect)
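
The two definitions above are element-wise differences between clear-sky and all-sky fluxes. The sketch below applies them to made-up numbers; the arrays are illustrative values, not CERES data, and the real loader operates on OutputVars rather than plain vectors.

```julia
# Toy top-of-atmosphere fluxes (made-up values for illustration).
rsut   = [100.0, 95.0]    # all-sky upward shortwave
rsutcs = [55.0, 60.0]     # clear-sky upward shortwave
rlut   = [230.0, 240.0]   # all-sky upward longwave
rlutcs = [260.0, 262.0]   # clear-sky upward longwave

swcre = rsutcs .- rsut    # shortwave cloud radiative effect
lwcre = rlutcs .- rlut    # longwave cloud radiative effect
```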

The CERES data comes from the radiation_obs artifact. See ClimaArtifacts for more information about this artifact.

The keyword argument ceres_to_clima_names is a vector of pairs mapping CERES name to CliMA name.

ClimaCoupler.CalibrationTools.GPCPDataLoader - Method
GPCPDataLoader(; gpcp_to_clima_names = GPCP_TO_CLIMA_NAMES)

Construct a data loader which you can use to load preprocessed GPCP monthly time-averaged precipitation data as an OutputVar, where

  • the short name and sign of the data match CliMA conventions,
  • the longitudes are shifted to be -180 to 180 degrees,
  • the times are at the start of the time period (e.g. the time average of January is on the first of January instead of January 15th).

The GPCP data comes from the precipitation_obs artifact. See ClimaArtifacts for more information about this artifact.

Note that precipitation (pr) in this dataset has units of mm/day, whereas in CliMA the units of pr are kg m^-2 s^-1.
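
If you need to compare the two, the magnitudes can be converted by hand. The sketch below assumes liquid water with a density of 1000 kg m^-3, so that 1 mm of water over 1 m^2 has a mass of 1 kg; mm_per_day_to_kg_m2_s is a hypothetical helper, not part of CalibrationTools, and it does not change the sign of the data.

```julia
const SECONDS_PER_DAY = 86_400

# 1 mm/day of liquid water equals 1 kg m^-2 per day, so divide by the number
# of seconds in a day to obtain kg m^-2 s^-1.
mm_per_day_to_kg_m2_s(pr) = pr / SECONDS_PER_DAY

converted = mm_per_day_to_kg_m2_s(86.4)   # approximately 1e-3 kg m^-2 s^-1
```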

The keyword argument gpcp_to_clima_names is a vector of pairs mapping GPCP name to CliMA name.

ClimaCoupler.CalibrationTools.ERA5PressureLevelDataLoader - Method
ERA5PressureLevelDataLoader(;
    era5_pressure_level_to_clima_names = ERA5_PRESSURE_LEVEL_TO_CLIMA_NAMES)

Construct a data loader which you can use to load preprocessed monthly time-averaged ERA5 data on pressure levels as an OutputVar, where

  • the short name and sign of the data match CliMA conventions,
  • the longitudes are shifted to be -180 to 180 degrees,
  • the times are at the start of the time period (e.g. the time average of January is on the first of January instead of January 15th).

Note that the units of unitless OutputVars are "unitless" rather than the empty string, because ClimaAnalysis treats the empty string as missing units.

The ERA5 data comes from the era5_monthly_averages_pressure_levels_1979_2024 artifact. See ClimaArtifacts for more information about this artifact.

The keyword argument era5_pressure_level_to_clima_names is a vector of pairs mapping ERA5 name to CliMA name.

ClimaCoupler.CalibrationTools.ModisDataLoader - Method
ModisDataLoader(;
    modis_to_clima_names = MODIS_TO_CLIMA_NAMES,
)

Construct a data loader which you can use to load preprocessed monthly time-averaged MODIS data on single levels as an OutputVar, where

  • the short name and sign of the data match CliMA conventions,
  • the longitudes are shifted to be -180 to 180 degrees,
  • the times are at the start of the time period (e.g. the time average of January is on the first of January instead of January 15th).

For lwp and clivi, there are NaNs in the data. These NaNs were imputed with the mean of the non-NaN data in the dataset.
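
Mean imputation of this kind can be sketched in a few lines. The example below uses a made-up vector and a hypothetical helper, impute_with_mean; how the loader groups the data before taking the mean is not specified here, so treat this only as an illustration of the idea.

```julia
# Replace every NaN with the mean of the non-NaN entries.
function impute_with_mean(data)
    valid = filter(!isnan, data)
    fill_value = sum(valid) / length(valid)
    return map(x -> isnan(x) ? fill_value : x, data)
end

imputed = impute_with_mean([1.0, NaN, 3.0, NaN, 5.0])
```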

The MODIS data comes from the modis_lwp_iwp artifact. See ClimaArtifacts for more information about this artifact.

The keyword argument modis_to_clima_names is a vector of pairs mapping MODIS name to CliMA name.

ClimaCoupler.CalibrationTools.CompositeDataLoader - Method
CompositeDataLoader(loaders::AbstractDataLoader...; varname_to_loader = Dict())

Construct a CompositeDataLoader from multiple data loaders.

The keyword argument varname_to_loader is a dictionary mapping variable names to AbstractDataLoaders. When multiple data loaders provide the same variable, use varname_to_loader to specify which loader to use for each variable. If a variable name is not provided in varname_to_loader and multiple loaders provide the same variable, an error is thrown.

See the example below for how to use CompositeDataLoader.

composite_data_loader = CompositeDataLoader(ERA5DataLoader(), CERESDataLoader())

# If pr is added to ERA5DataLoader, both loaders provide pr, so specify that
# pr should be loaded from GPCPDataLoader
era5_data_loader = ERA5DataLoader()
gpcp_data_loader = GPCPDataLoader()
composite_data_loader = CompositeDataLoader(
    era5_data_loader,
    gpcp_data_loader;
    varname_to_loader = Dict("pr" => gpcp_data_loader)
)
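
The conflict rule described above can be sketched with toy types. ToySource and resolve_sources are hypothetical stand-ins, not the CompositeDataLoader internals; the sketch only mirrors the documented behaviour that an explicit entry in varname_to_loader wins and that an unresolved conflict raises an error.

```julia
# Toy stand-in for a data loader: a label plus the variable names it provides.
struct ToySource
    name::String
    vars::Set{String}
end

# Decide which source serves each variable name.
function resolve_sources(sources::ToySource...; varname_to_loader = Dict{String, ToySource}())
    all_vars = reduce(union, (s.vars for s in sources); init = Set{String}())
    resolved = Dict{String, ToySource}()
    for varname in all_vars
        providers = [s for s in sources if varname in s.vars]
        if haskey(varname_to_loader, varname)
            resolved[varname] = varname_to_loader[varname]   # explicit choice wins
        elseif length(providers) == 1
            resolved[varname] = first(providers)             # no conflict
        else
            error("Multiple sources provide \"$varname\"; choose one with varname_to_loader")
        end
    end
    return resolved
end

era5 = ToySource("era5", Set(["hfls", "pr"]))
gpcp = ToySource("gpcp", Set(["pr"]))
resolved = resolve_sources(era5, gpcp; varname_to_loader = Dict("pr" => gpcp))
```

Calling resolve_sources(era5, gpcp) without the keyword argument raises an error, because both toy sources provide pr.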

Base.get - Function
get(loader::CompositeDataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name.

get(loader::ERA5DataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name from the ERA5 dataset.

get(loader::CERESDataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name from the CERES dataset.

get(loader::GPCPDataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name from the GPCP dataset.

get(loader::ERA5PressureLevelDataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name from the ERA5 pressure levels dataset.

get(loader::ModisDataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name from the MODIS dataset.

get(loader::CalipsoDataLoader, short_name::String)

Get the preprocessed OutputVar with the name short_name from the CALIPSO/CloudSat dataset.

ClimaCoupler.CalibrationTools.update_timespan! - Function
update_timespan!(
    config_dict,
    start_date::Dates.DateTime,
    end_date::Dates.DateTime
)

Update "startdate" and "tend" in config_dict to match start_date and end_date.

The start_date and end_date are converted to strings and the keys "startdate" and "tend" in config_dict are updated accordingly. Note that any precision beyond days (e.g. hours, seconds, etc.) is not used for setting the start date.
