Single Column Models

ClimaAtmos.jl supports several canonical test cases that are run in a single-column model, designed to verify how the prognostic EDMF scheme performs in established benchmark setups. These cases include variants of BOMEX, DYCOMS, RICO, SOARES, and TRMM, and the corresponding configuration files can be found in the config/model_configs directory. To run, for example, the BOMEX test case, execute the following:

julia --project=examples examples/hybrid/driver.jl --config_file config/model_configs/prognostic_edmfx_bomex_column.yml --job_id bomex
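
The other cases are run the same way; only the config file and job id change. To see which single-column configurations are available, you can list them from a Julia REPL started in the ClimaAtmos.jl repository root (the filename filter below is illustrative, assuming the column configs follow the naming pattern of the example above):

# List the single-column configuration files shipped with the repository.
filter(f -> endswith(f, "_column.yml"), readdir("config/model_configs"))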

It may also be helpful to run in interactive mode to examine the simulation object, debug, and develop the code further. To do so, start the REPL with julia --project=examples and then run:

using Revise
import ClimaAtmos as CA

# construct the simulation from the configuration file
simulation = CA.AtmosSimulation("config/model_configs/prognostic_edmfx_bomex_column.yml")
sol_res = CA.solve_atmos!(simulation) # run the simulation
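
Once the simulation has been constructed, and again after solve_atmos! returns, the resulting objects can be inspected interactively. A minimal sketch using only Base Julia, with no assumptions about specific field names:

propertynames(simulation)  # list the top-level fields of the simulation object
propertynames(sol_res)     # inspect what the solve returned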

Externally-Driven Single Column Models

Currently, two versions of the externally driven single column model are supported in ClimaAtmos.jl: GCM-driven and ReanalysisTimeVarying. They have been developed specifically for realistic simulation and model calibration. Externally driven means that the model is initialized and forced with data coming from a different simulation. This differs from setups such as BOMEX or SOARES, which have steady or functional forcing, respectively.

GCM-Driven Case

For the GCM-driven case, we can run the experiment using the config file config/model_configs/prognostic_edmfx_gcmdriven_column.yml by running:

julia --project=examples examples/hybrid/driver.jl --config_file config/model_configs/prognostic_edmfx_gcmdriven_column.yml --job_id gcm_driven_scm

In the config the following settings are particularly important:

initial_condition: "GCM"
external_forcing: "GCM"
external_forcing_file: artifact"cfsite_gcm_forcing"/HadGEM2-A_amip.2004-2008.07.nc
cfsite_number: "site23"
surface_setup: "GCM"

Here we must set initial_condition, external_forcing, and surface_setup all to "GCM", since each component requires information from the external file. Together, external_forcing_file and cfsite_number determine the temperature, specific humidity, and wind profiles, as well as the horizontal and vertical advection profiles that drive the simulation; external_forcing_file can also be set to a local file path instead of using the artifact. Radiation and surface temperature are also specified. The forcing file, an example of which is stored in the artifact, contains a group for each cfsite that can drive the simulation. See Shen et al. (2022) for more information.
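
Since the forcing file is organized into one group per cfsite, it can be useful to check which sites a given file provides before pointing the config at it. A minimal sketch using NCDatasets.jl, assuming a local copy of the forcing file is available at the path shown:

using NCDatasets

# Open a local copy of the forcing file and list the cfsite groups it contains.
NCDataset("HadGEM2-A_amip.2004-2008.07.nc") do ds
    @show keys(ds.group)            # available cfsites, e.g. "site23"
    @show keys(ds.group["site23"])  # variables provided for that site
end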

Reanalysis-Driven Case

The ReanalysisTimeVarying case extends the GCM-driven case by supporting single-column simulations that resolve the diurnal cycle, can be run at any site globally, and are driven by reanalysis data, allowing EDMF to be calibrated against Earth-system observations in the single-column setting. This capability was needed to address calibration biases arising from the correlation between time of day and cloud liquid water path over the tropical Pacific. For this simulation, we again highlight the relevant arguments in the config file:

initial_condition: "ReanalysisTimeVarying"
external_forcing: "ReanalysisTimeVarying"
surface_setup: "ReanalysisTimeVarying"
surface_temperature: "ReanalysisTimeVarying"
start_date: "20070701"
site_latitude: 17.0
site_longitude: -149.0

The first four entries are analogous to the GCM-driven case: the model dispatches on each of them to set up the forcing for the corresponding component. Note, however, that instead of specifying a forcing file directly, we now specify a start_date, site_latitude, and site_longitude. This is because the forcing data is stored with ClimaArtifacts.jl to ensure reproducibility of the simulation and results. The data is generated by downloading from ECMWF; further documentation on ERA5 data download can be found either directly on the ECMWF page or in ClimaArtifacts.jl. Note that the profiles, surface temperature, and surface fluxes cannot be obtained from a single request, so three files are needed in total.

We include a script at src/utils/era5_observations_to_forcing_file.jl which extracts the profiles and computes the tendencies needed for the simulation from the raw ERA5 reanalysis files. The processed observations are stored directly in the era5_hourly_atmos_processed artifact to eliminate the need to reprocess specific sites and times. With this setup, users are free to choose any site globally at any time for which ERA5 data is available. Unfortunately, global hourly reanalysis is too large to store in an artifact, so we currently only provide support for the first 5 days of July 2007 in the tropical Pacific, stored in era5_hourly_atmos_raw, which is only available on the clima and Caltech HPC servers.

Running the Reanalysis-driven case at different times and locations

The data processing script requires 3 separate files with specific variables and naming conventions (a sketch of the expected layout follows the list):

  1. Hourly profiles with the variables t, q, u, v, w, z, clwc, and ciwc, following ERA5 naming conventions. This file should be stored in the appropriate artifacts directory with the naming scheme "forcing_and_cloud_hourly_profiles_$(start_date).nc", where start_date is the date the data starts on, formatted as YYYYMMDD. The clwc and ciwc profiles are required because they are typical calibration targets, but they are not needed to run the simulation itself.
  2. Instantaneous variables, including the surface temperature ts, which should be stored in "hourly_inst_$(start_date).nc".
  3. Accumulated variables, including the surface latent and sensible heat fluxes, hfls and hfss, which should be stored in "hourly_accum_$(start_date).nc". These need to be divided by the appropriate accumulation period, which for hourly data is 3600 s and for daily and monthly data is 86400 s (this is not a typo; see the ERA5 documentation on accumulated variables).
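
A minimal sketch of the file names described above and of the flux conversion in item 3, assuming the raw files live in the directory that the era5_hourly_atmos_raw artifact points to (the path and the mean_flux helper are illustrative only):

using Dates

start_date = Dates.format(Date(2007, 7, 1), "yyyymmdd")   # "20070701"
raw_dir = "/some/random/path/raw_data_dir"                # era5_hourly_atmos_raw override

profile_file = joinpath(raw_dir, "forcing_and_cloud_hourly_profiles_$(start_date).nc")  # item 1
inst_file    = joinpath(raw_dir, "hourly_inst_$(start_date).nc")                        # item 2
accum_file   = joinpath(raw_dir, "hourly_accum_$(start_date).nc")                       # item 3

# Accumulated fluxes are sums over the accumulation window, so mean fluxes are
# recovered by dividing by the window length in seconds (3600 for hourly data).
mean_flux(accumulated, window_seconds = 3600) = accumulated / window_seconds
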
On HPC/Clima

To run locations already in the artifact, e.g., sites in the tropical Pacific during the first 5 days of July 2007, the config file will work out of the box. To run other locations or times, please follow the steps for running locally below.

On a local machine

To run the simulation on a local machine, you will first need to download the reanalysis data from ECMWF, ensuring that you have all the required variables. The data will be stored in 3 separate files, which should all be placed in the same directory. You must then add entries to the Overrides.toml file in your .julia/artifacts path so that the era5_hourly_atmos_raw artifact points to the folder where the raw data is stored, and so that the processed forcing files are written to a location of your choosing, as follows:

8234def2ead82e385a330a48ed2f0c030e434065 = "/some/random/path/raw_data_dir" # for raw data
a1a465e8d237d78bef1e6d346054da395787a9f9 = "/some/random/path/processed_files" # for storing processed files
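
If you are unsure where Overrides.toml lives, the following snippet prints its expected location in the active Julia depot (typically ~/.julia):

# Print the expected location of the artifact overrides file.
println(joinpath(first(Base.DEPOT_PATH), "artifacts", "Overrides.toml"))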

Good luck! :wink: