Using ClimaLand Diagnostics to save simulation output
When running a ClimaLand simulations, you have multiple options on how to write the outputs of that simulation. You may want all variables, or just a selected few. You may want instantaneous values, at the highest temporal and spatial resolution, or you may want to get averages at hourly or monthly time scale, and integrate in space (for example soil moisture from 0 to 1 meter depth). You may want to get more specific reductions, such as 10 days maximums, or compute a new variables as a function of others. You may want to get your outputs in memory in a Julia dictionary, or write them in a NetCDF file.
This is where ClimaLand Diagnostics comes in for users.
In this documentation page, we first explain how to use default diagnostics and what are the defaults, and then explain how to define your own diagnostics for more advanced users.
Default Diagnostics
Diagnostics refer to output saved during the simulation, which may be prognostic or diagnostic variables. Note that this is different from checkpointing the simulation, which has specific requirements. For information about checkpointing and restarting simulations, please see the page titled Restarting Simulations.
The main user-facing function in the ClimaLand.Diagnostics module is default_diagnostics
. This function defines what diagnostic variables to compute by default for a specific model, and on what schedule (for example, hourly average).
default_diagnostics
takes in the following arguments:
model
: The ClimaLand model to generate diagnostics for. Currently the following models support diagnostics:CanopyModel
,EnergyHydrologyModel
,SoilCanopyModel
,LandModel
,BucketModel
start_date
: The start date of the simulation.output_writer
: An object of typeClimaDiagnostics.AbstractWriter
. Specifically this may be aNetCDFWriter
,HDF5Writer
, orDictWriter
, which save output to a NetCDF file, HDF5 file, or dictionary in Julia memory, respectively. For more details about the diagnostics writers, please see the ClimaDiagnostics.jl documentation.output_vars
: This argument may be:short
to output a default list of variables defined for each model,:long
to output all
available variables for the model, or a user-defined list of variable "short names".
reduction_period
: How often to compute and save the provided temporal reduction of the diagnostics.reduction_type
: A temporal reduction to apply to the diagnostics over each provided time period, e.g.:average
,:max
,:min
,:instantaneous
(no reduction)dt
: Simulation timestep; only required for instantaneous diagnostics.
Example: diagnostics for a CanopyModel
The following code sets up default short diagnostics to be averaged hourly and written in memory as a Julia dictionary:
diag_writer = ClimaDiagnostics.Writers.DictWriter();
diagnostics = ClimaLand.Diagnostics.default_diagnostics(
canopy,
start_date;
output_vars = :short,
output_writer = diag_writer,
reduction_period = :hourly,
reduction_type = :average,
);
To instead output a list of specific diagnostics, you can change the value of output_vars
. For example, to output gross primary productivity (GPP) and transpiration you would do the following:
diag_writer = ClimaDiagnostics.Writers.DictWriter();
diagnostics = ClimaLand.Diagnostics.default_diagnostics(
canopy,
start_date;
output_vars = ["gpp", "trans"],
output_writer = diag_writer,
reduction_period = :hourly,
reduction_type = :average,
);
A description of available diagnostics and their short names can be found on the Available diagnostic variables documentation page.
To write the diagnostics to a NetCDF file instead of saving it in memory, the diag_writer
should be constructed as a NetCDFWriter
and then passed to default_diagnostics
as before:
outdir = "my_output_dir"
nc_writer =
ClimaDiagnostics.Writers.NetCDFWriter(space, outdir; start_date)
The NetCDFWriter
currently writes to file for each diagnostic output, which can be quite slow when saving variables at every step. On the other hand, the DictWriter
saves output to memory which may be too large in global runs, so DictWriter
usually should not be used for global runs. In general, we recommend using DictWriter
for column simulations, and NetCDFWriter
for global simulations.
Diagnostics output naming and format
Diagnostics are typically named using the format $(short_name)_$(period)_$(reduction)
. For example, with the NetCDFWriter, hourly-averaged GPP would be saved in an output file titled gpp_1h_average.nc
.
The specific output format depends on which output writer is being used; for more details, please see the ClimaDiagnostics documentation.
Adding new diagnostics
To define a new, custom diagnostic, you must follow these steps:
- specify how to compute the diagnostic
- manually define the diagnostic via
add_diagnostic_variable!
- add the diagnostic to the list of possible diagnostics for the relevant model(s)
Define how to compute your diagnostic variable from your model state and cache.
These functions are defined in src/diagnostics/land_compute_methods.jl
and must be named in the format compute_[standard_name]!
. Be sure to write method(s) for each model you want this diagnostic to be available for. For example, let's say you want the bowen ratio (ratio between sensible heat and latent heat) in the Bucket model.
function compute_bowen_ratio!(out, Y, p, t, land_model::BucketModel)
if isnothing(out)
return copy(p.bucket.turbulent_fluxes.shf / p.bucket.turbulent_fluxes.lhf)
else
out .= p.bucket.turbulent_fluxes.shf / p.bucket.turbulent_fluxes.lhf
end
end
Or, for convenience, you can use the @diagnostic_compute
macro which generates the same function. However, it is better to use that macro only if you are retrieving an existing variable, such as latent heat flux, rather than computing one, like the Bowen ratio above. For example,
@diagnostic_compute "latent_heat_flux" BucketModel p.bucket.turbulent_fluxes.lhf
Add that diagnostic(s) variable to your list of variables
These functions are defined in src/diagnostics/define_diagnostics.jl
.
add_diagnostic_variable!(
short_name = "bor",
long_name = "Bowen ratio",
standard_name = "bowen_ratio",
units = "",
comments = "Ratio of sensible to latent heat flux.",
compute! = (out, Y, p, t) -> compute_bowen_ratio!(out, Y, p, t, land_model),
)
add_diagnostic_variable!(
short_name = "lhf",
long_name = "Latent Heat Flux",
standard_name = "latent_heat_flux",
units = "W m^-2",
comments = "Exchange of energy at the land-atmosphere interface due to water evaporation or sublimation.",
compute! = (out, Y, p, t) ->
compute_latent_heat_flux!(out, Y, p, t, land_model),
)
Define how often to output your variables, and how to combine them.
We support a range of reduction periods and reduction types, defined in src/diagnostics/default_diagnostics.jl
, which should be passed to default_diagnostics
as Symbols. Currently, the following periods are supported:
- :every_dt (requires
dt
to be provided separately) - :halfhourly
- :hourly
- :daily
- :tendaily
- :monthly
And the following reduction types are supported:
- :instantaneous
- :average (requires
pre_output_hook! = average_pre_output_hook!
when creating theScheduledDiagnostic
) - :max
- :min
You can also define your own output frequencies and reductions. For example, if you want the seasonal maximum of your variables, where season is defined as 90 days, you could add the following get_period
function, and then provide this new reduction period to default_diagnostics
.
get_period(:seasonal) = Day(90)
default_diagnostics(model, start_date, output_writer; output_vars, reduction_period = :seasonal, reduction_type = :max)
Analyze your simulation output
Once you've run your simulation and created an output folder (e.g., output_dir) with diagnostics, you can use ClimaAnalysis to access and analyze your data. For in depth documentation about ClimaAnalysis, see its documentation.
Here is an example of how to plot a variable:
import ClimaAnalysis
import ClimaAnalysis.Visualize as viz
import CairoMakie # the plotting package used by ClimaAnalysis
simdir = ClimaAnalysis.SimDir(output_dir) # where output_dir is where you saved your diagnostics.
var = get(simdir; "lhf") # assuming lhf, latent_heat_flux used as an example above, is one of your diagnostics variables.
fig = CairoMakie.Figure() # creates an empty figure object
viz.plot!(fig, var) # creates an axis inside fig, and plot your var in it.
CairoMakie.save(fig) # saves the figure in current working directory
API
ClimaLand.Diagnostics.default_diagnostics
— Functiondefault_diagnostics(model::AbstractModel{FT},
start_date;
output_writer,
output_vars = :short,
reduction_period = :monthly,
reduction_type = :average,
dt = nothing)
Construct a list of ScheduledDiagnostics
that outputs the given variables at the specified average period.
The input output_vars
can have 3 values:
:long
- all diagnostics are output:short
- a short list of diagnostics is output_::Vector{String}
- a user-defined list of diagnostics is output
If a user-defined list is provided for output_vars
, it must be a vector of strings that are valid short names of diagnostics for the model. Please see the method get_possible_diagnostics
for the list of available diagnostics for each model.
reduction_period
specifies the frequency at which to average the diagnostics, and reduction_type
specifies the type of reduction to apply. Please see the docstring of get_period
for the list of available periods, and the docstring of get_reduction
for the list of available reduction types.
This method can be extended for any model that extends get_possible_diagnostics
and get_short_diagnostics
. Note that EnergyHydrology
has a specialized method that handles conservation diagnostics.
default_diagnostics(
land_model::EnergyHydrology{FT},
start_date;
output_writer,
output_vars = :short,
reduction_period = :monthly,
reduction_type = :average,
conservation = false,
conservation_period = Day(10),
dt = nothing,
) where {FT}
Define a method specific to the EnergyHydrology model so that we can handle conservation diagnostics specially.
The input output_vars
can have 3 values:
:long
- all diagnostics are output:short
- a short list of diagnostics is output_::Vector{String}
- a user-defined list of diagnostics is output
If a user-defined list is provided for output_vars
, it must be a vector of strings that are valid short names of diagnostics for the model.
reduction_period
specifies the frequency at which to average the diagnostics, and reduction_type
specifies the type of reduction to apply. Please see the docstring of get_period
for the list of available periods, and the docstring of get_reduction
for the list of available reduction types.
Conservation diagnostics should not be provided as part of the output_vars
argument, but rather included by providing conservation = true
. Please see the method get_possible_diagnostics
for the list of available diagnostics.
ClimaLand.Diagnostics.close_output_writers
— Functionclose_output_writers(diagnostics)
Close the output writers in the diagnostics
, an iterable of ClimaDiagnostics.ScheduledDiagnostic
or nothing
.
This function should be called at the end of every simulation.
ClimaLand.total_energy_per_area!
— FunctionClimaLand.total_energy_per_area!(
surface_field,
model::EnergyHydrology,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total energy per unit ground area for the EnergyHydrology
.
ClimaLand.total_energy_per_area!(
surface_field,
model::SnowModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total energy per unit ground area for the SnowModel
.
This has already accounted for the area fraction of snow in the definition of S.
ClimaLand.total_energy_per_area!(
surface_field,
model::BucketModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total energy per unit ground area for the BucketModel
.
The ground of the bucket model has temperature Y.bucket.T
, with volumetric specific heat approximated with the parameter ρc_soil
. Additional energy is present due to the latent heat of fusion of frozen water, in the form of snow. We also add in this energy (below). We do not model or account for the sensible energy of snow (ρ_snow T_snow
), as this is much smaller.
total_energy_per_area!(cache, model::AbstractModel, Y, p, t)
A function which updates cache
in place with the total energy per unit ground area for the model
, computed from Y
, p
, and t
.
total_energy_per_area!(
surface_field,
land::AbstractLandModel,
Y,
p,
t,
sfc_cache,
)
A function which computes the total energy per unit area and updates surface_field
in place, for the land model land
, by calling the same function for the component models.
The sfc_cache
field is available as scratch space.
ClimaLand.total_energy_per_area!(
surface_field,
model::BigLeafEnergyModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value of the big leaf model's energy.
Note that this assumes that there is at most a single canopy and stem component, and that the area index for them refers to the integrated area index (in height) - not the value per layer.
ClimaLand.total_energy_per_area!(
surface_field,
model::AbstractCanopyEnergyModel,
Y,
p,
t,
)
A default function which errors for generic energy models for the canopy.
Note that we only support two models for canopy energy. The BigLeafEnergyModel
has a special method for this, and the other has the temperature prescribed and does not conserve energy.
ClimaLand.total_energy_per_area!(
surface_field,
model::CanopyModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total energy per unit ground area for the CanopyModel
.
This acts by calling the method for the energy component of the canopy model.
ClimaLand.total_liq_water_vol_per_area!
— FunctionClimaLand.total_liq_water_vol_per_area!(
surface_field,
model::RichardsModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total liquid water volume per unit ground area for the RichardsModel
.
ClimaLand.total_liq_water_vol_per_area!(
surface_field,
model::EnergyHydrology,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total liquid water volume per unit ground area for the EnergyHydrology
.
The water in all phases is accounted for by converting ice volume to liquid water volume using the ratio of the density of ice to the density of water.
ClimaLand.total_liq_water_vol_per_area!(
surface_field,
model::SnowModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total liquid water volume per unit ground area for the SnowModel
.
This has already accounted for the area fraction of snow in the definition of S; it also accounts for both liquid and frozen water present in the snow, as the snow water equivalent is already the total liquid water volume present in the snow if all the snow melted, per unit ground area.
ClimaLand.total_liq_water_vol_per_area!(
surface_field,
model::BucketModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total liquid water volume per unit ground area for the BucketModel
, in [m³ m⁻²].
The total water contained in the bucket is the sum of the subsurface water storage W
, the snow water equivalent σS
and surface water content Ws
.
total_liq_water_vol_per_area!(cache, model::AbstractModel, Y, p, t)
A function which updates cache
in place with the total liquid water volume per unit ground area for the model
, computed from Y
, p
, and t
.
While water mass is the fundamentally conserved quantity, soil modelling represents water by an equivalent water volume using the density of water and ice at standard temperature and pressure. Because of that, we report here the total volume of water present in a model (per unit area) that would arise if all the water was in liquid phase. This can be converted to a mass using the density of liquid water.
This includes the water in multiple phases. For example, if ice is present, the water volume is computed using ratio of the density of ice to the density of liquid water.
total_liq_water_vol_per_area!(
surface_field,
land::AbstractLandModel,
Y,
p,
t,
sfc_cache,
)
A function which computes the total liquid water volume per unit area and updates surface_field
in place, for the land model land
, by calling the same function for the component models.
The sfc_cache
field is available as scratch space.
ClimaLand.total_liq_water_vol_per_area!(
surface_field,
model::CanopyModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value for the total liquid water volume per unit ground area for the CanopyModel
.
This acts by calling the method for the PlantHydraulics component of the canopy model.
ClimaLand.total_liq_water_vol_per_area!(
surface_field,
model::PlantHydraulicsModel,
Y,
p,
t,
)
A function which updates surface_field
in place with the value of the plant hydraulic models total water volume.
Note that this is general for any number of canopy layers, but it assumes that the LAI and SAI given are per layer. This is distinct from the BigLeaf approach, in which the LAI and SAI refer to the integrated area index with heigh.