Saving the diagnostics
Writers are needed to save the computed diagnostics.
ClimaDiagnostics comes with three writers:
- NetCDFWriter, to interpolate and save to NetCDF files;
- DictWriter, to save Fields to dictionaries in memory;
- HDF5Writer, to save Fields to HDF5 files.
(There is an additional DummyWriter that does nothing. It is mostly used internally for testing and debugging.)
Users are welcome to implement their own writers. A writer has to be a subtype of AbstractWriter and has to implement the interpolate_field! and write_field! methods. interpolate_field! can return nothing if no interpolation is needed.
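As a rough illustration of the interface described above, here is a minimal sketch of a custom writer. To keep it runnable on its own, it defines a local stand-in for AbstractWriter; in real use you would subtype ClimaDiagnostics.Writers.AbstractWriter and extend the methods from that module instead. The CountingWriter type and the use of a plain String as the diagnostic are hypothetical choices made for this example only.

```julia
# Local stand-in for ClimaDiagnostics.Writers.AbstractWriter (assumption: used
# here only so that the sketch runs without the package installed).
abstract type AbstractWriter end

# Hypothetical writer that counts how many times each diagnostic was written.
struct CountingWriter <: AbstractWriter
    counts::Dict{String, Int}
end
CountingWriter() = CountingWriter(Dict{String, Int}())

# No interpolation is needed for this writer, so the method returns `nothing`.
interpolate_field!(writer::CountingWriter, field, diagnostic, u, p, t) = nothing

# Record one write. In this sketch `diagnostic` is just a name (a String),
# not a real ScheduledDiagnostic.
function write_field!(writer::CountingWriter, field, diagnostic, u, p, t)
    writer.counts[diagnostic] = get(writer.counts, diagnostic, 0) + 1
    return nothing
end

writer = CountingWriter()
interpolate_field!(writer, [1.0, 2.0], "ta", nothing, nothing, 0.0)
write_field!(writer, [1.0, 2.0], "ta", nothing, nothing, 0.0)
write_field!(writer, [1.5, 2.5], "ta", nothing, nothing, 10.0)
```

The argument order (writer, field, diagnostic, u, p, t) mirrors the method signatures documented for the built-in writers.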
NetCDFWriter
The NetCDFWriter resamples the input Field to a rectangular grid and saves the output to a NetCDF file.
The NetCDFWriter relies on the Remappers module in ClimaCore to interpolate onto the rectangular grid. Horizontally, the interpolation is a Lagrange interpolation; vertically, it is linear. This interpolation is not conservative. Also note that the order of vertical interpolation drops to zero in the first and last vertical elements of each column.
To create a NetCDFWriter, you need to specify the source ClimaCore Space and the output directory where the files should be saved. By default, the NetCDFWriter appends to existing files and creates new ones if they do not exist. The NetCDFWriter does not overwrite existing data and will error out if existing data is inconsistent with the new data.
Optionally (and this is recommended), you can pass the start_date argument, which is saved as an easily accessible attribute of your NetCDF file.
NetCDFWriters take as one of the inputs the desired number of points along each of the dimensions. For the horizontal dimensions, points are sampled linearly. For the vertical dimension, the behavior can be customized by passing the z_sampling_method variable. When z_sampling_method = ClimaDiagnostics.Writers.LevelsMethod(), points are evaluated on the grid levels (and the provided number of points is ignored). When z_sampling_method = ClimaDiagnostics.Writers.FakePressureLevelsMethod(), points are sampled uniformly in pressure, assuming a simplified hydrostatic atmospheric model.
The output in the NetCDFWriter roughly follows the CF conventions.
Each ScheduledDiagnostic is output to a different file, with the name determined by calling output_short_name on the ScheduledDiagnostic. Typically, these files have names like ta_1d_max.nc, ha_20s_inst.nc, et cetera. The files define their dimensions (lon, lat, z, ...). Time is always the first dimension in any dataset.
Do not forget to close your writers to avoid file corruption!
To help reduce data loss, NetCDFWriter can force syncing, i.e. flushing the values to disk. Usually, NetCDF buffers writes to disk (because they are expensive), meaning values are not written immediately but are saved to disk in batches. This can result in data loss, so it is often useful to force NetCDF to write to disk (this is especially the case when working with GPUs). To do so, you can pass the sync_schedule function to the constructor of NetCDFWriter. When not nothing, sync_schedule is a callable that takes one argument (the integrator) and returns a bool. When the bool is true, the files that were modified since the last sync will be synced. For example, to force a sync every 1000 steps, you can pass the ClimaDiagnostics.Schedules.DivisorSchedule(1000) schedule. By default, on GPUs, we call sync at the end of every time step for those files that need to be synced.
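To make the shape of a sync_schedule concrete, the sketch below defines a callable that takes an integrator and returns a Bool. The EveryNSteps type is hypothetical and is not part of ClimaDiagnostics, and the assumption that the integrator exposes its step count as integrator.step is made for illustration; the mock integrators are plain NamedTuples.

```julia
# Hypothetical schedule: true whenever the step count is a multiple of `every`.
struct EveryNSteps
    every::Int
end

# Assumption: the integrator exposes its step count as `integrator.step`.
(schedule::EveryNSteps)(integrator) = integrator.step % schedule.every == 0

sync_every_1000 = EveryNSteps(1000)

# Mock integrators (NamedTuples) standing in for a real one.
sync_every_1000((step = 999,))   # false
sync_every_1000((step = 2000,))  # true
```

Any object with this call signature could be passed as sync_schedule, so a plain anonymous function would serve equally well.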
Variables are saved as datasets with attributes, where the attributes include long_name, standard_name, units...
Global attributes can be added to the NetCDF files via the global_attribs keyword argument for the NetCDFWriter. For example, you may want to specify the source and experiment attributes, which are the same across all NetCDF files produced for a single simulation.
writer = NetCDFWriter(
space, # 2D space with longitudes and latitudes
output_dir;
global_attribs = Dict("source" => "CliMA Coupler Simulation", "experiment" => "AMIP"),
)
The global attributes must be a subtype of AbstractDict{String, String}. If the order of the attributes matters, you may want to use an OrderedDict from OrderedCollections.jl.
The NetCDFWriter cannot save raw ClimaCore.Fields; only fields that are resampled onto Cartesian grids are supported. If you need such a capability, consider using the ClimaDiagnostics.Writers.HDF5Writer.
ClimaDiagnostics.Writers.NetCDFWriter — Method
NetCDFWriter(space, output_dir)
Save a ScheduledDiagnostic to a NetCDF file inside the output_dir of the simulation by performing a pointwise (non-conservative) remapping first.
Keyword arguments
- space: Space where the Fields are defined. This is the most general space across the Fields. In general, this is a 3D space. From a 3D space, you can take slices and write 2D Fields, but the opposite is not true.
- output_dir: The base folder where the files should be saved.
- num_points: How many points to use along the different dimensions to interpolate the fields. This is a tuple of integers, typically meaning Long-Lat-Z or X-Y-Z (the details depend on the configuration being simulated).
- z_sampling_method: Instance of an AbstractZSamplingMethod that determines how points in the vertical direction should be chosen. By default, the vertical points are sampled on the grid levels.
- compression_level: How much to compress the output NetCDF file (0 is no compression, 9 is maximum compression).
- sync_schedule: Schedule that determines when to call NetCDF.sync (to flush the output to disk). When NetCDF.sync is called, you can guarantee that the bits are written to disk (instead of being buffered in memory). A schedule is a boolean callable that takes the integrator as its single argument. sync_schedule can also be set to nothing, in which case the handling of buffered writes to disk is left to NetCDF.
- start_date: Date of the beginning of the simulation.
- horizontal_pts: A tuple of vectors of floats meaning Long-Lat or X-Y (the details depend on the configuration being simulated).
- global_attribs: Optional dictionary of global attributes to include in all NetCDF files produced by this NetCDFWriter. These attributes are useful for storing metadata such as source, creation_date, or frequency. Must be nothing or a subtype of AbstractDict{String, String}. Default is nothing.
ClimaDiagnostics.Writers.interpolate_field! — Method
interpolate_field!(writer::NetCDFWriter, field, diagnostic, u, p, t)
Perform interpolation of field and save output in preallocated areas of writer.
ClimaDiagnostics.Writers.write_field! — Method
write_field!(writer::NetCDFWriter, field::Fields.Field, diagnostic, u, p, t)
Save the resampled field produced by diagnostic as directed by the writer.
Only the root process does something here.
Note: It assumes that the field is already resampled.
The target file is determined by output_short_name(diagnostic). If the target file already exists, append to it. If not, create a new file. If the file does not contain dimensions, they are added the first time something is written.
Time handling:
- For reduced diagnostics: timestamps are stored at the START of the reduction period, with time_bnds showing [start, end] of the period. For the first write, t=0 is assumed; for subsequent writes, the end of the previous period is used.
- For instantaneous diagnostics: timestamps are stored at the current time, with time_bnds showing [previous_time, current_time].
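The two time-handling rules above can be summarized in a few lines. This is only a sketch of the stated convention, not the writer's actual implementation: the timestamp_and_bounds helper is hypothetical, and the caller is assumed to supply previous_time (0 for the first write of a reduction).

```julia
# Hypothetical helper returning (timestamp, (lower_bound, upper_bound)).
# Reductions are stamped at the start of the period; instantaneous
# diagnostics are stamped at the current time. The bounds are the same
# interval in both cases.
function timestamp_and_bounds(is_reduction, previous_time, current_time)
    bounds = (previous_time, current_time)
    timestamp = is_reduction ? previous_time : current_time
    return timestamp, bounds
end

# First write of a daily reduction (times in seconds): stamped at t = 0.
timestamp_and_bounds(true, 0.0, 86400.0)    # (0.0, (0.0, 86400.0))

# Instantaneous output: stamped at the current time.
timestamp_and_bounds(false, 3600.0, 7200.0) # (7200.0, (3600.0, 7200.0))
```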
Attributes are appended to the dataset:
- short_name
- long_name
- units
- comments
- start_date
ClimaDiagnostics.Writers.sync — Method
sync(writer::NetCDFWriter)
Call NCDatasets.sync on all the files in the writer.unsynced_datasets list. NCDatasets.sync ensures that the values are written to file.
Base.close — Method
close(writer::NetCDFWriter)
Close all the open files in writer.
Sampling methods for the vertical direction:
ClimaDiagnostics.Writers.AbstractZSamplingMethod — Type
AbstractZSamplingMethod
The AbstractZSamplingMethod defines how points along the vertical axis should be sampled.
In other words, if a column is defined between 0 and 100 and the target number of points is 50, how should those 50 points be chosen?
Available methods are:
- LevelsMethod: just use the grid levels
- FakePressureLevelsMethod: linearly spaced in (very) approximate atmospheric pressure levels
ClimaDiagnostics.Writers.LevelsMethod — Type
LevelsMethod
Do not perform interpolation on z, use directly the grid levels instead.
ClimaDiagnostics.Writers.FakePressureLevelsMethod — Type
FakePressureLevelsMethod
Linearly sample points from z_min to z_max in pressure levels assuming a very simplified hydrostatic balance model.
Pressure is approximated with
p ~ p₀ exp(-z/H)
H is assumed to be 7000 m, which is a reasonable scale height for Earth's atmosphere.
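To see what sampling linearly in pressure means for the resulting heights, here is a minimal sketch based on the formula above. The function name fake_pressure_levels and the default p0 are assumptions made for this example (p0 cancels out of the result); this is not the package's actual implementation.

```julia
# Sample `num_points` heights between z_min and z_max, uniformly spaced in the
# approximate pressure p(z) = p0 * exp(-z / H).
# Assumptions: hypothetical function name; p0 = 1.0e5 is arbitrary and cancels.
function fake_pressure_levels(z_min, z_max, num_points; H = 7000.0, p0 = 1.0e5)
    p_bottom = p0 * exp(-z_min / H)
    p_top = p0 * exp(-z_max / H)
    # Uniformly spaced pressures, from the bottom of the column to the top
    pressures = range(p_bottom, p_top; length = num_points)
    # Invert p = p0 * exp(-z / H) to recover the corresponding heights
    return [-H * log(p / p0) for p in pressures]
end

zs = fake_pressure_levels(0.0, 30_000.0, 10)
```

Because pressure decays exponentially with height, the resulting levels are closely spaced near the surface and increasingly far apart towards the top of the column.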
DictWriter
The DictWriter is an in-memory writer that is particularly useful for interactive work and debugging.
ClimaDiagnostics.Writers.DictWriter — Method
DictWriter()
A simple in-memory writer. Useful for interactive work and debugging.
You can retrieve values using the typical dictionary interface and using as keys the names of the stored diagnostics.
Example
Assume we have a diagnostic with short output name "mydiag" stored in a DictWriter called dictW. Then dictW["mydiag"] is a dictionary whose keys are the times at which the data was saved and whose values are the diagnostic output (typically a ClimaCore Field).
ClimaDiagnostics.Writers.write_field! — Method
write_field!(writer::DictWriter, field, diagnostic, u, p, t)Add an entry to the writer at time t for the current diagnostic with value field.
DictWriter is backed by a dictionary. Most typically, the keys of this dictionary are strings: the output_short_name of each diagnostic. If the output_short_name is not available, the diagnostic itself is used as the key. Each value of this dictionary is another dictionary that maps the time t to the field at that time.
DictWriter implements a basic read-only dictionary interface to access the times and values.
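The nested layout described above can be pictured with plain Julia dictionaries. This sketch does not use DictWriter itself: the storage variable and the record! helper are hypothetical, plain vectors stand in for ClimaCore Fields, and copying the value on insertion is a choice made for this example.

```julia
# Outer keys: diagnostic short names. Inner keys: times. Values: saved output.
storage = Dict{String, Dict{Float64, Vector{Float64}}}()

# Hypothetical helper mimicking "add an entry at time t for the current diagnostic".
function record!(storage, name, t, value)
    entries = get!(storage, name) do
        Dict{Float64, Vector{Float64}}()
    end
    # Copy so that later in-place updates of `value` do not alter the stored entry
    entries[t] = copy(value)
    return nothing
end

record!(storage, "mydiag", 0.0, [1.0, 2.0])
record!(storage, "mydiag", 10.0, [3.0, 4.0])

times = sort(collect(keys(storage["mydiag"])))  # [0.0, 10.0]
latest = storage["mydiag"][last(times)]         # [3.0, 4.0]
```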
HDF5Writer
The HDF5Writer writes the Field directly to an HDF5 file in such a way that it can be later read and imported using the InputOutput module in ClimaCore.
The HDF5Writer writes one file per variable per timestep. The name of the file is determined by the output_short_name field of the ScheduledDiagnostic that is being output.
Note: The HDF5Writer in ClimaDiagnostics is currently the least developed one. If you need this writer, we can expand it.
ClimaDiagnostics.Writers.HDF5Writer — Type
HDF5Writer(output_dir)
Save a ScheduledDiagnostic to an HDF5 file inside the output_dir.
ClimaDiagnostics.Writers.write_field! — Method
write_field!(writer::HDF5Writer, field, diagnostic, u, p, t)
Save a ScheduledDiagnostic to an HDF5 file inside the output_dir.
The name of the file is determined by the output_short_name of the output ScheduledDiagnostic. New files are created for each timestep.
Fields can be read back using the InputOutput module in ClimaCore.
Base.close — Method
close(writer::HDF5Writer)
Close all the files open in writer. (Currently no-op.)