Metadata

Metadata is an abstraction that represents data, but does not embody it. Unlike Oceananigans.Field, which points to an array occupying space in memory, Metadata only contains information about where files are stored, their origin, the grid they live on, and the date(s) they correspond to (if any).

ClimaOcean.DataWrangling.Metadata (Type)
Metadata(variable_name;
         dataset,
         dates = all_dates(dataset, variable_name),
         dir = default_download_directory(dataset),
         bounding_box = nothing,
         start_date = nothing,
         end_date = nothing)

Metadata holding information about a specific dataset.

Argument

  • variable_name: a symbol representing the name of the variable (for example, :temperature, :salinity, :u_velocity, etc.)

Keyword Arguments

  • dataset: Supported datasets are ETOPO2022(), ECCO2Monthly(), ECCO2Daily(), ECCO4Monthly(), EN4Monthly(), GLORYSDaily(), GLORYSMonthly(), RepeatYearJRA55(), and MultiYearJRA55().

  • dates: The dates of the dataset (Dates.AbstractDateTime or CFTime.AbstractCFDateTime). Note that dates can either be a range or a vector of dates, representing a time-series. For a single date, use Metadatum.

  • start_date: If dates = nothing, we can prescribe the first date of metadata as a date (Dates.AbstractDateTime or CFTime.AbstractCFDateTime). start_date should lie within the date range of the dataset. Default: nothing.

  • end_date: If dates = nothing, we can prescribe the last date of metadata as a date (Dates.AbstractDateTime or CFTime.AbstractCFDateTime). end_date should lie within the date range of the dataset. Default: nothing.

  • bounding_box: Specifies the bounds of the dataset. See BoundingBox.

  • dir: The directory where the dataset is stored.
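As a minimal sketch of the two forms the dates keyword accepts, the block below builds a range and a vector of dates using only the Dates standard library. The specific dates are illustrative; we assume, without checking the dataset, that monthly values on the first of each month in 2010 are available. The Metadata call is left as a comment so the block runs without ClimaOcean installed.

```julia
using Dates

# A range of monthly dates covering 2010.
monthly_range = DateTime(2010, 1, 1):Month(1):DateTime(2010, 12, 1)

# The same time-series written out as a vector of dates.
monthly_vector = collect(monthly_range)

# A vector does not need to be evenly spaced.
selected_dates = [DateTime(2010, 1, 1), DateTime(2010, 7, 1)]

# With ClimaOcean loaded, either form would be passed through the `dates` keyword,
# following the signature above (the dates themselves are an assumption):
# metadata = Metadata(:temperature; dataset = EN4Monthly(), dates = monthly_range)
```

A range is convenient for regularly spaced time-series, while a vector lets us pick out individual dates.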


When Metadata represents just one date, we call it a Metadatum. For example, consider the global temperature on January 1st, 2010 from the EN4 dataset,

using ClimaOcean, Dates

metadatum = Metadatum(:temperature;
                      dataset = EN4Monthly(),
                      date = Date(2010, 1, 1))
Metadatum{EN4Monthly, Dates.DateTime}:
├── name: temperature
├── dataset: EN4Monthly
├── dates: 2010-01-01 00:00:00
└── dir: /storage5/buildkite-agent/.julia-4898/scratchspaces/0376089a-ecfe-4b0e-a64f-9c555d74d754/EN4

To materialize the data described by a metadatum, we wrap it in an Oceananigans Field,

using Oceananigans

T_native = Field(metadatum)
360×173×42 Field{Oceananigans.Grids.Center, Oceananigans.Grids.Center, Oceananigans.Grids.Center} on Oceananigans.Grids.LatitudeLongitudeGrid on Oceananigans.Architectures.CPU
├── grid: 360×173×42 LatitudeLongitudeGrid{Float32, Oceananigans.Grids.Periodic, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded} on Oceananigans.Architectures.CPU with 3×3×3 halo and with precomputed metrics
├── boundary conditions: FieldBoundaryConditions
│   └── west: Periodic, east: Periodic, south: ZeroFlux, north: ZeroFlux, bottom: ZeroFlux, top: ZeroFlux, immersed: Nothing
└── data: 366×179×48 OffsetArray(::Array{Float32, 3}, -2:363, -2:176, -2:45) with eltype Float32 with indices -2:363×-2:176×-2:45
    └── max=30.7835, min=-3.99887, mean=6.16979

We can also interpolate the data onto a user-defined grid by using the function set!,

grid = LatitudeLongitudeGrid(size = (360, 90, 1),
                             latitude = (-90, 90),
                             longitude = (0, 360),
                             z = (0, 1))
T = CenterField(grid)
set!(T, metadatum)
360×90×1 Field{Oceananigans.Grids.Center, Oceananigans.Grids.Center, Oceananigans.Grids.Center} on Oceananigans.Grids.LatitudeLongitudeGrid on Oceananigans.Architectures.CPU
├── grid: 360×90×1 LatitudeLongitudeGrid{Float64, Oceananigans.Grids.Periodic, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded} on Oceananigans.Architectures.CPU with 3×3×1 halo and with precomputed metrics
├── boundary conditions: FieldBoundaryConditions
│   └── west: Periodic, east: Periodic, south: Value, north: Value, bottom: ZeroFlux, top: ZeroFlux, immersed: Nothing
└── data: 366×96×3 OffsetArray(::Array{Float64, 3}, -2:363, -2:93, 0:2) with eltype Float64 with indices -2:363×-2:93×0:2
    └── max=30.773, min=-3.99792, mean=12.3218

and then we can plot it:

using CairoMakie
heatmap(T)

This looks a bit odd, but less so if we download bathymetry (for which we also use Metadata under the hood) to create a temperature field with a land mask,

bottom_height = regrid_bathymetry(grid)
grid = ImmersedBoundaryGrid(grid, GridFittedBottom(bottom_height))
T = CenterField(grid)
set!(T, metadatum)
heatmap(T)

The key ingredients stored in a Metadata or Metadatum object are:

  • the variable name (for example :temperature or :u_velocity);
  • the dataset (such as EN4Monthly, ECCO2Daily, or GLORYSMonthly);
  • the temporal coverage: either a single timestamp (Metadatum) or a range/vector of dates (Metadata);
  • an optional BoundingBox describing regional subsets in longitude, latitude, or depth;
  • the on-disk directory where the dataset files are cached.

This bookkeeping lets downstream utilities (for example set! or FieldTimeSeries) request exactly the slices of data they need, and it keeps track of where those slices live so we do not redownload them unnecessarily.
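To illustrate the idea of keeping track of cached files, here is a toy sketch that uses only the Julia standard library. It is not ClimaOcean's implementation: ToyMetadatum, toy_filename, toy_filepath, needs_download, and the file-naming scheme are all hypothetical names introduced for this example.

```julia
using Dates

# Hypothetical stand-in for a metadatum: it describes a file, it does not hold data.
# None of these names belong to ClimaOcean.
struct ToyMetadatum
    name :: Symbol
    dataset :: String
    date :: DateTime
    dir :: String
end

# Assumed naming scheme, chosen for illustration only.
toy_filename(m::ToyMetadatum) =
    string(m.dataset, "_", m.name, "_", Dates.format(m.date, "yyyymmdd"), ".nc")

toy_filepath(m::ToyMetadatum) = joinpath(m.dir, toy_filename(m))

# A download is needed only when the described file is not cached yet.
needs_download(m::ToyMetadatum) = !isfile(toy_filepath(m))

cache = mktempdir()
m = ToyMetadatum(:temperature, "EN4Monthly", DateTime(2010, 1, 1), cache)

first_check = needs_download(m)    # nothing has been cached yet
touch(toy_filepath(m))             # pretend the file has been downloaded
second_check = needs_download(m)   # the cached file is now found
```

Because the description alone determines the file path, checking the cache never requires loading any data.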

Supported datasets

ClimaOcean currently ships connectors for the following data products:

Dataset            Supported Variables    Documentation Link
ETOPO2022          Supported variables    NOAA ETOPO 2022 overview
ECCO2Monthly       Supported variables    ECCO2 documentation
ECCO2Daily         Supported variables    ECCO2 documentation
ECCO4Monthly       Supported variables    ECCO V4r4 product guide
EN4Monthly         Supported variables    Met Office EN4 overview
GLORYSDaily        Supported variables    Copernicus GLORYS product page
GLORYSMonthly      Supported variables    Copernicus GLORYS product page
RepeatYearJRA55    Supported variables    JRA-55 Reanalysis
MultiYearJRA55     Supported variables    JRA-55 Reanalysis