Metadata
Metadata is an abstraction that represents data, but does not embody it. Unlike Oceananigans.Field, which points to an array occupying space in memory, Metadata only contains information about where files are stored, their origin, the grid they live on, and the date(s) they correspond to (if any).
ClimaOcean.DataWrangling.Metadata — Type
Metadata(variable_name;
dataset,
dates = all_dates(dataset, variable_name),
dir = default_download_directory(dataset),
bounding_box = nothing,
start_date = nothing,
end_date = nothing)

Metadata holding information about a specific dataset.
Argument
variable_name: a symbol representing the name of the variable (for example, :temperature, :salinity, :u_velocity, etc.)
Keyword Arguments
- dataset: Supported datasets are ETOPO2022(), ECCO2Monthly(), ECCO2Daily(), ECCO4Monthly(), EN4Monthly(), GLORYSDaily(), GLORYSMonthly(), RepeatYearJRA55(), and MultiYearJRA55().
- dates: The dates of the dataset (Dates.AbstractDateTime or CFTime.AbstractCFDateTime). Note that dates can either be a range or a vector of dates, representing a time-series. For a single date, use Metadatum.
- start_date: If dates = nothing, we can prescribe the first date of metadata as a date (Dates.AbstractDateTime or CFTime.AbstractCFDateTime). start_date should lie within the date range of the dataset. Default: nothing.
- end_date: If dates = nothing, we can prescribe the last date of metadata as a date (Dates.AbstractDateTime or CFTime.AbstractCFDateTime). end_date should lie within the date range of the dataset. Default: nothing.
- bounding_box: Specifies the bounds of the dataset. See BoundingBox.
- dir: The directory where the dataset is stored.
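As a minimal sketch of the keyword arguments above, a multi-date Metadata can be built either from an explicit range of dates or by bracketing the dataset's own dates with start_date and end_date. The dates chosen here are illustrative. The second form is an assumption: it relies on the default dates (all dates in the dataset) being narrowed to the requested window. Building the object should only record what to fetch, not load any data.

```julia
using ClimaOcean, Dates

# Twelve monthly timestamps covering 2010 (illustrative choice of dates).
# Per the docstring, `dates` may be a range or a vector.
dates = DateTime(2010, 1, 1):Month(1):DateTime(2010, 12, 1)

# Describe a time series of EN4 temperature on those dates.
metadata = Metadata(:temperature; dataset = EN4Monthly(), dates)

# Assumed alternative: leave `dates` at its default and bracket it
# with `start_date` and `end_date` instead.
metadata_window = Metadata(:temperature;
                           dataset = EN4Monthly(),
                           start_date = DateTime(2010, 1, 1),
                           end_date = DateTime(2010, 12, 1))
```

Either object can then be handed to utilities that materialize the data, in the same way a single Metadatum is used in the example that follows.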
When Metadata represents just one date, we call it Metadatum. For example, consider global temperature from January 1st, 2010 from the EN4 dataset,
using ClimaOcean, Dates
metadatum = Metadatum(:temperature;
dataset = EN4Monthly(),
date = Date(2010, 1, 1))

Metadatum{EN4Monthly, Dates.DateTime}:
├── name: temperature
├── dataset: EN4Monthly
├── dates: 2010-01-01 00:00:00
└── dir: /storage5/buildkite-agent/.julia-4898/scratchspaces/0376089a-ecfe-4b0e-a64f-9c555d74d754/EN4

To materialize the data described by a metadatum, we wrap it in an Oceananigans Field,
using Oceananigans
T_native = Field(metadatum)

360×173×42 Field{Oceananigans.Grids.Center, Oceananigans.Grids.Center, Oceananigans.Grids.Center} on Oceananigans.Grids.LatitudeLongitudeGrid on Oceananigans.Architectures.CPU
├── grid: 360×173×42 LatitudeLongitudeGrid{Float32, Oceananigans.Grids.Periodic, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded} on Oceananigans.Architectures.CPU with 3×3×3 halo and with precomputed metrics
├── boundary conditions: FieldBoundaryConditions
│ └── west: Periodic, east: Periodic, south: ZeroFlux, north: ZeroFlux, bottom: ZeroFlux, top: ZeroFlux, immersed: Nothing
└── data: 366×179×48 OffsetArray(::Array{Float32, 3}, -2:363, -2:176, -2:45) with eltype Float32 with indices -2:363×-2:176×-2:45
└── max=30.7835, min=-3.99887, mean=6.16979

We can also interpolate the data on a user-defined grid by using the function set!,
grid = LatitudeLongitudeGrid(size = (360, 90, 1),
latitude = (-90, 90),
longitude = (0, 360),
z = (0, 1))
T = CenterField(grid)
set!(T, metadatum)

360×90×1 Field{Oceananigans.Grids.Center, Oceananigans.Grids.Center, Oceananigans.Grids.Center} on Oceananigans.Grids.LatitudeLongitudeGrid on Oceananigans.Architectures.CPU
├── grid: 360×90×1 LatitudeLongitudeGrid{Float64, Oceananigans.Grids.Periodic, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded} on Oceananigans.Architectures.CPU with 3×3×1 halo and with precomputed metrics
├── boundary conditions: FieldBoundaryConditions
│ └── west: Periodic, east: Periodic, south: Value, north: Value, bottom: ZeroFlux, top: ZeroFlux, immersed: Nothing
└── data: 366×96×3 OffsetArray(::Array{Float64, 3}, -2:363, -2:93, 0:2) with eltype Float64 with indices -2:363×-2:93×0:2
└── max=30.773, min=-3.99792, mean=12.3218

and then we can plot it:
using CairoMakie
heatmap(T)
This looks a bit odd, but less so if we download bathymetry (for which we also use Metadata under the hood) to create a temperature field with a land mask,
bottom_height = regrid_bathymetry(grid)
grid = ImmersedBoundaryGrid(grid, GridFittedBottom(bottom_height))
T = CenterField(grid)
set!(T, metadatum)
heatmap(T)
The key ingredients stored in a Metadata or Metadatum object are:
- the variable name (for example :temperature or :u_velocity);
- the dataset (such as EN4Monthly, ECCO2Daily, or GLORYSMonthly);
- the temporal coverage: either a single timestamp (Metadatum) or a range/vector of dates (Metadata);
- an optional BoundingBox describing regional subsets in longitude, latitude, or depth;
- the on-disk directory where the dataset is cached.
This bookkeeping lets downstream utilities (for example set! or FieldTimeSeries) request exactly the slices of data they need, and it keeps track of where those slices live so we do not redownload them unnecessarily.
Supported datasets
ClimaOcean currently ships connectors for the following data products:
| Dataset | Supported Variables | Documentation Link |
|---|---|---|
| ETOPO2022 | Supported variables | NOAA ETOPO 2022 overview |
| ECCO2Monthly | Supported variables | ECCO2 documentation |
| ECCO2Daily | Supported variables | ECCO2 documentation |
| ECCO4Monthly | Supported variables | ECCO V4r4 product guide |
| EN4Monthly | Supported variables | Met Office EN4 overview |
| GLORYSDaily | Supported variables | Copernicus GLORYS product page |
| GLORYSMonthly | Supported variables | Copernicus GLORYS product page |
| RepeatYearJRA55 | Supported variables | JRA-55 Reanalysis |
| MultiYearJRA55 | Supported variables | JRA-55 Reanalysis |