FlatVar
To help with data preparation and with feeding data from OutputVars into other pipelines, ClimaAnalysis provides two functions: flatten, which flattens an OutputVar into a FlatVar, and unflatten, which reconstructs an OutputVar from a FlatVar. A FlatVar consists of a data field, which is a one-dimensional vector of the OutputVar's data, and a metadata field, which contains all the information needed to reconstruct the original OutputVar.
FlatVar aims to solve two problems:
- Flattening can be error prone if one does not keep track of the ordering of the dimensions.
- Flattening to a one-dimensional vector loses important information, such as what quantity is represented by the OutputVar and the ordering of the dimensions.
Flatten
To solve the first problem, ClimaAnalysis enforces a fixed ordering of the dimensions when flattening, ("longitude", "latitude", "pressure", "altitude", "time"), and omits the dimensions that do not exist in the OutputVar.
The example below demonstrates that the order of the dimensions of an OutputVar does not matter when flattening the data. The order of dimensions of var is time, lon, and lat, and the order of dimensions of permuted_var is lon, lat, and time. When both OutputVars are flattened, the flattened data are the same. As such, using flatten removes the need to keep track of the order of the dimensions of an OutputVar.
julia> var
Attributes:
  short_name => pr
Dimension attributes:
  time:
    units => s
  lon:
    units => degrees_east
  lat:
    units => degrees_north
Data defined over:
  time with 3 elements (0.0 to 5.0)
  lon with 5 elements (-60.0 to 60.0)
  lat with 4 elements (-90.0 to 90.0)
julia> flat_var = ClimaAnalysis.flatten(var);
julia> permuted_var = permutedims(var, ("longitude", "latitude", "time"));
julia> flat_permuted_var = ClimaAnalysis.flatten(permuted_var);
julia> isequal(flat_permuted_var.data, flat_var.data)
true
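The idea behind the fixed ordering can be sketched with plain Julia arrays. This is only an illustration under stated assumptions, not how ClimaAnalysis implements flatten: canonical_flatten and CANONICAL_ORDER are hypothetical names, and the sketch assumes every dimension name already appears in the canonical list.

```julia
# Hypothetical helper, not part of ClimaAnalysis: flatten an array in a fixed
# canonical dimension order, given the name of each of its dimensions.
const CANONICAL_ORDER = ("longitude", "latitude", "pressure", "altitude", "time")

function canonical_flatten(data::AbstractArray, dim_names)
    names = collect(dim_names)
    # Keep only the canonical names that are present, in canonical order
    present = filter(in(names), collect(CANONICAL_ORDER))
    # Position of each of those names in the array's current dimension order
    perm = [findfirst(==(name), names) for name in present]
    return vec(permutedims(data, perm))
end

# The same values stored in two different dimension orders
a = reshape(collect(1.0:60.0), 3, 5, 4)   # (time, longitude, latitude)
b = permutedims(a, (2, 3, 1))             # (longitude, latitude, time)

flat_a = canonical_flatten(a, ("time", "longitude", "latitude"))
flat_b = canonical_flatten(b, ("longitude", "latitude", "time"))

same_after_canonical = flat_a == flat_b   # true: the ordering is enforced
same_with_plain_vec = vec(a) == vec(b)    # false: vec follows storage order
```

A plain vec follows whatever order the array happens to be stored in, which is why two arrays holding the same values can flatten differently unless the order is fixed first.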
The data can be accessed with flat_var.data and the metadata with flat_var.metadata. More information about the metadata is given in the section below.
Furthermore, if ignore_nan = true, then NaNs are excluded when flattening the data.
julia> count(isnan, nan_var.data) # nan_var is the same as var, but contains three NaNs
3
julia> flat_nan_var = ClimaAnalysis.flatten(nan_var, ignore_nan = true); # default is true
julia> length(flat_nan_var.data)
57
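The count above follows from 3 × 5 × 4 = 60 values minus 3 NaNs. A minimal sketch with plain arrays shows the same arithmetic, along with one way the dropped positions could be remembered so the array can be rebuilt later. The mask is an assumption made for illustration; the source does not say how ClimaAnalysis records the NaN positions.

```julia
data = reshape(collect(1.0:60.0), 5, 4, 3)   # 5 * 4 * 3 = 60 values
data[1, 1, 1] = NaN
data[2, 3, 1] = NaN
data[5, 4, 3] = NaN

flat_all = vec(data)
flat_no_nan = filter(!isnan, flat_all)       # drop the NaNs

n_all = length(flat_all)                     # 60
n_kept = length(flat_no_nan)                 # 57

# Illustrative assumption: keep a Boolean mask of the non-NaN positions so
# that the original shape and the NaN locations can be restored.
mask = .!isnan.(data)
restored = fill(NaN, size(data))
restored[mask] = flat_no_nan

round_trip_ok = isequal(restored, data)      # isequal treats NaN as equal to NaN
```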
Unflatten
To solve the second problem, FlatVar has a metadata field that stores the information needed to fully reconstruct the OutputVar. unflatten can be called either on a FlatVar or on metadata and data separately to reconstruct the OutputVar. This decoupling of metadata and data also means that one needs to be careful that the correct metadata and data are used together to reconstruct the OutputVar.
See the example below of unflattening flat_var.
julia> unflatten_var = ClimaAnalysis.unflatten(flat_var)
Attributes:
  short_name => pr
Dimension attributes:
  time:
    units => s
  lon:
    units => degrees_east
  lat:
    units => degrees_north
Data defined over:
  time with 3 elements (0.0 to 5.0)
  lon with 5 elements (-60.0 to 60.0)
  lat with 4 elements (-90.0 to 90.0)
julia> isequal(unflatten_var.data, var.data)
true
One can also unflatten using the metadata and data of flat_var.
julia> unflatten_var = ClimaAnalysis.unflatten(flat_var.metadata, flat_var.data)
Attributes:
  short_name => pr
Dimension attributes:
  time:
    units => s
  lon:
    units => degrees_east
  lat:
    units => degrees_north
Data defined over:
  time with 3 elements (0.0 to 5.0)
  lon with 5 elements (-60.0 to 60.0)
  lat with 4 elements (-90.0 to 90.0)
julia> isequal(unflatten_var.data, var.data)
true
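To see why the metadata has to travel with the data, the round trip can be sketched with plain arrays. The meta named tuple below is a hypothetical stand-in for the metadata field, whose actual contents are not shown here; it only records the original dimension order and sizes.

```julia
# Hypothetical stand-in for the metadata: original dimension order and sizes
meta = (dim_names = ("time", "longitude", "latitude"), dims = (3, 5, 4))
canonical = ("longitude", "latitude", "time")

original = reshape(collect(1.0:60.0), meta.dims)

# Flatten in canonical order
perm = [findfirst(==(name), collect(meta.dim_names)) for name in canonical]
flat = vec(permutedims(original, perm))

# Unflatten: reshape to the canonical shape, then undo the permutation
canonical_shape = ntuple(i -> meta.dims[perm[i]], length(perm))
rebuilt = permutedims(reshape(flat, canonical_shape), invperm(perm))

rebuilt_matches = rebuilt == original
```

The vector flat carries no shape or ordering information of its own, so pairing it with metadata from a different variable could either fail or produce an array of the right size with the values in the wrong places.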