FlatVar

To help with data preparation and with feeding data from OutputVars into other pipelines, ClimaAnalysis provides two functions: flatten, which flattens an OutputVar into a FlatVar, and unflatten, which reconstructs an OutputVar from a FlatVar. A FlatVar consists of data, a one-dimensional vector of the OutputVar's data, and metadata, which contains all the information needed to reconstruct the original OutputVar.

The problems that FlatVar aims to solve are:

  1. Flattening can be error prone if one does not keep track of the ordering of the dimensions,
  2. Flattening into a one-dimensional vector loses important information, such as what quantity is represented by the OutputVar and the ordering of the dimensions.

Flatten

To solve the first problem, ClimaAnalysis enforces the ordering of the dimensions when flattening to be ("longitude", "latitude", "pressure", "altitude", "time") and omits the dimensions that do not exist in the OutputVar.
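The idea of a fixed flattening order can be illustrated with plain Julia arrays. The sketch below is not ClimaAnalysis code: flatten_canonical is a hypothetical helper that takes the dimension names as an explicit argument and uses only Base functions. It shows that naively calling vec depends on the dimension order, whereas permuting into a fixed order first does not.

```julia
# Hypothetical helper (not part of ClimaAnalysis): permute the dimensions of
# a plain array into a fixed order, skipping absent names, then flatten it.
function flatten_canonical(data, dim_names;
        order = ("longitude", "latitude", "pressure", "altitude", "time"))
    names_vec = collect(dim_names)
    # Names from `order` that the array actually has, in canonical order
    present = filter(in(names_vec), collect(order))
    # Position of each retained name in the input array
    perm = [findfirst(==(name), names_vec) for name in present]
    return vec(permutedims(data, perm))
end

data = reshape(collect(1.0:60.0), 3, 5, 4)       # (time, lon, lat)
data_dims = ("time", "longitude", "latitude")

permuted = permutedims(data, (2, 3, 1))          # (lon, lat, time)
permuted_dims = ("longitude", "latitude", "time")

# Plain vec depends on the dimension order, so the two vectors differ
naive_equal = vec(data) == vec(permuted)

# With a fixed order, both arrays flatten to the same vector
canonical_equal =
    flatten_canonical(data, data_dims) == flatten_canonical(permuted, permuted_dims)
```

Here naive_equal is false and canonical_equal is true, which is the property that a fixed dimension order is meant to provide.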

The example below demonstrates that the order of the dimensions of an OutputVar does not matter when flattening the data. The order of dimensions of var is time, lon, and lat, and the order of dimensions of permuted_var is lon, lat, and time. When both OutputVars are flattened, the flattened data are the same. As such, using flatten removes the need to keep track of the order of the dimensions of an OutputVar.

julia> var
Attributes:
  short_name => pr
Dimension attributes:
  time:
    units => s
  lon:
    units => degrees_east
  lat:
    units => degrees_north
Data defined over:
  time with 3 elements (0.0 to 5.0)
  lon  with 5 elements (-60.0 to 60.0)
  lat  with 4 elements (-90.0 to 90.0)
julia> flat_var = ClimaAnalysis.flatten(var);
julia> permuted_var = permutedims(var, ("longitude", "latitude", "time"));
julia> flat_permuted_var = ClimaAnalysis.flatten(permuted_var);
julia> isequal(flat_permuted_var.data, flat_var.data)
true

The data can be accessed with flat_var.data and the metadata with flat_var.metadata. The metadata is discussed in more detail in the section below.

Furthermore, if ignore_nan = true, then NaNs are excluded when flattening the data.

julia> count(isnan, nan_var.data) # nan_var is the same as var, but contains three NaNs
3
julia> flat_nan_var = ClimaAnalysis.flatten(nan_var, ignore_nan = true); # default is true
julia> length(flat_nan_var.data)
57
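The arithmetic behind the length of 57 can be checked with a plain array of the same shape. This is only a sketch with Base functions, and the positions of the NaNs are chosen arbitrarily for illustration; it does not show how ClimaAnalysis handles NaNs internally.

```julia
# A 3 x 5 x 4 array has 60 elements, like the data of var above
data = reshape(collect(1.0:60.0), 3, 5, 4)

# Copy it and overwrite three arbitrarily chosen entries with NaN
nan_data = copy(data)
nan_data[[1, 20, 60]] .= NaN

# Flatten and drop the NaNs
flat_without_nan = filter(!isnan, vec(nan_data))

n_nan = count(isnan, nan_data)       # 3
n_kept = length(flat_without_nan)    # 60 - 3 = 57
```

Once the NaNs are dropped, the flattened vector alone no longer records where they were, which is one reason a FlatVar carries metadata alongside its data.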

Unflatten

To solve the second problem, the metadata field of a FlatVar stores the information needed to fully reconstruct the OutputVar. unflatten can be called either on a FlatVar or on metadata and data separately to reconstruct the OutputVar. Because metadata and data are decoupled in the second form, one needs to take care that the metadata and data passed in actually belong together.

See the example below of unflattening flat_var.

julia> unflatten_var = ClimaAnalysis.unflatten(flat_var)
Attributes:
  short_name => pr
Dimension attributes:
  time:
    units => s
  lon:
    units => degrees_east
  lat:
    units => degrees_north
Data defined over:
  time with 3 elements (0.0 to 5.0)
  lon  with 5 elements (-60.0 to 60.0)
  lat  with 4 elements (-90.0 to 90.0)
julia> isequal(unflatten_var.data, var.data)
true

One can also unflatten using the metadata and data of flat_var.

julia> unflatten_var = ClimaAnalysis.unflatten(flat_var.metadata, flat_var.data)
Attributes:
  short_name => pr
Dimension attributes:
  time:
    units => s
  lon:
    units => degrees_east
  lat:
    units => degrees_north
Data defined over:
  time with 3 elements (0.0 to 5.0)
  lon  with 5 elements (-60.0 to 60.0)
  lat  with 4 elements (-90.0 to 90.0)
julia> isequal(unflatten_var.data, var.data)
true
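To make the role of the metadata concrete, the round trip can be sketched with plain arrays. Everything here is an assumption for illustration: sketch_flatten, sketch_unflatten, and the fields of the metadata named tuple are hypothetical and do not reflect how FlatVar is implemented. The sketch stores the permutation and the permuted size, which is enough to undo the flattening of an array without NaNs, and it rejects data whose length does not match the metadata.

```julia
# Hypothetical sketch (not the ClimaAnalysis implementation): flatten a plain
# array in a fixed dimension order and keep what is needed to undo it.
function sketch_flatten(data, dim_names;
        order = ("longitude", "latitude", "pressure", "altitude", "time"))
    names_vec = collect(dim_names)
    present = filter(in(names_vec), collect(order))
    perm = [findfirst(==(name), names_vec) for name in present]
    permuted = permutedims(data, perm)
    metadata = (dim_names = names_vec, perm = perm, permuted_size = size(permuted))
    return metadata, vec(permuted)
end

function sketch_unflatten(metadata, flat)
    # Guard against pairing data with metadata from a different variable
    length(flat) == prod(metadata.permuted_size) ||
        error("length of data does not match the metadata")
    permuted = reshape(flat, metadata.permuted_size)
    # invperm undoes the permutation applied before flattening
    return permutedims(permuted, invperm(metadata.perm))
end

data = reshape(collect(1.0:60.0), 3, 5, 4)       # (time, lon, lat)
metadata, flat = sketch_flatten(data, ("time", "longitude", "latitude"))
restored = sketch_unflatten(metadata, flat)
roundtrip_ok = restored == data
```

A length check like the one above only catches mismatches in size. Metadata and data of the same length but from different variables would still be combined silently, so keeping the two together remains the caller's responsibility.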