Creating YAXArrays and Datasets

Here, we use YAXArray when the variables share dimensions and Dataset otherwise.

Creating a YAXArray

julia
using YAXArrays
using DimensionalData: DimensionalData as DD
using DimensionalData
julia
julia> a = YAXArray(rand(10, 20, 5))
╭─────────────────────────────╮
10×20×5 YAXArray{Float64,3}
├─────────────────────────────┴────────────────────────────────── dims ┐
Dim_1 Sampled{Int64} Base.OneTo(10) ForwardOrdered Regular Points,
Dim_2 Sampled{Int64} Base.OneTo(20) ForwardOrdered Regular Points,
Dim_3 Sampled{Int64} Base.OneTo(5) ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────────────────── file size ┤
  file size: 7.81 KB
└──────────────────────────────────────────────────────────────────────┘

If no names are defined, default ones are used, i.e. Dim_1, Dim_2, and so on.

Get data from each Dimension with

julia
a.Dim_1
Dim_1 Sampled{Int64} ForwardOrdered Regular DimensionalData.Dimensions.Lookups.Points
wrapping: Base.OneTo(10)

or with

julia
getproperty(a, :Dim_1)
Dim_1 Sampled{Int64} ForwardOrdered Regular DimensionalData.Dimensions.Lookups.Points
wrapping: Base.OneTo(10)

or, even better, with the lookup function from DimensionalData (DD)

julia
lookup(a, :Dim_1)
Sampled{Int64} ForwardOrdered Regular DimensionalData.Dimensions.Lookups.Points
wrapping: Base.OneTo(10)
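
As a minimal sketch of how these accessors fit together, the snippet below collects the dimension names, the plain coordinate values of one dimension, and the length of every dimension. The variable names (dim_names, coords, lengths) are illustrative only, and the calls are qualified with DD to make clear that they come from DimensionalData.

```julia
using YAXArrays
using DimensionalData: DimensionalData as DD

a = YAXArray(rand(10, 20, 5))

# Names of all dimensions, in storage order
dim_names = DD.name(DD.dims(a))

# Coordinate values of one dimension as a plain Vector
coords = collect(DD.lookup(a, :Dim_1))

# Length of each dimension; this should agree with size(a)
lengths = map(length, DD.dims(a))
```

Converting a lookup with collect is convenient when the coordinates are passed on to code that expects an ordinary Vector.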

Creating a YAXArray with named axes

The two most common kinds of axes are range axes and categorical axes. Here, we combine them to create time, lon and lat axes and a categorical axis for two variables.

Axis definitions

julia
julia> using Dates

julia> axlist = (
           Dim{:time}(Date("2022-01-01"):Day(1):Date("2022-01-30")),
           Dim{:lon}(range(1, 10, length=10)),
           Dim{:lat}(range(1, 5, length=15)),
           Dim{:Variable}(["var1", "var2"])
           )
time     Date("2022-01-01"):Dates.Day(1):Date("2022-01-30"),
lon      1.0:1.0:10.0,
lat      1.0:0.2857142857142857:5.0,
Variable ["var1", "var2"]

And the corresponding data

julia
data = rand(30, 10, 15, 2);

then, the YAXArray is

julia
julia> ds = YAXArray(axlist, data)
╭────────────────────────────────╮
30×10×15×2 YAXArray{Float64,4}
├────────────────────────────────┴─────────────────────────────────────── dims ┐
time     Sampled{Date} Date("2022-01-01"):Dates.Day(1):Date("2022-01-30") ForwardOrdered Regular Points,
lon      Sampled{Float64} 1.0:1.0:10.0 ForwardOrdered Regular Points,
lat      Sampled{Float64} 1.0:0.2857142857142857:5.0 ForwardOrdered Regular Points,
Variable Categorical{String} ["var1", "var2"] ForwardOrdered
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 70.31 KB
└──────────────────────────────────────────────────────────────────────────────┘
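
The data array needs one dimension per axis, in the same order and with matching lengths. The sketch below checks this with Base functions only and reproduces the size shown in the display, assuming that the reported figure is the uncompressed in-memory size of the data. The names time_values, lon_values, lat_values and variables are illustrative.

```julia
using Dates

# Coordinate values matching the axis definitions above
time_values = Date("2022-01-01"):Day(1):Date("2022-01-30")
lon_values = range(1, 10, length=10)
lat_values = range(1, 5, length=15)
variables = ["var1", "var2"]

data = rand(30, 10, 15, 2)

# One axis per array dimension, with lengths in the same order
expected_size = (length(time_values), length(lon_values),
                 length(lat_values), length(variables))
sizes_match = size(data) == expected_size

# 30 * 10 * 15 * 2 Float64 values at 8 bytes each
nbytes = sizeof(data)
size_kb = nbytes / 1024
```

Under that assumption, 72000 bytes divided by 1024 gives the 70.31 KB shown for this array.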

Select variables

julia
julia> ds[Variable = At("var1"), lon = DD.Between(1,2.1)]
╭─────────────────────────────╮
30×2×15 YAXArray{Float64,3}
├─────────────────────────────┴────────────────────────────────────────── dims ┐
time Sampled{Date} Date("2022-01-01"):Dates.Day(1):Date("2022-01-30") ForwardOrdered Regular Points,
lon  Sampled{Float64} 1.0:1.0:2.0 ForwardOrdered Regular Points,
lat  Sampled{Float64} 1.0:0.2857142857142857:5.0 ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 7.03 KB
└──────────────────────────────────────────────────────────────────────────────┘

Info

Please note that selecting elements in YAXArrays is done via the DimensionalData.jl syntax. For more information, check out the DimensionalData.jl docs.

julia
julia> subset = ds[
           time = DD.Between( Date("2022-01-01"),  Date("2022-01-10")),
           lon=DD.Between(1,2),
           Variable = At("var2")
           ]
╭─────────────────────────────╮
10×2×15 YAXArray{Float64,3}
├─────────────────────────────┴────────────────────────────────────────── dims ┐
time Sampled{Date} Date("2022-01-01"):Dates.Day(1):Date("2022-01-10") ForwardOrdered Regular Points,
lon  Sampled{Float64} 1.0:1.0:2.0 ForwardOrdered Regular Points,
lat  Sampled{Float64} 1.0:0.2857142857142857:5.0 ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 2.34 KB
└──────────────────────────────────────────────────────────────────────────────┘
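
For comparison, here is a hedged sketch of a few other DimensionalData selectors applied to the same kind of array. It assumes that At, Near and the interval operator .. are exported by the installed DimensionalData version; the result names are illustrative.

```julia
using YAXArrays
using DimensionalData
using Dates

axlist = (
    Dim{:time}(Date("2022-01-01"):Day(1):Date("2022-01-30")),
    Dim{:lon}(range(1, 10, length=10)),
    Dim{:lat}(range(1, 5, length=15)),
    Dim{:Variable}(["var1", "var2"]),
)
ds = YAXArray(axlist, rand(30, 10, 15, 2))

# At picks an exact coordinate value and drops that dimension
var1 = ds[Variable = At("var1")]

# Near picks the closest coordinate and also drops the dimension
near_lon = ds[lon = Near(2.2)]

# a .. b selects a closed interval and keeps the dimension
first_days = ds[time = Date("2022-01-01") .. Date("2022-01-10")]
```

Selectors that resolve to a single coordinate (At, Near) remove the dimension from the result, while interval selections keep it with a shorter length.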

Properties / Attributes

You might also want to attach properties to your YAXArray. This can be done with a Dict, namely

julia
props = Dict(
    "time" => "days",
    "lon" => "longitude",
    "lat" => "latitude",
    "var1" => "first variable",
    "var2" => "second variable",
);

Then the YAXArray with properties is assembled with

julia
julia> ds = YAXArray(axlist, data, props)
╭────────────────────────────────╮
30×10×15×2 YAXArray{Float64,4}
├────────────────────────────────┴─────────────────────────────────────── dims ┐
time     Sampled{Date} Date("2022-01-01"):Dates.Day(1):Date("2022-01-30") ForwardOrdered Regular Points,
lon      Sampled{Float64} 1.0:1.0:10.0 ForwardOrdered Regular Points,
lat      Sampled{Float64} 1.0:0.2857142857142857:5.0 ForwardOrdered Regular Points,
Variable Categorical{String} ["var1", "var2"] ForwardOrdered
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, String} with 5 entries:
  "lat"  => "latitude"
  "var1" => "first variable"
  "time" => "days"
  "var2" => "second variable"
  "lon"  => "longitude"
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 70.31 KB
└──────────────────────────────────────────────────────────────────────────────┘

Access these properties with

julia
ds.properties
Dict{String, String} with 5 entries:
  "lat"  => "latitude"
  "var1" => "first variable"
  "time" => "days"
  "var2" => "second variable"
  "lon"  => "longitude"

Note that these properties are shared by both variables, var1 and var2. In other words, they are global properties of your YAXArray. However, in most cases you will want to pass properties for each variable; here we will do this via Datasets.
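
Since the properties are stored in an ordinary Dict, individual entries can be read with the usual Base functions. A minimal sketch, using a smaller array than above and illustrative variable names:

```julia
using YAXArrays
using DimensionalData

axlist = (
    Dim{:lon}(range(1, 10, length=10)),
    Dim{:Variable}(["var1", "var2"]),
)
props = Dict(
    "lon" => "longitude",
    "var1" => "first variable",
    "var2" => "second variable",
)
ds = YAXArray(axlist, rand(10, 2), props)

# Look up a single entry
lon_label = ds.properties["lon"]

# Fall back to a default when a key is missing
units = get(ds.properties, "units", "unknown")

# Check for a key before using it
has_var1 = haskey(ds.properties, "var1")
```

Using get with a default avoids a KeyError when a property is optional.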

Creating a Dataset

Let's first define some range axes

julia
julia> axs = (
           Dim{:lon}(range(0,1, length=10)),
           Dim{:lat}(range(0,1, length=5)),
       )
lon 0.0:0.1111111111111111:1.0,
lat 0.0:0.25:1.0

And two toy random YAXArrays to assemble our dataset

julia
julia> t2m = YAXArray(axs, rand(10,5), Dict("units" => "K", "reference" => "your references"))
╭──────────────────────────╮
10×5 YAXArray{Float64,2}
├──────────────────────────┴───────────────────────────────────────────── dims ┐
lon Sampled{Float64} 0.0:0.1111111111111111:1.0 ForwardOrdered Regular Points,
lat Sampled{Float64} 0.0:0.25:1.0 ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, String} with 2 entries:
  "units"     => "K"
  "reference" => "your references"
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 400.0 bytes
└──────────────────────────────────────────────────────────────────────────────┘
julia
julia> prec = YAXArray(axs, rand(10,5), Dict("units" => "mm", "reference" => "your references"))
╭──────────────────────────╮
10×5 YAXArray{Float64,2}
├──────────────────────────┴───────────────────────────────────────────── dims ┐
lon Sampled{Float64} 0.0:0.1111111111111111:1.0 ForwardOrdered Regular Points,
lat Sampled{Float64} 0.0:0.25:1.0 ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, String} with 2 entries:
  "units"     => "mm"
  "reference" => "your references"
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 400.0 bytes
└──────────────────────────────────────────────────────────────────────────────┘

Then the Dataset is assembled as

julia
julia> ds = Dataset(t2m=t2m, prec= prec, num = YAXArray(rand(10)),
           properties = Dict("space"=>"lon/lat", "reference" => "your global references"))
YAXArray Dataset
Shared Axes:
()
Variables:

t2m
lon Sampled{Float64} 0.0:0.1111111111111111:1.0 ForwardOrdered Regular Points,
lat Sampled{Float64} 0.0:0.25:1.0 ForwardOrdered Regular Points
prec
lon Sampled{Float64} 0.0:0.1111111111111111:1.0 ForwardOrdered Regular Points,
lat Sampled{Float64} 0.0:0.25:1.0 ForwardOrdered Regular Points
num
Dim_1 Sampled{Int64} Base.OneTo(10) ForwardOrdered Regular Points
Properties: Dict("reference" => "your global references", "space" => "lon/lat")

TIP

Note that the YAXArrays used do not necessarily share the same dimensions. Hence, a Dataset is more versatile than a plain YAXArray.
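
As a small sketch of what this means in practice, each variable of a Dataset can be retrieved by name and keeps its own dimensions and properties. The names temperature, t2m_size and num_size are illustrative, and the properties are shortened compared to the example above.

```julia
using YAXArrays
using DimensionalData

axs = (
    Dim{:lon}(range(0, 1, length=10)),
    Dim{:lat}(range(0, 1, length=5)),
)
t2m = YAXArray(axs, rand(10, 5), Dict("units" => "K"))
prec = YAXArray(axs, rand(10, 5), Dict("units" => "mm"))
ds = Dataset(t2m=t2m, prec=prec, num=YAXArray(rand(10)))

# Each variable is available as a property of the Dataset
temperature = ds.t2m
temperature_units = temperature.properties["units"]

# Variables keep their own dimensions
t2m_size = size(ds.t2m)
num_size = size(ds.num)
```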

Selected Variables in a Data Cube

Variables that share dimensions can be collected into a data cube with

julia
julia> c = Cube(ds[["t2m", "prec"]])
╭────────────────────────────╮
10×5×2 YAXArray{Float64,3}
├────────────────────────────┴─────────────────────────────────────────── dims ┐
lon      Sampled{Float64} 0.0:0.1111111111111111:1.0 ForwardOrdered Regular Points,
lat      Sampled{Float64} 0.0:0.25:1.0 ForwardOrdered Regular Points,
Variable Categorical{String} ["t2m", "prec"] ReverseOrdered
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, String} with 2 entries:
  "units"     => "mm"
  "reference" => "your references"
├─────────────────────────────────────────────────────────────────── file size ┤
  file size: 800.0 bytes
└──────────────────────────────────────────────────────────────────────────────┘

or with only the variable that does not share those dimensions

julia
julia> Cube(ds[["num"]])
╭────────────────────────────────╮
10-element YAXArray{Float64,1}
├────────────────────────────────┴────────────────────────────── dims ┐
Dim_1 Sampled{Int64} Base.OneTo(10) ForwardOrdered Regular Points
├─────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├────────────────────────────────────────────────────────── file size ┤
  file size: 80.0 bytes
└─────────────────────────────────────────────────────────────────────┘

Variable properties

The properties of each variable are accessed via

julia
Cube(ds[["t2m"]]).properties
Dict{String, String} with 2 entries:
  "units"     => "K"
  "reference" => "your references"

and

julia
Cube(ds[["prec"]]).properties
Dict{String, String} with 2 entries:
  "units"     => "mm"
  "reference" => "your references"

Note also that the global properties for the Dataset are accessed with

julia
ds.properties
Dict{String, String} with 2 entries:
  "reference" => "your global references"
  "space"     => "lon/lat"

Saving and different chunking modes are discussed here.