How to apply functions on YAXArrays

To apply user-defined functions to a YAXArray, we can use the map, mapslices, or mapCube function. Which one to use depends on the layout of the data that the function operates on.

Apply a function on every element of a datacube

The map function applies a function to every entry of a YAXArray without taking the dimensions into account. The mapped function is registered lazily and is only applied when the elements of the YAXArray are accessed or used in further computations.

Let us set up a dummy data cube that contains all integers from 1 to 10000:

julia
using YAXArrays
using DimensionalData
axes = (Dim{:Lon}(1:10), Dim{:Lat}(1:10), Dim{:Time}(1:100))
original = YAXArray(axes, reshape(1:10000, (10,10,100)))

The cube has the value 1 at its first position:

julia
julia> original[1,:,1]
╭──────────────────────────────╮
10-element YAXArray{Int64,1}
├──────────────────────────────┴──────────────────── dims ┐
Lat Sampled{Int64} 1:10 ForwardOrdered Regular Points
├─────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├────────────────────────────────────────────── file size ┤
  file size: 80.0 bytes
└─────────────────────────────────────────────────────────┘

Now we can subtract 1 from all elements of this cube:

julia
julia> substracted = map(x-> x-1, original)
╭─────────────────────────────╮
10×10×100 YAXArray{Int64,3}
├─────────────────────────────┴─────────────────────── dims ┐
Lon  Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Lat  Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Time Sampled{Int64} 1:100 ForwardOrdered Regular Points
├───────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├──────────────────────────────────────────────── file size ┤
  file size: 78.12 KB
└───────────────────────────────────────────────────────────┘

substracted is a cube of the same size as original. At this point the applied function is only registered; it is evaluated as soon as the elements of substracted are accessed or used in further computations.

julia
julia> substracted[1,:,1]
╭──────────────────────────────╮
10-element YAXArray{Int64,1}
├──────────────────────────────┴──────────────────── dims ┐
Lat Sampled{Int64} 1:10 ForwardOrdered Regular Points
├─────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├────────────────────────────────────────────── file size ┤
  file size: 80.0 bytes
└─────────────────────────────────────────────────────────┘
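
The printed summary does not show the values themselves. As a cross-check, the same arithmetic can be sketched with a plain Base array standing in for the cube. This stand-in is an assumption made for illustration: unlike the YAXArray version, Base's map is eager rather than lazy, and the name shifted is hypothetical.

```julia
# Plain Base array as a stand-in for the YAXArray `original` (assumption for illustration).
data = reshape(1:10000, (10, 10, 100))

# Base `map` evaluates eagerly; the YAXArray version only registers the function.
shifted = map(x -> x - 1, data)

# Slice along the second (Lat) dimension at Lon = 1, Time = 1.
first_slice = shifted[1, :, 1]
println(first_slice)  # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```

The slice steps by 10 because Julia arrays are column-major, so neighbouring Lat entries are 10 elements apart in the underlying range.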

Apply a function along dimensions of a single cube

If a function should operate along a certain dimension of the data, you can use the mapslices function. It does not offer the flexibility of the mapCube function, but it is easier to use for simple functions.

We again set up a dummy data cube that contains all integers from 1 to 10000:

julia
julia> axes = (Dim{:Lon}(1:10), Dim{:Lat}(1:10), Dim{:Time}(1:100))
Lon  1:10,
Lat  1:10,
Time 1:100

julia
julia> original = YAXArray(axes, reshape(1:10000, (10,10,100)))
╭─────────────────────────────╮
10×10×100 YAXArray{Int64,3}
├─────────────────────────────┴─────────────────────── dims ┐
Lon  Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Lat  Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Time Sampled{Int64} 1:100 ForwardOrdered Regular Points
├───────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├──────────────────────────────────────────────── file size ┤
  file size: 78.12 KB
└───────────────────────────────────────────────────────────┘

We would then like to compute the sum over the Time dimension:

julia
julia> timesum = mapslices(sum, original, dims="Time")
"Running nonthreaded" = "Running nonthreaded"
╭─────────────────────────────────────────╮
10×10 YAXArray{Union{Missing, Int64},2}
├─────────────────────────────────────────┴────────── dims ┐
Lon Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Lat Sampled{Int64} 1:10 ForwardOrdered Regular Points
├──────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────── file size ┤
  file size: 800.0 bytes
└──────────────────────────────────────────────────────────┘

This reduces over the Time dimension and gives us the following values:

julia
julia> timesum[:,:]
╭─────────────────────────────────────────╮
10×10 YAXArray{Union{Missing, Int64},2}
├─────────────────────────────────────────┴────────── dims ┐
Lon Sampled{Int64} 1:10 ForwardOrdered Regular Points,
Lat Sampled{Int64} 1:10 ForwardOrdered Regular Points
├──────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────── file size ┤
  file size: 800.0 bytes
└──────────────────────────────────────────────────────────┘
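
The summary above again hides the numbers. A minimal sketch with Base's own mapslices on a plain array shows which values to expect. This is an assumption for illustration, not the YAXArrays method: Base takes integer dimension indices instead of dimension names, and the name timesum_plain is hypothetical.

```julia
# Plain Base array as a stand-in for the YAXArray `original` (assumption for illustration).
data = collect(reshape(1:10000, (10, 10, 100)))

# Base `mapslices` keeps the reduced dimension with length 1, so drop it afterwards.
timesum_plain = dropdims(mapslices(sum, data; dims=3); dims=3)

println(size(timesum_plain))    # (10, 10)
println(timesum_plain[1, 1])    # 495100
println(timesum_plain[10, 10])  # 505000
```

Each entry is the sum of 100 values spaced 100 apart, so the entry at (i, j) equals 100 * (i + 10 * (j - 1)) + 495000.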

You can also apply a function along multiple dimensions of the same data cube.

julia
julia> lonlatsum = mapslices(sum, original, dims=("Lon", "Lat"))
"Running nonthreaded" = "Running nonthreaded"
╭───────────────────────────────────────────────╮
100-element YAXArray{Union{Missing, Int64},1}
├───────────────────────────────────────────────┴───── dims ┐
Time Sampled{Int64} 1:100 ForwardOrdered Regular Points
├───────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├──────────────────────────────────────────────── file size ┤
  file size: 800.0 bytes
└───────────────────────────────────────────────────────────┘
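
To see the resulting values, the reduction over two dimensions can be sketched in the same way with a plain Base array. As before, this is an illustrative assumption rather than the YAXArrays call: dimensions are given as the integers 1 and 2 in place of "Lon" and "Lat", and the name lonlatsum_plain is hypothetical.

```julia
# Plain Base array as a stand-in for the YAXArray `original` (assumption for illustration).
data = collect(reshape(1:10000, (10, 10, 100)))

# Sum each Lon × Lat slice (dimensions 1 and 2), leaving one value per time step.
lonlatsum_plain = vec(mapslices(sum, data; dims=(1, 2)))

println(length(lonlatsum_plain))  # 100
println(lonlatsum_plain[1])       # 5050
println(lonlatsum_plain[end])     # 995050
```

Each time step holds 100 consecutive integers, so the sum at time step t is 5050 + 10000 * (t - 1).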

Multiple input cubes to a function

TODO