
Saving YAXArrays and Datasets

Datasets and YAXArrays can be saved directly to Zarr or NetCDF files.

Saving a YAXArray to Zarr

Any YAXArray can be saved with the savecube function. Pass the target path as the second argument and select the output format with the driver keyword.

julia
using YAXArrays, Zarr
a = YAXArray(rand(10,20))
savecube(a, "our_yax.zarr", driver=:zarr)
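
As a quick sanity check, the saved cube can be compared with the in-memory original. The following is a minimal sketch rather than part of the original example: it writes to a temporary directory (an assumption made here so that no existing data is touched) and relies on savecube returning the cube that now lives on disk, as the skeleton example further down this page also does.

```julia
using YAXArrays, Zarr

# Write into a fresh temporary directory so nothing existing is touched.
path = joinpath(mktempdir(), "our_yax.zarr")

a = YAXArray(rand(10, 20))
r = savecube(a, path, driver=:zarr)   # r refers to the data stored on disk

# Materialize both cubes as plain arrays and compare them.
same_size = size(r) == size(a)
same_values = collect(r.data) == collect(a.data)
```

If both flags are true, the data survived the round trip unchanged.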

Saving a YAXArray to NetCDF

Saving to NetCDF works exactly the same way:

julia
using YAXArrays, Zarr, NetCDF
a = YAXArray(rand(10,20))
savecube(a, "our_yax.nc", driver=:netcdf)

Saving a Dataset

Saving Datasets can be done using the savedataset function.

julia
using YAXArrays, Zarr
ds = Dataset(x = YAXArray(rand(10,20)), y = YAXArray(rand(10)))
f = "our_dataset.zarr"
savedataset(ds, path=f, driver=:zarr)
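
To confirm what was written, the dataset can be reopened with open_dataset and its variables inspected. This is a hedged sketch: the temporary directory is an assumption added for safety, and accessing a variable as a property (ds_disk.x) is assumed to work for datasets opened from disk.

```julia
using YAXArrays, Zarr

# Use a temporary location instead of the working directory.
f = joinpath(mktempdir(), "our_dataset.zarr")

ds = Dataset(x = YAXArray(rand(10, 20)), y = YAXArray(rand(10)))
savedataset(ds, path=f, driver=:zarr)

# Reopen from disk and look at the shapes of the stored variables.
ds_disk = open_dataset(f, driver=:zarr)
x_size = size(ds_disk.x)
y_size = size(ds_disk.y)
```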

Overwriting a Dataset

If the path already exists, an error is thrown. Set overwrite=true to delete the existing dataset and write a new one.

julia
savedataset(ds, path=f, driver=:zarr, overwrite=true)

DANGER

Again, setting overwrite=true will delete all your previously saved data.
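
The behaviour described above can be sketched as follows. The example saves twice to the same path: the second plain save is expected to fail, and only the explicit overwrite=true call replaces the data. The temporary directory and the variable name second_save_failed are illustrative assumptions, and the error is caught generically because its exact type is not stated on this page.

```julia
using YAXArrays, Zarr

f = joinpath(mktempdir(), "our_dataset.zarr")
ds = Dataset(x = YAXArray(rand(10, 20)), y = YAXArray(rand(10)))
savedataset(ds, path=f, driver=:zarr)            # first save succeeds

# A second save to the same path should throw without overwrite=true.
second_save_failed = try
    savedataset(ds, path=f, driver=:zarr)
    false
catch
    true
end

# Opting in explicitly deletes and replaces whatever was stored at f.
savedataset(ds, path=f, driver=:zarr, overwrite=true)
```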

See the docstring for more information.

# YAXArrays.Datasets.savedataset (Function)

savedataset(ds::Dataset; path = "", persist = nothing, overwrite = false, append = false, skeleton=false, backend = :all, driver = backend, max_cache = 5e8, writefac=4.0)

Saves a Dataset into a file at path with the format given by driver, i.e., driver=:netcdf or driver=:zarr.

Warning

overwrite = true deletes ALL your data and creates a new file.



Appending to a Dataset

New variables can be added to an existing dataset using the append=true keyword.

julia
ds2 = Dataset(z = YAXArray(rand(10,20,5)))
savedataset(ds2, path=f, driver=:zarr, append=true)

Reopening the dataset shows the new variable z alongside the existing ones:

julia
julia> open_dataset(f, driver=:zarr)
YAXArray Dataset
Shared Axes:
Dim_1 Sampled{Int64} 1:1:10 ForwardOrdered Regular Points
Variables:

x
Dim_2 Sampled{Int64} 1:1:20 ForwardOrdered Regular Points
z
Dim_2 Sampled{Int64} 1:1:20 ForwardOrdered Regular Points,
Dim_3 Sampled{Int64} 1:1:5 ForwardOrdered Regular Points
y
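
The same check can be done programmatically. The sketch below repeats the save and append steps in a temporary directory and then collects the stored variable names. It assumes that the Dataset struct exposes its variables through a cubes field keyed by Symbol, which is an internal detail and may differ between YAXArrays versions.

```julia
using YAXArrays, Zarr

f = joinpath(mktempdir(), "our_dataset.zarr")

# Initial dataset with two variables, then a third one appended.
ds = Dataset(x = YAXArray(rand(10, 20)), y = YAXArray(rand(10)))
savedataset(ds, path=f, driver=:zarr)
ds2 = Dataset(z = YAXArray(rand(10, 20, 5)))
savedataset(ds2, path=f, driver=:zarr, append=true)

# Collect the names of all variables now stored on disk.
ds_all = open_dataset(f, driver=:zarr)
stored = Set(keys(ds_all.cubes))   # assumes a `cubes` field (internal detail)
```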

Datacube Skeleton without the actual data

Sometimes one merely wants to create a datacube "Skeleton" on disk and gradually fill it with data. Here we make use of FillArrays to create a YAXArray and write only the axis data and array metadata to disk, while no actual array data is copied:

julia
using YAXArrays, Zarr, FillArrays

Create the Zeros array:

julia
julia> a = YAXArray(Zeros(Union{Missing, Int32}, 10, 20))
╭─────────────────────────────────────────╮
10×20 YAXArray{Union{Missing, Int32},2}
├─────────────────────────────────────────┴────────────────────── dims ┐
Dim_1 Sampled{Int64} Base.OneTo(10) ForwardOrdered Regular Points,
Dim_2 Sampled{Int64} Base.OneTo(20) ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────────────────── file size ┤
  file size: 800.0 bytes
└──────────────────────────────────────────────────────────────────────┘

and save it as a skeleton:

julia
r = savecube(a, "skeleton.zarr", driver=:zarr, skeleton=true)

and check that all the values are missing:

julia
julia> all(ismissing, r[:,:])
true

If using FillArrays is not possible, using the zeros function works as well, though it does allocate the array in memory.
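
A minimal sketch of this fallback, assuming the same array shape and element type as above and a temporary output directory chosen here for illustration:

```julia
using YAXArrays, Zarr

path = joinpath(mktempdir(), "skeleton_zeros.zarr")

# zeros allocates the full 10×20 array in memory, unlike FillArrays' Zeros.
a = YAXArray(zeros(Union{Missing, Int32}, 10, 20))
r = savecube(a, path, driver=:zarr, skeleton=true)

# With skeleton=true no array data is written, so every value should read as missing.
all_missing = all(ismissing, r[:, :])
```

For large cubes the in-memory allocation can be substantial, which is the reason the FillArrays variant is preferred where available.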

INFO

The skeleton argument is also available for savedataset.