Saving YAXArrays and Datasets
It is possible to save Datasets and YAXArrays directly to Zarr or NetCDF files.
Saving a YAXArray to Zarr
Any YAXArray can be saved using the savecube function. Simply pass a path as an argument and the cube will be saved.
using YAXArrays, Zarr
a = YAXArray(rand(10,20))
savecube(a, "our_yax.zarr", driver=:zarr)
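To confirm that the write succeeded, the saved cube can be read back and compared with the original. The following is a minimal sketch, assuming that Cube(open_dataset(...)) returns the single saved variable. The temporary directory is an illustrative choice so that repeated runs never hit an existing path:

```julia
using YAXArrays, Zarr

a = YAXArray(rand(10, 20))

# Write into a fresh temporary directory so the target path never pre-exists.
path = joinpath(mktempdir(), "roundtrip.zarr")
savecube(a, path, driver=:zarr)

# Reopen lazily from disk; assumes the store holds exactly one data variable.
b = Cube(open_dataset(path, driver=:zarr))

# Load the on-disk values into memory and compare them with the original.
same_size = size(b) == size(a)
same_values = b.data[:, :] == a.data[:, :]
```

Both same_size and same_values should be true if the round trip preserved the data.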
Saving a YAXArray to NetCDF
Saving to NetCDF works exactly the same way:
using YAXArrays, Zarr, NetCDF
a = YAXArray(rand(10,20))
savecube(a, "our_yax.nc", driver=:netcdf)
Saving a Dataset
A Dataset can be saved using the savedataset function.
using YAXArrays, Zarr
ds = Dataset(x = YAXArray(rand(10,20)), y = YAXArray(rand(10)))
f = "our_dataset.zarr"
savedataset(ds, path=f, driver=:zarr)
Overwriting a Dataset
If the path already exists, an error will be thrown. Set overwrite=true to delete the existing dataset and write the new one.
savedataset(ds, path=f, driver=:zarr, overwrite=true)
DANGER
Again, setting overwrite=true will delete all your previously saved data.
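Because overwriting is destructive, it can help to check for an existing path before saving. The sketch below uses a hypothetical helper, save_if_absent, which is not part of YAXArrays. It relies only on Julia's ispath and the savedataset call shown above, and it writes to a temporary directory chosen for illustration:

```julia
using YAXArrays, Zarr

# Hypothetical helper (not part of YAXArrays): save only when nothing exists at `path`.
function save_if_absent(ds, path; driver=:zarr)
    if ispath(path)
        @warn "Refusing to overwrite existing data" path
        return false
    end
    savedataset(ds, path=path, driver=driver)
    return true
end

ds = Dataset(x = YAXArray(rand(10, 20)), y = YAXArray(rand(10)))
target = joinpath(mktempdir(), "guarded.zarr")

first_write = save_if_absent(ds, target)   # nothing there yet, so this writes
second_write = save_if_absent(ds, target)  # path now exists, so this is skipped
```

The helper never passes overwrite=true, so existing data can only be replaced by an explicit, separate call.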
Look at the docstring for more information:
savedataset(ds::Dataset; path = "", persist = nothing, overwrite = false, append = false, skeleton=false, backend = :all, driver = backend, max_cache = 5e8, writefac=4.0)
Saves a Dataset into a file at path with the format given by driver, i.e., driver=:netcdf or driver=:zarr.
Warning
overwrite = true deletes ALL your data and creates a new file.
Appending to a Dataset
New variables can be added to an existing dataset using the append=true keyword.
ds2 = Dataset(z = YAXArray(rand(10,20,5)))
savedataset(ds2, path=f, driver=:zarr, append=true)
julia> open_dataset(f, driver=:zarr)
YAXArray Dataset
Shared Axes:
↓ Dim_1 Sampled{Int64} 1:1:10 ForwardOrdered Regular Points
Variables:
x
↓ Dim_2 Sampled{Int64} 1:1:20 ForwardOrdered Regular Points
z
↓ Dim_2 Sampled{Int64} 1:1:20 ForwardOrdered Regular Points,
→ Dim_3 Sampled{Int64} 1:1:5 ForwardOrdered Regular Points
y
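The append step can also be checked programmatically instead of reading the printed summary. This is a minimal sketch, assuming the reopened Dataset exposes its variables through the cubes field keyed by Symbol. The temporary path is illustrative:

```julia
using YAXArrays, Zarr

path = joinpath(mktempdir(), "append_demo.zarr")

# Initial dataset with two variables.
ds = Dataset(x = YAXArray(rand(10, 20)), y = YAXArray(rand(10)))
savedataset(ds, path=path, driver=:zarr)

# Append a third variable to the same store.
ds2 = Dataset(z = YAXArray(rand(10, 20, 5)))
savedataset(ds2, path=path, driver=:zarr, append=true)

# Reopen and collect the variable names now present on disk.
reopened = open_dataset(path, driver=:zarr)
varnames = sort(collect(keys(reopened.cubes)))
```

After the append, varnames should contain x, y, and z.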
Datacube Skeleton without the actual data
Sometimes one merely wants to create a datacube "Skeleton" on disk and gradually fill it with data. Here we make use of FillArrays to create a YAXArray and write only the axis data and array metadata to disk, while no actual array data is copied:
using YAXArrays, Zarr, FillArrays
Create the Zeros array:
julia> a = YAXArray(Zeros(Union{Missing, Int32}, 10, 20))
╭─────────────────────────────────────────╮
│ 10×20 YAXArray{Union{Missing, Int32},2} │
├─────────────────────────────────────────┴────────────────────── dims ┐
↓ Dim_1 Sampled{Int64} Base.OneTo(10) ForwardOrdered Regular Points,
→ Dim_2 Sampled{Int64} Base.OneTo(20) ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────── metadata ┤
Dict{String, Any}()
├─────────────────────────────────────────────────────────── file size ┤
file size: 800.0 bytes
└──────────────────────────────────────────────────────────────────────┘
and save it as a skeleton:
r = savecube(a, "skeleton.zarr", driver=:zarr, skeleton=true)
and check that all the values are missing
all(ismissing,r[:,:])
true
If using FillArrays is not possible, the zeros function works as well, though it does allocate the array in memory.
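The zeros-based fallback looks almost identical to the FillArrays version. This is a minimal sketch, with a temporary path chosen for illustration:

```julia
using YAXArrays, Zarr

# `zeros` allocates the full array in memory, unlike FillArrays' `Zeros`.
a = YAXArray(zeros(Union{Missing, Int32}, 10, 20))

path = joinpath(mktempdir(), "skeleton_zeros.zarr")
r = savecube(a, path, driver=:zarr, skeleton=true)

# With skeleton=true no array data is written, so every value reads back as missing.
all_missing = all(ismissing, r[:, :])
```

For small arrays the extra allocation is negligible. For large cubes the FillArrays approach avoids holding the full array in memory.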
INFO
The skeleton argument is also available for savedataset.
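As a sketch of the dataset variant, the following writes a two-variable skeleton and reopens it to inspect the array shapes. The variable names, element type, and temporary path are illustrative assumptions. Only the skeleton=true keyword comes from the savedataset signature shown earlier:

```julia
using YAXArrays, Zarr

# Placeholder arrays that define only the shape and element type of each variable.
ds = Dataset(
    x = YAXArray(zeros(Union{Missing, Float32}, 10, 20)),
    y = YAXArray(zeros(Union{Missing, Float32}, 10)),
)

path = joinpath(mktempdir(), "dataset_skeleton.zarr")
savedataset(ds, path=path, driver=:zarr, skeleton=true)

# Reopen the store; axes and metadata are on disk even though no data was written.
reopened = open_dataset(path, driver=:zarr)
x_size = size(reopened.x)
y_size = size(reopened.y)
```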