Write YAXArrays and Datasets

Create an example Dataset:

julia
using YAXArrays
using NetCDF
using Downloads: download

path = download("https://www.unidata.ucar.edu/software/netcdf/examples/tos_O1_2001-2002.nc", "example.nc")
ds = open_dataset(path)
YAXArray Dataset
Shared Axes: 
↓ lon Sampled{Float64} 1.0:2.0:359.0 ForwardOrdered Regular Points,
→ lat Sampled{Float64} -79.5:1.0:89.5 ForwardOrdered Regular Points,
↗ Ti  Sampled{CFTime.DateTime360Day} [CFTime.DateTime360Day(2001-01-16T00:00:00), …, CFTime.DateTime360Day(2002-12-16T00:00:00)] ForwardOrdered Irregular Points
Variables: 
tos

Properties: Dict{String, Any}("cmor_version" => 0.96f0, "references" => "Dufresne et al, Journal of Climate, 2015, vol XX, p 136", "realization" => 1, "Conventions" => "CF-1.0", "contact" => "Sebastien Denvil, sebastien.denvil@ipsl.jussieu.fr", "history" => "YYYY/MM/JJ: data generated; YYYY/MM/JJ+1 data transformed  At 16:37:23 on 01/11/2005, CMOR rewrote data to comply with CF standards and IPCC Fourth Assessment requirements", "table_id" => "Table O1 (13 November 2004)", "source" => "IPSL-CM4_v1 (2003) : atmosphere : LMDZ (IPSL-CM4_IPCC, 96x71x19) ; ocean ORCA2 (ipsl_cm4_v1_8, 2x2L31); sea ice LIM (ipsl_cm4_v", "title" => "IPSL  model output prepared for IPCC Fourth Assessment SRES A2 experiment", "experiment_id" => "SRES A2 experiment"…)

Write Zarr

Save a single YAXArray to a directory:

julia
using Zarr
savecube(ds.tos, "tos.zarr", driver=:zarr)

Save an entire Dataset to a directory:

julia
savedataset(ds, path="ds.zarr", driver=:zarr)
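To confirm that a write succeeded, the store can be reopened and inspected. The following is a minimal sketch rather than part of the original guide: it uses a small random array and a temporary directory (both illustrative assumptions) so that it runs without the downloaded example file.

```julia
using YAXArrays, Zarr

# Small in-memory array standing in for real data (illustrative values only).
a = YAXArray(rand(10, 20))

# Write into a temporary directory so the working directory stays untouched.
path = joinpath(mktempdir(), "roundtrip.zarr")
savecube(a, path, driver=:zarr)

# Reopen the store; a single-variable Dataset converts back to one cube.
ds_back = open_dataset(path, driver=:zarr)
c = Cube(ds_back)
size(c)
```

The reopened cube is backed by the files on disk, so values are only read when they are indexed.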

Write NetCDF

Save a single YAXArray to a file:

julia
using NetCDF
savecube(ds.tos, "tos.nc", driver=:netcdf)

Save an entire Dataset to a file:

julia
savedataset(ds, path="ds.nc", driver=:netcdf)

Overwrite a Dataset

If the target path already exists, savedataset throws an error. Set overwrite=true to delete the existing dataset and write a new one:

julia
savedataset(ds, path="ds.zarr", driver=:zarr, overwrite=true)

DANGER

Setting overwrite=true deletes all previously saved data at that path.
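The behaviour described above can be sketched as follows. This is an illustrative example under stated assumptions: the dataset `ds_small`, the variable name `x`, and the temporary path are hypothetical, and the error is caught generically because its concrete type is not stated here.

```julia
using YAXArrays, Zarr

# Hypothetical small dataset written to a temporary location.
ds_small = Dataset(x = YAXArray(rand(4, 3)))
path = joinpath(mktempdir(), "overwrite_demo.zarr")

# The first write succeeds because the path does not exist yet.
savedataset(ds_small, path=path, driver=:zarr)

# A second write to the same path is expected to throw.
failed = try
    savedataset(ds_small, path=path, driver=:zarr)
    false
catch err
    @info "Refused to overwrite existing path" err
    true
end

# Opting in explicitly replaces the existing store.
savedataset(ds_small, path=path, driver=:zarr, overwrite=true)
```

Leaving overwrite at its default of false is the safer choice in scripts that may be rerun, since an accidental second run then fails instead of deleting data.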

See the docstring for more information:

YAXArrays.Datasets.savedataset (Function)

savedataset(ds::Dataset; path = "", persist = nothing, overwrite = false, append = false, skeleton=false, backend = :all, driver = backend, max_cache = 5e8, writefac=4.0)

Saves a Dataset into a file at path with the format given by driver, i.e., driver=:netcdf or driver=:zarr.

Warning

overwrite = true deletes ALL your data at path and creates a new file.


Append to a Dataset

New variables can be added to an existing dataset using the append=true keyword.

julia
ds2 = Dataset(z = YAXArray(rand(10,20,5)))
savedataset(ds2, path="ds.zarr", driver=:zarr, append=true)
julia
julia> open_dataset("ds.zarr", driver=:zarr)
YAXArray Dataset
Shared Axes:
()
Variables:
tos
lon Sampled{Float64} 1.0:2.0:359.0 ForwardOrdered Regular Points,
lat Sampled{Float64} -79.5:1.0:89.5 ForwardOrdered Regular Points,
Ti  Sampled{CFTime.DateTime360Day} [CFTime.DateTime360Day(2001-01-16T00:00:00), …, CFTime.DateTime360Day(2002-12-16T00:00:00)] ForwardOrdered Irregular Points
z
Dim_1 Sampled{Int64} 1:1:10 ForwardOrdered Regular Points,
Dim_2 Sampled{Int64} 1:1:20 ForwardOrdered Regular Points,
Dim_3 Sampled{Int64} 1:1:5 ForwardOrdered Regular Points

Properties: Dict{String, Any}("cmor_version" => 0.96, "references" => "Dufresne et al, Journal of Climate, 2015, vol XX, p 136", "realization" => 1, "contact" => "Sebastien Denvil, sebastien.denvil@ipsl.jussieu.fr", "Conventions" => "CF-1.0", "history" => "YYYY/MM/JJ: data generated; YYYY/MM/JJ+1 data transformed  At 16:37:23 on 01/11/2005, CMOR rewrote data to comply with CF standards and IPCC Fourth Assessment requirements", "table_id" => "Table O1 (13 November 2004)", "source" => "IPSL-CM4_v1 (2003) : atmosphere : LMDZ (IPSL-CM4_IPCC, 96x71x19) ; ocean ORCA2 (ipsl_cm4_v1_8, 2x2L31); sea ice LIM (ipsl_cm4_v", "title" => "IPSL  model output prepared for IPCC Fourth Assessment SRES A2 experiment", "experiment_id" => "SRES A2 experiment"…)

Save Skeleton

Sometimes one merely wants to create a datacube "Skeleton" on disk and gradually fill it with data. Here we make use of FillArrays to create a YAXArray and write only the axis data and array metadata to disk, while no actual array data is copied:

julia
using YAXArrays, Zarr, FillArrays

Create the Zeros array:

julia
julia> a = YAXArray(Zeros(Union{Missing, Int32}, 10, 20))
╭─────────────────────────────────────────╮
10×20 YAXArray{Union{Missing, Int32},2}
├─────────────────────────────────────────┴────────────────────── dims ┐
Dim_1 Sampled{Int64} Base.OneTo(10) ForwardOrdered Regular Points,
Dim_2 Sampled{Int64} Base.OneTo(20) ForwardOrdered Regular Points
├──────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any}()
├─────────────────────────────────────────────────────────── file size ┤
  file size: 800.0 bytes
└──────────────────────────────────────────────────────────────────────┘

Then save it as a skeleton:

julia
r = savecube(a, "skeleton.zarr", driver=:zarr, skeleton=true)

Finally, check that all values are missing:

julia
all(ismissing, r[:, :])
true

If using FillArrays is not possible, using the zeros function works as well, though it does allocate the array in memory.

INFO

The skeleton argument is also available for savedataset.
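As a minimal sketch of that note, the same approach can be applied to a whole Dataset. The variable name `a`, the temporary path, and the array size are illustrative assumptions, not part of the original example.

```julia
using YAXArrays, Zarr, FillArrays

# Lazy all-zero array; FillArrays does not store the individual elements.
a = YAXArray(Zeros(Union{Missing, Int32}, 10, 20))
ds_skel = Dataset(a = a)

# Write only axes and metadata into a temporary directory.
path = joinpath(mktempdir(), "skeleton_ds.zarr")
savedataset(ds_skel, path=path, driver=:zarr, skeleton=true)

# Reopen and inspect the variable that has not been filled yet.
ds_back = open_dataset(path, driver=:zarr)
all(ismissing, ds_back.a[:, :])
```

As in the savecube example, the final line checks whether the unwritten values read back as missing.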