Read YAXArrays and Datasets
This section describes how to read files, URLs, and directories into YAXArrays and datasets.
Read Zarr
Open a Zarr store as a Dataset:
using YAXArrays
using Zarr
path = "gs://cmip6/CMIP6/ScenarioMIP/DKRZ/MPI-ESM1-2-HR/ssp585/r1i1p1f1/3hr/tas/gn/v20190710/"
store = zopen(path, consolidated=true)
ds = open_dataset(store)
YAXArray Dataset
Shared Axes:
None
Variables:
height
Variables with additional axes:
Additional Axes:
(↓ lon Sampled{Float64} 0.0:0.9375:359.0625 ForwardOrdered Regular Points,
→ lat Sampled{Float64} [-89.28422753251364, -88.35700351866494, …, 88.35700351866494, 89.28422753251364] ForwardOrdered Irregular Points,
↗ Ti Sampled{DateTime} [2015-01-01T03:00:00, …, 2101-01-01T00:00:00] ForwardOrdered Irregular Points)
Variables:
tas
Properties: Dict{String, Any}("initialization_index" => 1, "realm" => "atmos", "variable_id" => "tas", "external_variables" => "areacella", "branch_time_in_child" => 60265.0, "data_specs_version" => "01.00.30", "history" => "2019-07-21T06:26:13Z ; CMOR rewrote data to be consistent with CMIP6, CF-1.7 CMIP-6.2 and CF standards.", "forcing_index" => 1, "parent_variant_label" => "r1i1p1f1", "table_id" => "3hr"…)
We can set path to a URL, a local directory, or, as in this case, a cloud object storage path.
A Zarr store may contain multiple arrays. Individual arrays can be accessed by name using subsetting:
ds.tas
╭────────────────────────────────────╮
│ 384×192×251288 YAXArray{Float32,3} │
├────────────────────────────────────┴─────────────────────────────────── dims ┐
↓ lon Sampled{Float64} 0.0:0.9375:359.0625 ForwardOrdered Regular Points,
→ lat Sampled{Float64} [-89.28422753251364, -88.35700351866494, …, 88.35700351866494, 89.28422753251364] ForwardOrdered Irregular Points,
↗ Ti Sampled{DateTime} [2015-01-01T03:00:00, …, 2101-01-01T00:00:00] ForwardOrdered Irregular Points
├──────────────────────────────────────────────────────────────────── metadata ┤
Dict{String, Any} with 10 entries:
"units" => "K"
"history" => "2019-07-21T06:26:13Z altered by CMOR: Treated scalar dime…
"name" => "tas"
"cell_methods" => "area: mean time: point"
"cell_measures" => "area: areacella"
"long_name" => "Near-Surface Air Temperature"
"coordinates" => "height"
"standard_name" => "air_temperature"
"_FillValue" => 1.0f20
"comment" => "near-surface (usually, 2 meter) air temperature"
├─────────────────────────────────────────────────────────────────── file size ┤
file size: 69.02 GB
└──────────────────────────────────────────────────────────────────────────────┘
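To find out which arrays a store contains before subsetting, the variable names and per-variable metadata can be inspected without reading any array data. The following is a minimal sketch that reuses the store opened above. It assumes that the Dataset keeps its variables in a `cubes` field keyed by `Symbol` and that each YAXArray exposes its metadata through a `properties` field, and it needs network access to the cloud bucket.

```julia
using YAXArrays
using Zarr

path = "gs://cmip6/CMIP6/ScenarioMIP/DKRZ/MPI-ESM1-2-HR/ssp585/r1i1p1f1/3hr/tas/gn/v20190710/"
store = zopen(path, consolidated=true)
ds = open_dataset(store)

# Names of all variables in the dataset (assumes `ds.cubes` maps Symbol => YAXArray)
varnames = collect(keys(ds.cubes))

# Metadata of a single variable; this reads attributes only, not the array values
units = ds.tas.properties["units"]
```

For this store, `varnames` should contain `:height` and `:tas`, matching the variables printed in the dataset summary, and `units` corresponds to the "units" entry shown in the metadata block.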
Read NetCDF
Open a NetCDF file as a Dataset:
using YAXArrays
using NetCDF
using Downloads: download
path = download("https://www.unidata.ucar.edu/software/netcdf/examples/tos_O1_2001-2002.nc", "example.nc")
ds = open_dataset(path)
YAXArray Dataset
Shared Axes:
(↓ lon Sampled{Float64} 1.0:2.0:359.0 ForwardOrdered Regular Points,
→ lat Sampled{Float64} -79.5:1.0:89.5 ForwardOrdered Regular Points,
↗ Ti Sampled{CFTime.DateTime360Day} [CFTime.DateTime360Day(2001-01-16T00:00:00), …, CFTime.DateTime360Day(2002-12-16T00:00:00)] ForwardOrdered Irregular Points)
Variables:
tos
Properties: Dict{String, Any}("cmor_version" => 0.96f0, "references" => "Dufresne et al, Journal of Climate, 2015, vol XX, p 136", "realization" => 1, "Conventions" => "CF-1.0", "contact" => "Sebastien Denvil, sebastien.denvil@ipsl.jussieu.fr", "history" => "YYYY/MM/JJ: data generated; YYYY/MM/JJ+1 data transformed At 16:37:23 on 01/11/2005, CMOR rewrote data to comply with CF standards and IPCC Fourth Assessment requirements", "table_id" => "Table O1 (13 November 2004)", "source" => "IPSL-CM4_v1 (2003) : atmosphere : LMDZ (IPSL-CM4_IPCC, 96x71x19) ; ocean ORCA2 (ipsl_cm4_v1_8, 2x2L31); sea ice LIM (ipsl_cm4_v", "title" => "IPSL model output prepared for IPCC Fourth Assessment SRES A2 experiment", "experiment_id" => "SRES A2 experiment"…)
A NetCDF file may contain multiple arrays. Individual arrays can be accessed using subsetting:
ds.tos
╭────────────────────────────────────────────────╮
│ 180×170×24 YAXArray{Union{Missing, Float32},3} │
├────────────────────────────────────────────────┴─────────────────────── dims ┐
↓ lon Sampled{Float64} 1.0:2.0:359.0 ForwardOrdered Regular Points,
→ lat Sampled{Float64} -79.5:1.0:89.5 ForwardOrdered Regular Points,
↗ Ti Sampled{CFTime.DateTime360Day} [CFTime.DateTime360Day(2001-01-16T00:00:00), …, CFTime.DateTime360Day(2002-12-16T00:00:00)] ForwardOrdered Irregular Points
├──────────────────────────────────────────────────────────────────── metadata ┤
Dict{String, Any} with 10 entries:
"units" => "K"
"missing_value" => 1.0f20
"history" => " At 16:37:23 on 01/11/2005: CMOR altered the data in t…
"cell_methods" => "time: mean (interval: 30 minutes)"
"name" => "tos"
"long_name" => "Sea Surface Temperature"
"original_units" => "degC"
"standard_name" => "sea_surface_temperature"
"_FillValue" => 1.0f20
"original_name" => "sosstsst"
├─────────────────────────────────────────────────────────────────── file size ┤
file size: 2.8 MB
└──────────────────────────────────────────────────────────────────────────────┘
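Once an array has been selected, a smaller piece of it can be picked by dimension name and then loaded into memory. The sketch below is one way to do this for the file downloaded above. It assumes that keyword indexing by dimension name is available (the name `Ti` follows the output printed above and may differ in other YAXArrays versions) and that `readcubedata` materializes the selection.

```julia
using YAXArrays
using NetCDF
using Downloads: download

path = download("https://www.unidata.ucar.edu/software/netcdf/examples/tos_O1_2001-2002.nc", "example.nc")
ds = open_dataset(path)

tos = ds.tos

# Select the first time step by dimension name (assumes the time axis is called `Ti`)
first_step = tos[Ti=1]

# Load the selected lon × lat slice into memory
in_memory = readcubedata(first_step)
size(in_memory)
```

Because the full array is 180×170×24, the selected slice has the two remaining dimensions lon and lat with sizes 180 and 170.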
Read GDAL (GeoTIFF, GeoJSON)
All GDAL-compatible files can be read as a YAXArrays.Dataset after loading ArchGDAL:
using YAXArrays
using ArchGDAL
using Downloads: download
path = download("https://github.com/yeesian/ArchGDALDatasets/raw/307f8f0e584a39a050c042849004e6a2bd674f99/gdalworkshop/world.tif", "world.tif")
ds = open_dataset(path)
YAXArray Dataset
Shared Axes:
(↓ X Sampled{Float64} -180.0:0.17578125:179.82421875 ForwardOrdered Regular Points,
→ Y Sampled{Float64} 90.0:-0.17578125:-89.82421875 ReverseOrdered Regular Points)
Variables:
Blue, Green, Red
Properties: Dict{String, Any}("projection" => "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AXIS[\"Latitude\",NORTH],AXIS[\"Longitude\",EAST],AUTHORITY[\"EPSG\",\"4326\"]]")