Interoperability with Xarray Zarr #3214

@dopplershift

It would be really nice if we could interoperate with Zarr stores produced elsewhere, for example by xarray. Currently, if I run the following Python code to create the simplest Zarr store:

import numpy as np
import pandas as pd

import xarray as xr

ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.rand(4, 5))},
    coords={
        "x": [10, 20, 30, 40],
        "y": pd.date_range("2000-01-01", periods=5),
        "z": ("x", list("abcd")),
    },
)
# same result for format 2 or 3
ds.to_zarr("test.zarr", zarr_format=2, consolidated=False, mode="w")

xarray fails to read it back through the netCDF4 engine (xr.open_dataset("test.zarr", engine="netcdf4")), but more importantly I can't get ncdump -h to even recognize it properly.

If I do ncdump -h test.zarr, I get ncdump: test.zarr: NetCDF: Unknown file format.

Digging further, I tried ncdump -h file://test.zarr#mode=zarr,file, which at least seems to trigger the Zarr support, but I still get ncdump: file://test.zarr#mode=zarr,file: NetCDF: NCZarr error.
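For anyone reproducing this, the URI can be built from a local path instead of typed by hand. This is a minimal sketch, assuming an absolute file:/// URI is acceptable to the library. The command above used a relative path, and whether that difference affects the error is untested here. `nczarr_uri` is a hypothetical helper, not part of netCDF or xarray.

```python
from pathlib import Path


def nczarr_uri(store_path, mode="zarr,file"):
    """Build a file:// URI with a #mode=... fragment for a local Zarr store.

    Hypothetical helper for illustration; the fragment syntax is copied
    from the ncdump invocation quoted in this issue.
    """
    # resolve() makes the path absolute; as_uri() then yields file:///...
    base = Path(store_path).resolve().as_uri()
    return f"{base}#mode={mode}"


# Print the URI for the store created above; pass it to `ncdump -h`.
print(nczarr_uri("test.zarr"))
```

The printed string can be passed to ncdump -h in place of the hand-written URI.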

My first ask: can we please, please try to get some auto-detection working for Zarr here? Going to a full URI using custom fragments for a local file/directory seems pretty rough from a UX perspective.
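To make the auto-detection ask concrete, here is a rough sketch of the kind of check that could work for local directories. It is not how netCDF-C detects formats today. It only assumes the marker files defined by the Zarr specs: .zgroup, .zarray, or .zmetadata at the root for format 2, and zarr.json for format 3. `looks_like_zarr` is a hypothetical name.

```python
from pathlib import Path

# Metadata files that the Zarr specs place at the root of a store.
V2_MARKERS = (".zgroup", ".zarray", ".zmetadata")
V3_MARKER = "zarr.json"


def looks_like_zarr(path):
    """Return 2 or 3 if the directory root carries Zarr metadata, else None.

    Illustrative heuristic only; a real implementation would also need
    to handle zip stores and remote (e.g. S3) locations.
    """
    root = Path(path)
    if not root.is_dir():
        return None
    if (root / V3_MARKER).is_file():
        return 3
    if any((root / name).is_file() for name in V2_MARKERS):
        return 2
    return None
```

With a check along these lines, a plain path such as test.zarr could be routed to the Zarr code path without the caller spelling out a URI and fragment.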

Secondly, is there anything we can loosen so this works?

Labels

area/nczarr: nczarr related topics.
