
apply_transform_to_tczyx_and_save / process_single_position incompatible with sharded v0.5 stores #404

@ieivanov

Description

Summary (revised)

iohub.ngff.utils.apply_transform_to_tczyx_and_save / process_single_position write results via arr._impl.write_oindex(...) in _save_transformed (src/iohub/ngff/utils.py:190). On OME-Zarr v0.5 (sharded) stores this write can surface an upstream zarr-python bug: a shape mismatch in the sharding codec's merge step.

An earlier draft of this issue implied that any sharded-v0.5 write path was broken. That was wrong. iohub's default code path is fine — the bug only surfaces in one specific corner case (details below).

What actually triggers the bug

iohub pins zarrs>=0.2.3 as a required dependency, so zarrs.ZarrsCodecPipeline is the active codec pipeline by default. ZarrsCodecPipeline handles most oindex writes into sharded arrays correctly. But it does not support every indexer pattern — when it can't lower a write to Rust, it raises UnsupportedVIndexingError and falls back to zarr-python's BatchedCodecPipeline. The sharding codec in the fallback pipeline has the bug tracked in zarr-developers/zarr-python#2834.

Concrete behavior on a sharded v0.5 store, measured under iohub's default config:

Indexer pattern on oindex[...] = value              Result
Unique, contiguous indices, e.g. ([0, 1], [0])      ✅ works
Unique, non-contiguous indices, e.g. ([0, 2], [0])  ✅ works
Duplicate indices, e.g. ([0, 0], [0])               ❌ ValueError: shape mismatch

So the failure surface is narrow: an oindex write whose indexer contains duplicates (or hits one of the other patterns listed in the zarrs README that fall back to the default pipeline) on a sharded Zarr v3 array.

How this surfaced in tests

tests/ngff/test_ngff_utils.py::test_apply_transform_to_tczyx_and_save was failing intermittently under the new #401 defaults. Root cause:

  1. The hypothesis strategy for time_indices / channel_indices doesn't set unique=True, so it sometimes draws duplicate integers (e.g. [0, 0], [2, 2, 2]).
  2. PR #401 ("Set version-specific default chunk and shard sizes in create_empty_plate") changes create_empty_plate to default to sharded v0.5 stores, so the output array now has a sharding codec.
  3. With duplicate indices + sharding, the write falls back to BatchedCodecPipeline and hits zarr-python#2834.

It wasn't caught on PR #403's CI because max_examples=5 — with 5 random draws, duplicate-index cases sometimes weren't sampled. Locally, once hypothesis's database had recorded a failing example, it replayed it deterministically; a bump to max_examples=200 reproduces the failure reliably.

Upstream

Tracked at zarr-developers/zarr-python#2834, "bug with setitem with oindex and sharding". Still open; a WIP fix exists but hasn't landed.

Options

  1. Narrow the test strategy — set unique=True on time_indices and channel_indices in plate_setup / apply_transform_czyx_setup. Writing the same transformed result to the same index twice isn't a real use case, so this just aligns the strategy with what callers actually do. Low-risk, and unblocks #401.
  2. Guard _save_transformed — detect duplicate indices and either deduplicate them or fall back to a per-(t, c) basic-slice write. Defensive, slight overhead, handles the case correctly even if a user hits it.
  3. Wait on upstream — do nothing iohub-side, track zarr-python#2834.
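Option (1) amounts to a one-argument change in the test strategies. A minimal sketch of the idea — the integer bounds and list sizes here are illustrative, not the actual values from plate_setup:

```python
from hypothesis import strategies as st

# Without unique=True, draws like [0, 0] or [2, 2, 2] are possible,
# which is what tripped the sharded-write fallback in the tests.
indices_with_dups = st.lists(
    st.integers(min_value=0, max_value=3), min_size=1, max_size=4
)

# unique=True makes every draw duplicate-free, matching what real callers
# of apply_transform_to_tczyx_and_save actually pass.
unique_indices = st.lists(
    st.integers(min_value=0, max_value=3), min_size=1, max_size=4, unique=True
)
```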

Recommendation: (1) is sufficient for unblocking #401. (2) is worth doing if we want iohub to behave correctly even for pathological caller inputs.
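Option (2) could look roughly like the following. This is a sketch against a NumPy stand-in, not iohub's actual _save_transformed code — np.ix_ plays the role that arr.oindex would play on a real zarr array, and the helper name is hypothetical:

```python
import numpy as np

def guarded_oindex_write(arr, t_indices, c_indices, data):
    """Hypothetical guard for _save_transformed (not iohub's actual API).

    Orthogonal writes with unique per-axis indices are handled by zarrs;
    duplicates fall back to zarr-python's buggy sharding pipeline, so
    write those one (t, c) pair at a time with basic indexing instead.
    """
    t, c = list(t_indices), list(c_indices)
    if len(set(t)) == len(t) and len(set(c)) == len(c):
        # On a zarr array this would be arr.oindex[t, c] = data;
        # np.ix_ is the NumPy equivalent used for this sketch.
        arr[np.ix_(t, c)] = data
    else:
        # Per-pair fallback: later writes simply overwrite earlier ones
        # when an index repeats, so duplicates are handled correctly.
        for i, ti in enumerate(t):
            for j, cj in enumerate(c):
                arr[ti, cj] = data[i, j]
    return arr
```

With t_indices=[0, 0] this takes the per-pair path instead of triggering the ValueError shown in the repro below.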

Repro (isolates the duplicate-index fallback)

import tempfile
from pathlib import Path

import numpy as np
import zarr

with zarr.config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"}):
    with tempfile.TemporaryDirectory() as d:
        z = zarr.create_array(
            store=zarr.storage.LocalStore(str(Path(d) / "z.zarr")),
            shape=(4, 1, 4, 16, 16),
            chunks=(1, 1, 4, 16, 16),
            shards=(2, 1, 4, 16, 16),
            dtype="f4",
        )
        z[:] = 0
        # Unique indices — zarrs handles it
        z.oindex[[0, 1], [0]] = np.ones((2, 1, 4, 16, 16), dtype="f4")  # OK
        # Duplicate indices — zarrs falls back to the buggy pipeline
        z.oindex[[0, 0], [0]] = np.ones((2, 1, 4, 16, 16), dtype="f4")  # ValueError
