`iohub.ngff.utils.apply_transform_to_tczyx_and_save` / `process_single_position` write results via `arr._impl.write_oindex(...)` in `_save_transformed` (`src/iohub/ngff/utils.py:190`). On OME-Zarr v0.5 (sharded) stores this can surface an upstream zarr-python bug: a shape mismatch in the sharding codec merge step.
An earlier draft of this issue implied that any sharded-v0.5 write path was broken. That was wrong. iohub's default code path is fine — the bug only surfaces in one specific corner case (details below).
## What actually triggers the bug
iohub pins `zarrs>=0.2.3` as a required dependency, so `zarrs.ZarrsCodecPipeline` is the active codec pipeline by default. `ZarrsCodecPipeline` handles most `oindex` writes into sharded arrays correctly, but it does not support every indexer pattern: when it can't lower a write to Rust, it raises `UnsupportedVIndexingError` and falls back to zarr-python's `BatchedCodecPipeline`. The sharding codec in the fallback pipeline has the bug tracked in zarr-developers/zarr-python#2834.
Concrete behavior on a sharded v0.5 store, measured under iohub's default config:

| Indexer pattern on `oindex[...] = value` | Result |
| --- | --- |
| Unique, contiguous indices, e.g. `([0, 1], [0])` | ✅ works |
| Unique, non-contiguous indices, e.g. `([0, 2], [0])` | ✅ works |
| Duplicate indices, e.g. `([0, 0], [0])` | ❌ `ValueError: shape mismatch` |
So the failure surface is narrow: an `oindex` write whose indexer contains duplicates (or hits one of the other patterns listed in the zarrs README that fall back to the default pipeline) on a sharded Zarr v3 array.
## How this surfaced in tests
`tests/ngff/test_ngff_utils.py::test_apply_transform_to_tczyx_and_save` was failing intermittently under the new #401 defaults. Root cause:

- The hypothesis strategy for `time_indices` / `channel_indices` doesn't set `unique=True`, so it sometimes draws duplicate integers (e.g. `[0, 0]`, `[2, 2, 2]`).
- #401 changes `create_empty_plate` to default to sharded v0.5 stores, so the output array now has a sharding codec.
- With duplicate indices + sharding, the write falls back to `BatchedCodecPipeline` and hits zarr-python#2834.
It wasn't caught on PR #403's CI because `max_examples=5`: with 5 random draws, duplicate-index cases sometimes weren't sampled. Locally, once hypothesis's database had recorded a failing example, it replayed it deterministically; a bump to `max_examples=200` reproduces the failure reliably.
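The test-side fix (option 1 below) is a one-argument change; a hedged sketch of a duplicate-free index strategy (the helper name and bounds are illustrative, not the actual fixture code in `plate_setup`):

```python
# Sketch: drawing index lists without duplicates. The strategy shape is
# illustrative; the real fixtures live in plate_setup /
# apply_transform_czyx_setup.
from hypothesis import strategies as st

def index_lists(max_index: int):
    # unique=True rules out draws like [0, 0] or [2, 2, 2] that would
    # trigger the duplicate-index fallback on sharded stores.
    return st.lists(
        st.integers(min_value=0, max_value=max_index),
        min_size=1,
        unique=True,
    )
```

With `unique=True`, hypothesis still explores contiguous and non-contiguous index combinations, which is the behavior callers actually exercise.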
## Upstream
Tracked at zarr-developers/zarr-python#2834 — "bug with setitem with oindex and sharding". Still open; a WIP fix exists but hasn't landed.
## Options
1. **Narrow the test strategy**: set `unique=True` on `time_indices` and `channel_indices` in `plate_setup` / `apply_transform_czyx_setup`. Semantically, writing the same transformed result to the same index twice isn't a real use case; this just aligns the strategy with what callers actually do. Low-risk, unblocks "Set version-specific default chunk and shard sizes in `create_empty_plate`" (#401).
2. **Guard `_save_transformed`**: detect duplicate indices and either deduplicate them or fall back to a per-(t, c) basic-slice write. Defensive, slight overhead, handles the case correctly even if a user hits it.
3. **Wait on upstream**: do nothing iohub-side, track zarr-python#2834.
Recommendation: (1) is sufficient for unblocking #401. (2) is worth doing if we want iohub to behave correctly even for pathological caller inputs.
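A minimal sketch of the guard in option (2), assuming duplicates can simply be dropped because the transform is deterministic (the helper name is hypothetical, not iohub API):

```python
# Sketch of option (2): deduplicate indices before the fancy oindex write.
# dedup_indices is a hypothetical helper, not part of iohub.
from typing import Sequence

def dedup_indices(time_indices: Sequence[int], channel_indices: Sequence[int]):
    """Return both index lists with duplicates removed, order preserved.

    Writing the same transformed block to the same (t, c) twice is a no-op
    for a deterministic transform, so dropping duplicates is safe.
    """
    def _unique(seq):
        seen = set()
        return [i for i in seq if not (i in seen or seen.add(i))]

    return _unique(time_indices), _unique(channel_indices)

print(dedup_indices([0, 0, 2], [1, 1]))  # → ([0, 2], [1])
```

Deduplicating keeps the single `oindex` write (and its batching benefits); the alternative per-(t, c) basic-slice fallback avoids fancy indexing entirely at the cost of more write calls.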
## Repro (isolates the duplicate-index fallback)