Write Zarr strings as VLen-UTF8 by Bisaloo · Pull Request #452 · scverse/anndataR

Bisaloo · 2026-04-29T15:14:23Z

Related to:

Description

This follows the anndata file format spec
This allows compatibility with zarr-python for Zarr version 3. In particular, the current incompatibility is blocking the addition of Zarr v3 writing, because the round trip with Python fails.

Checklist

Before review

~~Update and regenerate man pages~~
Add/update tests
~~Add/update examples in vignettes~~
Pass CI checks

Before merge

Update NEWS
Bump devel version


Bisaloo · 2026-04-29T15:15:13Z

+  if (zarr_version == 3L) {
+    zarr_json_path <- file.path(store, name, "zarr.json")
+    zarr_json <- jsonlite::read_json(zarr_json_path)
+    zarr_json$data_type <- "string"
+    # Swap the default "bytes" codec for "vlen-utf8".
+    # There should be only one array-to-bytes codec
+    zarr_json$codecs <- lapply(
+      zarr_json$codecs,
+      function(codec) {
+        if (codec$name == "bytes") {
+          list(name = "vlen-utf8")
+        } else {
+          codec
+        }
+      }
+    )
+    jsonlite::write_json(
+      zarr_json,
+      zarr_json_path,
+      auto_unbox = TRUE,
+      pretty = TRUE,
+      null = "null"
+    )
+  }


This is not strictly needed for now because writing is only possible for Zarr v2, but it lays the groundwork for an incoming PR from @Artur-man.
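The `jsonlite` arguments in the diff matter for a faithful rewrite of `zarr.json`. The sketch below uses a made-up metadata fragment (not taken from a real store) to show how length-1 arrays and `null` values behave under different settings. `read_json()` defaults to `simplifyVector = FALSE`, which is what the first case reproduces.

```r
library(jsonlite)

# Hypothetical fragment of a Zarr v3 zarr.json, for illustration only
json <- '{"shape":[10],"fill_value":null,"codecs":[{"name":"bytes"}]}'

# Without simplification, JSON arrays stay R lists and null stays NULL
meta <- fromJSON(json, simplifyVector = FALSE)

# Same arguments as in the diff: the array and the null are preserved
kept <- as.character(toJSON(meta, auto_unbox = TRUE, null = "null"))

# With the default null handling, JSON null is written as an empty object
default_null <- as.character(toJSON(meta, auto_unbox = TRUE))

# Simplifying on read turns the length-1 "shape" array into a scalar,
# which auto_unbox then writes without brackets
simplified <- fromJSON(json, simplifyVector = TRUE)
collapsed <- as.character(toJSON(simplified["shape"], auto_unbox = TRUE))

cat(kept, default_null, collapsed, sep = "\n")
```

Reading without simplification and writing with `null = "null"` keeps fields such as `shape` and `fill_value` in the form a Zarr reader expects.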

lazappi

I think this looks good. I'm just wondering if we should wait for a fix upstream rather than add a workaround here?

If we do it here, maybe some of the repeated code could be moved to a helper function?
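One possible shape for such a helper is sketched below. The function name `set_vlen_utf8_metadata()` and its signature are assumptions made for illustration and are not part of anndataR or Rarr; the body only repackages the metadata rewrite from the diff and adds an explicit check that a single `"bytes"` codec is present.

```r
library(jsonlite)

# Hypothetical helper (name and signature are assumptions, not anndataR API):
# mark a Zarr v3 array as a string array encoded with vlen-utf8.
set_vlen_utf8_metadata <- function(zarr_json_path) {
  zarr_json <- read_json(zarr_json_path)

  # Find the default array-to-bytes codec ("bytes")
  is_bytes <- vapply(
    zarr_json$codecs,
    function(codec) identical(codec$name, "bytes"),
    logical(1)
  )
  # This sketch expects exactly one "bytes" codec to swap
  stopifnot(sum(is_bytes) == 1L)

  zarr_json$data_type <- "string"
  zarr_json$codecs[is_bytes] <- list(list(name = "vlen-utf8"))

  write_json(
    zarr_json,
    zarr_json_path,
    auto_unbox = TRUE,
    pretty = TRUE,
    null = "null"
  )
  invisible(zarr_json)
}

# Usage on a throwaway metadata file with made-up values
path <- file.path(tempdir(), "zarr.json")
write_json(
  list(
    zarr_format = 3L,
    node_type = "array",
    data_type = "uint8",
    codecs = list(
      list(name = "bytes", configuration = list(endian = "little")),
      list(name = "zstd")
    )
  ),
  path,
  auto_unbox = TRUE
)
updated <- set_vlen_utf8_metadata(path)
```

Each call site would then reduce to a single call such as `set_vlen_utf8_metadata(file.path(store, name, "zarr.json"))`, guarded by the `zarr_version == 3L` condition.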

Bisaloo · 2026-05-06T20:23:17Z

> I'm just wondering if we should wait for a fix upstream rather than add a workaround here?

In theory, yes, I agree.

The reality is that I'm spread quite thin over a large number of projects and extending the writing capabilities of Rarr (as opposed to reading) is somewhat lower priority at the moment.

I could submit a follow-up PR once the new interface is in place (I'd love to get to it before the next release but I cannot say for sure).

Artur-man · 2026-05-06T20:59:04Z

You guys let me know how you wanna move forward.

I can also build on top of Hugo's commits and open the Zarr v3 write PR immediately, so you do not need to work on it much, @Bisaloo.

Artur-man · 2026-05-08T12:49:08Z

I will open a PR on top of @Bisaloo's commits, let's get this rolling...

Bisaloo added 3 commits April 24, 2026 17:00

Write strings as VLen-UTF8

682c489

- This follows the anndata file format spec
- This allows compatibility with zarr python for zarr version 3

Ensure only one array-bytes codec is present

21a5f97

Bump minimal required Rarr version

3071e37

lazappi reviewed May 6, 2026

lazappi changed the title ~~Write strings as VLen-UTF8~~ Write Zarr strings as VLen-UTF8 May 6, 2026

Bisaloo wants to merge 3 commits into scverse:devel from Bisaloo:zarr-vlen-utf8
