A custom OME-TIFF reader (iohub.multipagetiff.MicromanagerOmeTiffReader) was implemented because, historically, tifffile and AICSImageIO were slow when reading large OME-TIFF series generated by Micro-Manager acquisitions.
While debugging #65, I found that this implementation does not guarantee data integrity during reading. Before investing more time in fixing it, I think we should revisit whether maintaining a custom OME-TIFF reader is still worthwhile, given that the more widely adopted solutions have evolved since waveorder.io was designed.
Here is a simple read speed benchmark of tifffile and iohub's custom reader:

The test was run on a 123 GB dataset with dimensions PTCZYX=(8, 9, 3, 81, 2048, 2048). In each iteration (N=5), voxels from 2 non-sequential positions were read into RAM.
Test script:
Environment: Python 3.10.8, Linux 4.18 (x86_64, AMD EPYC 7H12@2.6GHz)
```python
# %%
import os
from timeit import timeit

import zarr
import pandas as pd

# readers tested
from tifffile import TiffSequence  # 2023.2.3
from iohub.multipagetiff import MicromanagerOmeTiffReader  # 0.1.dev368+g3d62e6f

# %%
# 123 GB total
DATASET = (
    "/hpc/projects/comp_micro/rawdata/hummingbird/Soorya/"
    "2022_06_27_A549cellMembraneStained/"
    "A549_CellMaskDye_Well1_deltz0.25_63X_30s_2framemin/"
    "A549_CellMaskdye_Well1_30s_2framemin_1"
)
# positions to read, deliberately non-sequential
POSITIONS = (2, 0)


# %%
def read_tifffile():
    # open the file series as a zarr store and load each position into RAM
    sequence = TiffSequence(os.scandir(DATASET))
    data = zarr.open(sequence.aszarr(), mode="r")
    for p in POSITIONS:
        _ = data[p]
    sequence.close()


# %%
def read_custom():
    # load each position into RAM with iohub's custom reader
    reader = MicromanagerOmeTiffReader(DATASET)
    for p in POSITIONS:
        _ = reader.get_array(p)


# %%
def repeat(n=5):
    # alternate the two readers so each iteration times both once
    tf_times = []
    wo_times = []
    for _ in range(n):
        tf_times.append(
            timeit(
                "read_tifffile()", number=1, setup="from __main__ import read_tifffile"
            )
        )
        wo_times.append(
            timeit("read_custom()", number=1, setup="from __main__ import read_custom")
        )
    # the "waveorder" column holds the iohub custom reader timings
    return pd.DataFrame({"tifffile": tf_times, "waveorder": wo_times})


# %%
timings = repeat()
```
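To compare the two readers numerically, the per-iteration timings can be reduced to a mean, a standard deviation, and a slowdown relative to the fastest reader. The following is only a minimal sketch using the standard library; the `summarize` helper and its output keys are hypothetical names introduced here for illustration and are not part of iohub or tifffile.

```python
from statistics import mean, stdev


def summarize(times):
    """Summarize wall-clock timings per reader.

    `times` maps a reader name to a list of durations in seconds.
    Returns, for each reader, the mean, the sample standard deviation,
    and the mean divided by the mean of the fastest reader.
    """
    means = {name: mean(values) for name, values in times.items()}
    fastest = min(means.values())
    return {
        name: {
            "mean_s": means[name],
            # the sample standard deviation needs at least two values
            "std_s": stdev(values) if len(values) > 1 else 0.0,
            "relative_to_fastest": means[name] / fastest,
        }
        for name, values in times.items()
    }
```

With the script above, this could be called as `summarize(timings.to_dict("list"))`, which converts the DataFrame into a mapping from column name to a list of timings.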
At least in this test, the latest tifffile consistently outperforms the iohub implementation. A comprehensive benchmark will take more time (#57), but as long as a widely used library is not significantly slower, I think the reduced maintenance overhead and broader user testing make a strong case for reconsidering whether to maintain the custom code in iohub.