There are so many open issues that it can feel quite intimidating! This freezes action and creates a vicious cycle in which more issues lead to more paralysis, which leads to more issues.
I asked Claude Sonnet 4.6 to look at our issues and suggest what could be done to close large numbers of them. Here's the analysis:
Top 10 Bugfixes/Improvements to Close the Most Open Issues on Unidata/netcdf-c
Analysis of all 285 open issues on Unidata/netcdf-c, clustered by root cause. Issues often span multiple categories, so a single fix in a high-impact area has outsized effect.
1. Fix `nc_get/put_vars` Stride Performance (~15 issues)
Categories: Performance, Windows perf, nccopy slowness
The `NCDEFAULT_get/put_vars` code path is the single most complained-about performance bottleneck. It causes:
- `nccopy` regressions between versions ('nccopy' much slower in 4.7.4 vs. 4.6.1 #1947, Using nccopy, setting deflate level to 0 ignores chunking specification #391)
- `nc_open` slowdowns (Slower `nc_open` with version 4.9.3 #3183)
- `NC_SHARE` performance changes (Performance change with NC_SHARE #1773)
Fix: Rewrite `NCDEFAULT_get/put_vars` to use chunk-aware bulk I/O instead of element-by-element dispatch. Also reconcile HDF5 vs netCDF stride semantics for unlimited dimensions.
2. Modernize CMake Build System & Fix Windows/MSVC Builds (~35 unique issues)
Categories: CMake (22), Windows/MSVC (20), Static/Linking (19), with heavy overlap
The build system is the number one source of user frustration. Recurring themes:
- `_MSC_VER` vs `_WIN32` misuse breaks MinGW (I/O issues in mingw due to _MSC_VER misuse #1105, _MSC_VER -> _WIN32 #1108)
- `TTL_LIBS` debug/optimized keyword issue ([CMake] Variable TTL_LIBS debug/optimized keywords are getting eaten #1579)
Fix: CMake modernization (#2713) using proper targets, `find_package` configs, and generator expressions. Fix Windows symbol exports with a single `.def` file or `__declspec` audit. This alone would close 30+ issues.
3. Fix NCZarr Interoperability & Correctness (~25 issues)
Categories: NCZarr/Zarr (25), overlaps with S3 and filters
NCZarr is the newest major feature and has the most open bugs per feature:
- `noxarray` mode (Dimension names are lost in mode nczarr,noxarray #2647)
Fix: Zarr V2 spec compliance audit + Xarray interop testing. Most of these are metadata-handling bugs in `libnczarr`. A systematic pass through the Zarr V2 spec would close ~15 issues.
4. Fix DAP2/DAP4 Client Bugs (~24 issues)
Categories: DAP/OPeNDAP (24), overlaps with ncdump, authentication
DAP issues cluster into three sub-problems:
- … `/tmp/occookie*` files), Cookie file cannot be read and written: (null) #1827
Fix: (a) Rewrite the cookie/auth handling to use libcurl's cookie jar properly. (b) Fix DAP4 string/attribute parsing. These two fixes would close ~15 DAP issues.
5. Fix Big-Endian / s390x Support (~10 issues, blocks CI)
Categories: Big-Endian (10), overlaps with test infrastructure
Every release breaks on big-endian:
- `ncx.m4` byte-swap bugs (Bug in ncx.m4 only exposed under certain (big-endian) build systems #3286)
- `byteswap8` static declaration conflict (4.7.4 fails to build on big endian architectures (error: static declaration of ‘byteswap8’ follows non-static declaration) #1687)
Fix: Add a big-endian CI workflow (#3282) and fix the `ncx.m4` byte-swap code. Most of these share the same root cause: untested byte-swap paths. A CI + `ncx.m4` fix would close all 10.
6. Fix VLEN/Compound Type Handling (~9 issues)
Categories: VLEN/Compound (9), overlaps with memory safety
VLEN types are a persistent source of crashes and data corruption:
- `charvlenbug` test failure (Error with new "charvlenbug" test #2160)
Fix: Audit the VLEN reclaim/allocation paths in `libhdf5` and `libdispatch`. The crashes (#2181, #2496) and the `charvlenbug` (#2160) likely share a root cause in how VLEN memory is managed during read-back with unlimited dimensions.
7. Harden Memory Safety in `libhdf5` (~14 issues)
Categories: Memory/Crash/Segfault (14)
Multiple fuzzer-found and user-reported crashes:
- … `get_attached_info` alone (SEGV in get_attached_info netcdf/libhdf5/hdf5open.c #2664, heap-buffer-overflow in get_attached_info netcdf/libhdf5/hdf5open.c #2666, memcpy-param-overlap in get_attached_info netcdf/libhdf5/hdf5open.c #2667, heap-buffer-overflow in NC4_get_vars netcdf/libhdf5/hdf5var.c #2668): heap overflows, SEGV, memcpy overlap
Fix: Add bounds checking and NULL guards in `hdf5open.c:get_attached_info()` and `hdf5var.c:NC4_get_vars()`. Fix the hashmap leak. This is ~4 functions that account for 8+ crash reports.
8. Fix Filter/Plugin Path & Discovery (~25 issues)
Categories: Filter/Plugin/Compression (25)
Plugin handling is broken in multiple ways:
- `nccopy` can't find filter plugins (filter plugin path with nccopy #3048)
- `make install` can't write to plugin dir (Make install cannot write to plugin directory #2381)
Fix: Centralize plugin path resolution in one function that checks `HDF5_PLUGIN_PATH`, the configure-time path, and the install-time path, in that order. Fix the CMake `find_package` for optional compression libs. Would close ~12 issues.
9. Thread Safety (~7 issues, high user impact)
Categories: Thread Safety (7)
Thread safety has been requested since 2017 (#382) and remains unfixed:
Fix: Implement per-thread HDF5 error stack isolation and audit global state in `libdispatch`. Even partial thread safety (read-only concurrent access) would satisfy most users and close 5+ issues.
10. Documentation Overhaul (~14 issues)
Categories: Documentation (14)
A recent audit found massive gaps:
- `libdap4`, `libdap2`, `libsrc`, `liblib` (much missing documentation in libsrcp #3274, documentation issues in liblib and examples #3275, missing documentation in libdap2 #3278, missing doxygen in libdap4 #3280)
- `libhdf5`/`libhdf4` (doc typos and minor mistakes in libhdf5 and libhdf4 #3267, documentation typos and minor mistakes in netcdf-4 doxygen docs #3266)
Fix: A systematic doxygen pass through the public API headers + fixing the doc build system (#2581) would close all 14 in one effort.
Summary Table
Total unique issues addressable: ~130-150 out of 285 (many issues span multiple categories, so the raw sum double-counts). The CMake/Windows cluster (item 2) and NCZarr (item 3) are the two highest-leverage targets by sheer volume.
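As a rough check on the double-counting remark, the sketch below adds up the approximate per-cluster counts from the ten section headings and compares the raw sum with the ~130-150 unique-issue estimate. The cluster labels are shorthand introduced here, the counts are the "~" figures from the headings rather than exact tallies, and reading the difference as the number of cross-listings is an assumption, not something the analysis states.

```python
# Approximate per-cluster issue counts, taken from the ten section headings.
# The labels are shorthand for this sketch, not names used by the project.
cluster_counts = {
    "stride performance": 15,
    "CMake / Windows builds": 35,
    "NCZarr": 25,
    "DAP2/DAP4": 24,
    "big-endian": 10,
    "VLEN/compound": 9,
    "memory safety": 14,
    "filters/plugins": 25,
    "thread safety": 7,
    "documentation": 14,
}

# Raw sum: an issue listed in several clusters is counted once per cluster.
raw_sum = sum(cluster_counts.values())

# Figures quoted in the closing paragraph.
unique_low, unique_high = 130, 150
total_open = 285

# Assumption: the gap between the raw sum and the unique estimate is
# entirely due to issues that appear in more than one cluster.
overlap_low = raw_sum - unique_high
overlap_high = raw_sum - unique_low

print(f"raw sum of cluster sizes: {raw_sum}")
print(f"implied cross-listings: {overlap_low} to {overlap_high}")
print(
    "share of open issues addressable: "
    f"{unique_low / total_open:.0%} to {unique_high / total_open:.0%}"
)
```

With the heading figures as given, the raw sum comes to 178, so under that assumption somewhere between 28 and 48 cluster memberships are repeats of issues already counted elsewhere.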