netcdf simple_example {
dimensions:
x = 10;
y = 10;
variables:
float data(x, y);
data:
data = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97, 98, 99, 100 ;
}
Slower nc_open with version 4.9.3
Hello, thanks for your work on this wonderful package, and for the heroic efforts getting 4.9.3 onto conda-forge 😊
The nc_open() function looks to be significantly slower in version 4.9.3 versus 4.9.2.
Recorded times on my machine
- 400 microseconds slower; 2100 versus 1700
- A1B_north_america.nc: 5 milliseconds slower; 15 versus 10
dummy.nc CDL: the simple_example listing at the top of this post.
Why these small slowdowns still matter
SciTools/iris provides parallel computation+streaming of NetCDF files in distributed compute setups. This has seen successful operational use for several years (ping @bouweandela, @valeriupredoi). For this use-case, the small slowdowns seen here scale to multiple seconds for parallel computation of 1000 chunks or more.
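As a rough worked example of that scaling, the sketch below multiplies the per-call differences reported above by the number of opens. It assumes one nc_open() call per chunk and that the reported differences are representative; the function name and the labels are made up for illustration. The result is the extra time summed over all tasks, so the wall-clock effect depends on how far the opens end up serialized.

```python
# Per-call nc_open() differences reported above (4.9.3 minus 4.9.2),
# expressed in microseconds. Labels are illustrative only.
SLOWDOWN_US = {
    "first measurement": 2100 - 1700,            # 400 microseconds
    "A1B_north_america.nc": 15_000 - 10_000,     # 5 milliseconds
}


def total_added_seconds(n_opens: int, per_open_us: int) -> float:
    """Extra time summed over all opens, in seconds."""
    return n_opens * per_open_us / 1_000_000


for label, delta_us in SLOWDOWN_US.items():
    for n_chunks in (1_000, 10_000):
        extra = total_added_seconds(n_chunks, delta_us)
        print(f"{label}: {n_chunks} opens -> {extra:.1f} s extra")
```

With 1000 chunks this gives roughly 0.4 s for the smaller difference and 5 s for the larger one.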
Explanation
The distributed compute nodes must not simultaneously open a single NetCDF file for writing, since that causes concurrency errors at the HDF5 level. Such errors can be prevented with appropriate use of locking, and by ensuring that each parallel task opens and closes the NetCDF file independently, i.e. the file must not be left open after a parallel task completes. Thus, 1000 chunks = 1000 calls to nc_open() = 1000 times the per-call slowdown.
I have tried to mitigate this slowdown by having tasks 'share' an already open file handle (with careful locking), but I am not aware [1] of a way to achieve this 'sharing' between distributed compute nodes.
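The open-per-task pattern described above can be sketched as follows. This is only an illustration, not the Iris implementation: a plain binary file stands in for the NetCDF file, `threading.Lock` stands in for whatever distributed lock a real setup would use, and `write_chunk`, `CHUNK_SIZE` and `N_CHUNKS` are hypothetical names.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4   # bytes per chunk (illustrative)
N_CHUNKS = 8     # number of parallel tasks (illustrative)

file_lock = threading.Lock()   # stand-in for a distributed lock
open_count = 0                 # how many times the file was opened


def write_chunk(path: str, index: int) -> None:
    """Open the file, write one chunk, and close it, all under the lock."""
    global open_count
    payload = bytes([index]) * CHUNK_SIZE
    with file_lock:                      # only one writer at a time
        with open(path, "r+b") as fh:    # fresh handle for every task
            open_count += 1
            fh.seek(index * CHUNK_SIZE)
            fh.write(payload)
        # the handle is closed here, before the lock is released


# Pre-allocate the target file so every chunk has a slot to write into.
fd, target = tempfile.mkstemp()
os.close(fd)
with open(target, "wb") as fh:
    fh.write(b"\x00" * CHUNK_SIZE * N_CHUNKS)

with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda i: write_chunk(target, i), range(N_CHUNKS)))

with open(target, "rb") as fh:
    result = fh.read()
os.remove(target)

print(f"{open_count} opens for {N_CHUNKS} chunks")
```

Because every task pays for its own open and close, any per-open cost is paid once per chunk, which is why a small regression in open time adds up.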
Info for reproducing
NetCDF config
Expand for `nc-config --all` output
Environment
OS: RHEL9
gcc info: gcc (conda-forge gcc 15.2.0-7) 15.2.0
Expand for Conda environment spec
Code
(With help from Copilot as my C skills are limited).
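For readers who want a comparable measurement from Python, a minimal timing sketch might look like the one below. It is not the C benchmark referred to in this section: `median_open_close_us` is a hypothetical helper, and the demo uses the built-in `open` on a throwaway file so that it runs without any NetCDF library. Assuming netCDF4-python is installed, passing `netCDF4.Dataset` as the `opener` should time an open and close of a real NetCDF file, though Python overhead would then be included in the numbers.

```python
import os
import statistics
import tempfile
import time


def median_open_close_us(path, opener=open, repeats=200):
    """Median time for one open+close of `path`, in microseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        handle = opener(path)
        handle.close()
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.median(samples)


# Demo on a throwaway file with the built-in opener.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
median_us = median_open_close_us(demo_path, repeats=50)
os.remove(demo_path)
print(f"median open+close: {median_us:.1f} us")
```

Reporting the median rather than the mean keeps one-off outliers, such as a cold file cache on the first call, from dominating the result.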
Compiling
gcc -o ncopen_bench ncopen_bench.c -lnetcdf
Footnotes
1. We are working at the Python level.