Big-Endian CI Workflow for netCDF-C
Background and motivation
Most modern hardware — x86-64, ARM, RISC-V — is little-endian: multi-byte integers
are stored with the least-significant byte first. The netCDF-C library must also work
correctly on big-endian platforms, where the most-significant byte comes first. This
matters because:
- netCDF file format correctness: The classic netCDF format stores data in
big-endian (XDR) byte order on disk regardless of the host architecture. The
conversion routines in libsrc/ncx.m4 (and the generated ncx.c) handle this
translation. Bugs in those routines may only manifest on big-endian hosts where
no byte-swapping is needed and different code paths are taken.
- Real-world deployments: IBM z/Architecture (s390x) mainframes running Linux
are big-endian and are used in production scientific and enterprise environments.
netCDF-C should build and pass its test suite on these systems.
- Latent bugs: Several bugs in netCDF-C are only exposed on big-endian platforms
because the little-endian code path is exercised far more often. Without a CI job
that runs on a big-endian host, these bugs go undetected until a user reports them.
One such bug was found during the development of this workflow: an undeclared
fillp variable in the ncx_getn_float_float and ncx_getn_double_double
fallback paths in libsrc/ncx.m4 (see bugs section below).
Since no big-endian hardware is available in GitHub Actions, we use QEMU software
emulation to run an s390x (IBM Z) environment on a standard x86-64 runner.
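
To make the byte-order point concrete, here is a minimal sketch of the two code paths described above: on a big-endian host, writing a 32-bit value in XDR order is a plain copy, while a little-endian host must reverse the bytes. The function names (`host_is_big_endian`, `put_u32_xdr`, `get_u32_xdr`) are hypothetical and chosen for illustration only; they are not the routines in libsrc/ncx.m4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian host.
 * Hypothetical helper, not part of netCDF-C. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);   /* inspect the byte at the lowest address */
    return first_byte == 0x01;
}

/* Store a 32-bit value as 4 bytes in big-endian (XDR) order. */
void put_u32_xdr(unsigned char out[4], uint32_t value)
{
    assert(out != NULL);
    if (host_is_big_endian()) {
        /* Big-endian host: memory layout already matches XDR, no swap. */
        memcpy(out, &value, 4);
    } else {
        /* Little-endian host: reverse the byte order. */
        unsigned char host[4];
        memcpy(host, &value, 4);
        out[0] = host[3];
        out[1] = host[2];
        out[2] = host[1];
        out[3] = host[0];
    }
}

/* Load a 32-bit value from 4 bytes in big-endian (XDR) order. */
uint32_t get_u32_xdr(const unsigned char in[4])
{
    uint32_t value;
    assert(in != NULL);
    if (host_is_big_endian()) {
        memcpy(&value, in, 4);
    } else {
        const unsigned char host[4] = { in[3], in[2], in[1], in[0] };
        memcpy(&value, host, 4);
    }
    return value;
}
```

Both branches produce the same external bytes, but only one of them runs on any given host. A test suite that runs only on little-endian machines therefore never executes the copy-only branch, which is the gap this workflow is meant to close.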
Current state
Workflow file: .github/workflows/run_tests_bigendian.yml
Final design
Trigger
- workflow_dispatch (manual)
- pull_request targeting main
Architecture
- s390x (IBM z/Architecture, big-endian) via uraimo/run-on-arch-action@v2
- distro: ubuntu22.04 inside the QEMU container
- ppc64 and ubuntu24.04 are not supported by the action
HDF5
- Installed from apt (libhdf5-dev, libhdf5-103-1), not built from source
- Building from source fails because the HDF5 configure script tries to run compiled test binaries, which does not work under QEMU emulation
- HDF5 serial library path: /usr/lib/s390x-linux-gnu/hdf5/serial
- Headers: /usr/include/hdf5/serial
- Must pass LIBS="-lhdf5_serial -lz" explicitly; the zlib check fails otherwise
Compiler
- Native gcc/g++ installed inside the QEMU container (not cross-compilers)
- Do NOT use --host=s390x-linux-gnu: the run: block executes natively inside the emulated s390x container, so no cross-compilation is needed
- Cross-compiler packages (gcc-s390x-linux-gnu) caused GCC 11 ICEs under QEMU; native gcc is stable
Swap space
- 8GB swap added on the host runner via dd before QEMU starts
- Required because QEMU emulation is memory-intensive and GCC crashes (ICE/segfault) without sufficient memory
Configure flags
--enable-hdf5
--disable-dap
--disable-dap-remote-tests
--disable-nczarr
--disable-libxml2
--disable-shared
--enable-static
--disable-utilities # skips ncgen/ncdump/ncrandom which triggered GCC ICEs
Make
-j 1 throughout (parallel builds triggered GCC ICEs under QEMU)
Bugs found and fixed during development
libsrc/ncx.m4 — fillp undeclared (fixed in master)
- The #else fallback paths of ncx_getn_float_float and ncx_getn_double_double called ncx_get_float_float(xp, tp, fillp) and ncx_get_double_double(xp, tp, fillp), but fillp was not a parameter of the enclosing getn function
- Only exposed when the #else branch is compiled (non-IEEE-754 or cross-compilation)
- Fixed by replacing fillp with NULL in both calls
- See /home/ed/ncx_m4_fillp_bug.md for full issue writeup
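
The shape of this bug and its fix can be sketched in a few lines of C. This is a simplified illustration, not the actual code generated from libsrc/ncx.m4: the `demo_` names and their signatures are hypothetical stand-ins, and the element getter here copies bytes in host order rather than converting from XDR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-element getter: copies one float from the external
 * buffer into *ip. fillp is an optional fill-value pointer, unused here. */
int demo_get_float(const void *xp, float *ip, const void *fillp)
{
    (void)fillp;
    assert(xp != NULL && ip != NULL);
    memcpy(ip, xp, sizeof(float));
    return 0;
}

/* Hypothetical bulk getter. It has no fillp parameter, so forwarding
 * `fillp` to the element getter cannot compile; NULL is passed instead. */
int demo_getn_float(const void **xpp, size_t nelems, float *tp)
{
    const char *xp = (const char *)*xpp;
    int status = 0;
    size_t i;

    for (i = 0; i < nelems; i++, xp += sizeof(float), tp++) {
        /* Buggy form: demo_get_float(xp, tp, fillp);   fillp is undeclared */
        const int lstatus = demo_get_float(xp, tp, NULL);
        if (status == 0) {
            status = lstatus;   /* remember the first error, keep going */
        }
    }

    *xpp = (const void *)xp;    /* advance the caller's cursor */
    return status;
}
```

Because the failing call sits inside a preprocessor branch, the compiler only sees the undeclared identifier on configurations that select that branch, which is why the error went unnoticed on the commonly built platforms.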
Architectures tried and rejected
- ppc64: not supported by uraimo/run-on-arch-action@v2
- ubuntu24.04: not supported by uraimo/run-on-arch-action@v2
- bullseye (Debian): apt mirrors incomplete/broken for s390x in the container
- Cross-compiler (gcc-s390x-linux-gnu): GCC 11 ICEs on random files under QEMU