We Need a Big-Endian CI Workflow for netCDF-C #3282

# Big-Endian CI Workflow for netCDF-C

## Background and motivation

Most hardware in common use today (x86-64, and ARM and RISC-V in their usual configurations) is little-endian: multi-byte integers
are stored with the least-significant byte first. The netCDF-C library must also work
correctly on big-endian platforms, where the most-significant byte comes first. This
matters because:

1. **netCDF file format correctness**: The classic netCDF format stores data in
   big-endian (XDR) byte order on disk regardless of the host architecture. The
   conversion routines in `libsrc/ncx.m4` (and the generated `ncx.c`) handle this
   translation. Bugs in those routines may manifest only on big-endian hosts, where
   no byte-swapping is needed and different code paths are taken.

2. **Real-world deployments**: IBM z/Architecture (s390x) mainframes running Linux
   are big-endian and are used in production scientific and enterprise environments.
   netCDF-C should build and pass its test suite on these systems.

3. **Latent bugs**: Several bugs in netCDF-C are only exposed on big-endian platforms
   because the little-endian code path is exercised far more often. Without a CI job
   that runs on a big-endian host, these bugs go undetected until a user reports them.
   One such bug was found during the development of this workflow: an undeclared
   `fillp` variable in the `ncx_getn_float_float` and `ncx_getn_double_double`
   fallback paths in `libsrc/ncx.m4` (see bugs section below).
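
As a sanity check that a job really is running on a big-endian host, the host byte order can be probed from the shell. The sketch below is not taken from the workflow; the function name `host_byte_order` is hypothetical. It writes the four bytes `01 00 00 00` and asks `od` to read them back as one 4-byte unsigned integer in host byte order.

```sh
# Report the byte order of the machine this script runs on.
# The bytes 01 00 00 00 read as 1 on a little-endian host and as
# 16777216 (0x01000000) on a big-endian host.
host_byte_order() {
    value=$(printf '\001\000\000\000' | od -An -tu4 | tr -d ' \n')
    case "$value" in
        1)        echo little ;;
        16777216) echo big ;;
        *)        echo unknown ;;
    esac
}

host_byte_order
```

On an x86-64 runner this reports `little`; inside the emulated s390x container it should report `big`. A workflow step could fail early if the result is not the expected one.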

Since no big-endian hardware is available in GitHub Actions, we use QEMU software
emulation to run an s390x (IBM Z) environment on a standard x86-64 runner.

## Current state

Workflow file: `.github/workflows/run_tests_bigendian.yml`

## Final design

### Trigger
- `workflow_dispatch` (manual)
- `pull_request` targeting `main`

### Architecture
- `s390x` (IBM z/Architecture, big-endian) via `uraimo/run-on-arch-action@v2`
- `distro: ubuntu22.04` inside the QEMU container
- `ppc64` and `ubuntu24.04` are **not supported** by the action

### HDF5
- Installed from apt (`libhdf5-dev`, `libhdf5-103-1`) — **not built from source**
- Building from source fails because HDF5 configure tries to run compiled test binaries, which doesn't work under QEMU emulation
- HDF5 serial library path: `/usr/lib/s390x-linux-gnu/hdf5/serial`
- Headers: `/usr/include/hdf5/serial`
- Must pass `LIBS="-lhdf5_serial -lz"` explicitly — the zlib check fails otherwise

### Compiler
- Native `gcc`/`g++` installed inside the QEMU container (not cross-compilers)
- Do NOT use `--host=s390x-linux-gnu` — the `run:` block executes natively inside the emulated s390x container, so no cross-compilation is needed
- Cross-compiler packages (`gcc-s390x-linux-gnu`) caused GCC 11 internal compiler errors (ICEs) under QEMU; native gcc is stable

### Swap space
- 8 GB of swap is added on the host runner before QEMU starts, using a swap file created with `dd`
- Required because QEMU emulation is memory-intensive and GCC crashes (ICE/segfault) without sufficient memory
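
A minimal sketch of how such a swap file might be created on the host runner. This is not copied from the workflow file: the path `/swapfile`, the function names, and the reading of "8 GB" as 8 GiB are assumptions. It relies on GNU `dd` (for `bs=1M`) and the util-linux `mkswap` and `swapon` tools found on Ubuntu runners.

```sh
# Number of 1 MiB blocks dd must write for a swap file of N GiB.
swap_block_count() {
    echo $(( $1 * 1024 ))
}

# Create and enable a swap file. Needs root, so it is only defined here.
# Arguments: size in GiB, then an optional path (default /swapfile, an assumption).
add_swap() {
    size_gib=$1
    swapfile=${2:-/swapfile}
    sudo dd if=/dev/zero of="$swapfile" bs=1M count="$(swap_block_count "$size_gib")"
    sudo chmod 600 "$swapfile"   # mkswap warns about world-readable swap files
    sudo mkswap "$swapfile"
    sudo swapon "$swapfile"
    swapon --show                # record the active swap in the job log
}

# In a workflow step on the host runner, before the emulation step:
#   add_swap 8
```

Keeping the size calculation in its own small function makes the block count easy to check without touching the disk.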

### Configure flags
```
--enable-hdf5
--disable-dap
--disable-dap-remote-tests
--disable-nczarr
--disable-libxml2
--disable-shared
--enable-static
--disable-utilities      # skips ncgen/ncdump/ncrandom which triggered GCC ICEs
```
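
Put together with the HDF5 paths listed above, the configure step might look like the sketch below. The flag list and the `LIBS` value come from this issue; passing the header and library directories through `CPPFLAGS` and `LDFLAGS` is an assumption about how the workflow wires them in, and both function names are hypothetical.

```sh
# The configure flags from this issue, one per line.
netcdf_configure_flags() {
    cat <<'EOF'
--enable-hdf5
--disable-dap
--disable-dap-remote-tests
--disable-nczarr
--disable-libxml2
--disable-shared
--enable-static
--disable-utilities
EOF
}

# Run from the top of a netCDF-C source tree inside the s390x container.
# The unquoted $(...) is deliberate: each flag becomes its own argument.
run_configure() {
    ./configure $(netcdf_configure_flags) \
        CPPFLAGS="-I/usr/include/hdf5/serial" \
        LDFLAGS="-L/usr/lib/s390x-linux-gnu/hdf5/serial" \
        LIBS="-lhdf5_serial -lz"
}
```

`LIBS` is passed as a separate quoted argument because its value contains a space, so it cannot go through the word-split flag list.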

### Make
- `-j 1` throughout (parallel builds triggered GCC ICEs under QEMU)

## Bugs found and fixed during development

### `libsrc/ncx.m4` — `fillp` undeclared (fixed in master)
- `ncx_getn_float_float` and `ncx_getn_double_double` `#else` fallback paths
  called `ncx_get_float_float(xp, tp, fillp)` and `ncx_get_double_double(xp, tp, fillp)`
  but `fillp` was not a parameter of the enclosing `getn` function
- Only exposed when the `#else` branch is compiled (non-IEEE-754 or cross-compilation)
- Fixed by replacing `fillp` with `NULL` in both calls
- See `/home/ed/ncx_m4_fillp_bug.md` for full issue writeup

## Architectures tried and rejected
- `ppc64` — not supported by `uraimo/run-on-arch-action@v2`
- `ubuntu24.04` — not supported by `uraimo/run-on-arch-action@v2`
- `bullseye` (Debian) — apt mirrors incomplete/broken for s390x in the container
- Cross-compiler (`gcc-s390x-linux-gnu`) — GCC 11 ICEs on random files under QEMU

