Allow parallel I/O writes to use zlib

Allow users to apply zlib compression to parallel writes. This feature is needed urgently by NOAA, and it is no doubt desired by many other large data producers.

Large data producers like NOAA are turning to netCDF-4 because of its built-in compression. In HPC applications especially, they write very large files (tens of GB per file is normal, even with compression).

Since these files are produced on an HPC system, the data start out distributed across many processors. Due to the current limitation of the netcdf-c library, users must move all data to one processor and have that one processor do all compression and writing sequentially. This slows down their very expensive machine!

More importantly, it means that compromises must be made in what is saved. Instead of saving every value that the science team wants, only a subset of the data is saved. This makes the science harder, but it keeps operations within their time limits.

Allowing users to write compressed data with parallel I/O will save them the step of moving all data to one processor, and will also allow the compression work to be spread across many processors instead of just one. This will improve their write time by almost an order of magnitude.
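To illustrate why the compression step can be spread across processors, here is a minimal sketch in Python. It is not the netcdf-c API and not a proposed implementation: it uses only the standard `zlib` module, worker threads stand in for the processors, and the helper names (`split_into_chunks`, `compress_serial`, `compress_parallel`) are hypothetical. The point it shows is that each chunk is deflated independently, so the one-processor and many-worker paths produce identical compressed chunks.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

DEFLATE_LEVEL = 4  # assumed level, chosen only for illustration


def split_into_chunks(data, n_chunks):
    """Split a non-empty bytes object into at most n_chunks contiguous pieces."""
    size = -(-len(data) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_chunk(chunk):
    """Deflate one chunk on its own; no state is shared between chunks."""
    return zlib.compress(chunk, DEFLATE_LEVEL)


def compress_serial(chunks):
    """Current workaround: a single worker compresses every chunk in turn."""
    return [compress_chunk(c) for c in chunks]


def compress_parallel(chunks, workers=4):
    """Requested behavior, sketched: chunks are compressed concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_chunk, chunks))


# 1 MiB of sample data standing in for a model output field.
data = bytes(range(256)) * 4096
chunks = split_into_chunks(data, 8)

serial = compress_serial(chunks)
parallel = compress_parallel(chunks)

# Decompressing the chunks in order recovers the original data.
restored = b"".join(zlib.decompress(c) for c in parallel)
```

Because the chunks share no compression state, the work divides cleanly among workers. In a real parallel writer the remaining coordination would be in placing the variable-sized compressed chunks in the file, which this sketch does not attempt.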

A decrease in I/O time leaves more time for computation, allowing the science team to increase resolution or use more intensive algorithms while still meeting operational requirements.

Thus, improving I/O times directly benefits the science.

Allow parallel I/O writes to use zlib #1580
