I suggest we fully support the Zstandard library in the netCDF C, Fortran, and Java libraries. (This suggestion is part of a larger effort to improve netCDF compression; see #1545.)
Compression is Essential, and Zlib is Slow
Compression has become a core component of netCDF. Large data producers, like NOAA, NASA, and ESA, all must compress their data in order to fit into their storage systems, which were sized with the assumption that data compression would be used operationally. That is, none of these data providers have the disk space to process and store these data uncompressed. Compression is essential.
The slowness of zlib on today's giant data files has become a real problem for ESA's Copernicus satellite program, which has been generating large amounts of netCDF-4 data for many years. As with NASA Earth-observing missions, the remarkable increase in the capabilities of space instruments has been accompanied by a dramatic increase in the size of the data, and the slow speed of zlib is becoming a serious problem.
As documented in #1545, NOAA is also running into the limits of zlib. For the recent GFS release, we dodged the bullet by adding the ability to use zlib with parallel I/O to netcdf-c-4.7.4. This allowed NOAA to regain its lost time budget and write the forecast data on time. But future versions of the GFS are planned with higher resolution. As we all understand, doubling the resolution in time and in each of the three spatial dimensions multiplies the data volume by 2^4 = 16, and, consequently, the time required to compress the data.
Why ZStandard?
The Zstandard compression library, currently available to netCDF users through the CCR project, provides significant performance improvements over zlib.
This chart from our recent AGU extended abstract shows the typical performance improvement achieved with Zstandard:

When used with the new quantize feature, Zstandard performance is even more impressive:

Zstandard also offers a much wider range of trade-offs between compression and speed. For example, in Klöwer's paper a setting of -10 was used to get very fast compression, which reduced the data size by only about 50%. This flexibility allows operational requirements to be met. Zlib does not offer this range of speed/compression trade-off: it is not possible to make zlib go much faster, whatever setting is used.
Zstandard is free and open-source software, readily available on all platforms: Unix, macOS, and Windows. A reference C library is available which can be compiled anywhere, and pre-built Zstandard libraries are available in most package management systems. A pure-Java implementation is also available from the Zstandard website.
The Zstandard web site displays a table comparing Zstandard performance to other compression libraries, including zlib. They show somewhat better performance than I have measured, but in the same range: a 5x speedup for writing and 4x for reading. (I have not measured read time; NOAA is more concerned with write times.)
Implementation
How would this look in implementation?
C and Fortran
The C and Fortran implementations are available in the CCR. They are very similar to the existing functions which currently support zlib:
int nc_def_var_zstandard(int ncid, int varid, int level);
int nc_inq_var_zstandard(int ncid, int varid, int *zstandardp, int *levelp);
In Fortran:
function nf90_def_var_zstandard(ncid, varid, level) result(status)
function nf90_inq_var_zstandard(ncid, varid, zstandardp, levelp) result(status)
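To make the API concrete, here is a minimal sketch of how a writer might enable Zstandard on a variable, assuming the CCR function above together with standard netCDF calls (the file and variable names are illustrative, and error handling is abbreviated):

```c
#include <netcdf.h>
#include <stdio.h>
#include <stdlib.h>

#define ERR(e) do { if (e) { fprintf(stderr, "%s\n", nc_strerror(e)); exit(1); } } while (0)

int main(void) {
    int ncid, dimid, varid;
    size_t len = 1024;
    float data[1024];

    for (size_t i = 0; i < len; i++)
        data[i] = (float)i;

    ERR(nc_create("zstd_test.nc", NC_NETCDF4, &ncid));
    ERR(nc_def_dim(ncid, "x", len, &dimid));
    ERR(nc_def_var(ncid, "var", NC_FLOAT, 1, &dimid, &varid));

    /* HDF5 filters require chunked storage; use one chunk of the full dimension. */
    size_t chunksize = len;
    ERR(nc_def_var_chunking(ncid, varid, NC_CHUNKED, &chunksize));

    /* Apply Zstandard compression at level 4 (CCR API). */
    ERR(nc_def_var_zstandard(ncid, varid, 4));

    ERR(nc_put_var_float(ncid, varid, data));
    ERR(nc_close(ncid));
    return 0;
}
```

A reader would need no new calls: once the Zstandard filter is installed, nc_get_var_float() decompresses transparently, just as with zlib.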
The build systems would be updated to optionally include Zstandard, as Szip is handled now. Extra items in the build summary and netcdf_meta.h will help end users know if Zstandard is supported in any particular build. The Zstandard filter code would be included with netcdf-c, and installed in the HDF5 filter directory for the user.
(If released in the netCDF C and Fortran libraries, these functions would be removed from CCR.)
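For build detection, netcdf_meta.h could carry a feature macro alongside the existing NC_HAS_* flags; the name NC_HAS_ZSTD below is only a suggestion, not an existing symbol:

```c
#include <netcdf_meta.h>
#include <stdio.h>

int main(void) {
#if defined(NC_HAS_ZSTD) && NC_HAS_ZSTD
    printf("This netCDF build supports Zstandard.\n");
#else
    printf("No Zstandard support in this build.\n");
#endif
    return 0;
}
```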
NetCDF-Java
There is a pure-Java Zstandard library. Code to use it as a filter would have to be written by the Unidata netCDF-Java team.
My understanding is that only Java read code would have to be modified. The pure Java part of netCDF-Java only reads NetCDF/HDF5 files. In this case the use of Zstandard would be similar to the way zlib is handled.
The NetCDF-Java library would be able to use the C library to write netCDF/HDF5 files with Zstandard compression, but support would have to be added to call the new functions.
Conclusion
The inclusion of Zstandard in all netCDF libraries will be a significant improvement for large data producers such as NOAA, NASA, and ESA. Zstandard provides both better performance and a much wider performance range, allowing data producers to tune the trade-off between compression ratio and speed to match sometimes stringent operational requirements.
All readers of compressed data will also benefit from a performance speedup with Zstandard.
The greater ease of handling larger and larger data sets will also bring benefit to the science community, as more data can be written and distributed within the operational constraints of large data producers.
Although these benefits are currently available if all concerned install CCR, I believe there will be significant benefit to the netCDF community if Zstandard were supported out of the box by all netCDF libraries.
@DennisHeimbigner @WardF @dopplershift @lesserwhirls @czender and @gsjaardema your input is very much requested...
References
CCR web site: https://github.com/ccr/ccr
Zstandard web site: http://facebook.github.io/zstd/
Milan Klöwer, Miha Razinger, Juan J. Dominguez, Peter D. Düben & Tim N. Palmer, Compressing atmospheric data into its real information content: https://www.nature.com/articles/s43588-021-00156-2
Quantization and Next-Generation Zlib Compression for Fully Backward-Compatible, Faster, and More Effective Data Compression in NetCDF Files, Hartnett, Zender, Fisher, Heimbigner, Hang, Gerheiser, Curtis: https://www.researchgate.net/publication/357001251_Quantization_and_Next-Generation_Zlib_Compression_for_Fully_Backward-Compatible_Faster_and_More_Effective_Data_Compression_in_NetCDF_Files