Skip to content

Commit fb40a72

Browse files
Improve performance of the nc_reclaim_data and nc_copy_data functions.
re: Issue Unidata#2685 re: PR Unidata#2179 As noted in PR Unidata#2179, the old code did not allow for reclaiming instances of types, nor for properly copying them. That PR provided new functions capable of reclaiming/copying instances of arbitrary types. However, as noted by Issue Unidata#2685, using these most general functions resulted in a significant performance degradation, even for common cases. This PR attempts to mitigate the cost of using the general reclaim/copy functions in two ways. First, the previous functions operating at the top level by using ncid and typeid arguments. These functions were augmented with equivalent versions that used the netcdf-c library internal data structures to allow direct access to needed information. These new functions are used internally to the library. The second mitigation involves optimizing the internal functions by providing early tests for common cases. This avoids unnecessary recursive function calls. The overall result is a significant improvement in speed by a factor of roughly twenty -- your mileage may vary. These optimized functions are still not as fast as the original (more limited) functions, but they are getting close. Additional optimizations are possible. But the cost is a significant "uglification" of the code that I deemed a step too far, at least for now. ## Misc. Changes 1. Added a test case to check the proper reclamation/copy of complex types. 2. Found and fixed some places where nc_reclaim/copy should have been used. 3. Replaced, in the netcdf-c library, (almost all) occurrences of nc_reclaim_copy with calls to NC_reclaim/copy. This plus the optimizations is the primary speed-up mechanism. 4. In DAP4, the metadata is held in a substrate in-memory file; this required some changes so that the reclaim/copy code accessed that substrate dispatcher rather than the DAP4 dispatcher. 5. Re-factored and isolated the code that computes if a type is (transitively) variable-sized or not. 6. Clean up the reclamation code in ncgen; adding the use of nc_reclaim exposed some memory problems.
1 parent bfb8a31 commit fb40a72

49 files changed

Lines changed: 1702 additions & 815 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

RELEASE_NOTES.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ This file contains a high-level description of this package's evolution. Release
88
## 4.9.3 - TBD
99

1010

11+
* Improve performance and testing of the new nc_reclaim/copy functions. See [Github #????](https://github.com/Unidata/netcdf-c/pull/????).
1112
* Improve S3 documentation, testing, and support See [Github #2686](https://github.com/Unidata/netcdf-c/pull/2686).
1213
* Remove obsolete code. See [Github #2680](https://github.com/Unidata/netcdf-c/pull/2680).
1314
* [Bug Fix] Add a crude test to see if an NCZarr path looks like a valid NCZarr/Zarr file. See [Github #2658](https://github.com/Unidata/netcdf-c/pull/2658).

configure.ac

Lines changed: 14 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1488,10 +1488,14 @@ AC_CHECK_SIZEOF(ssize_t)
14881488
$SLEEPCMD
14891489
AC_CHECK_SIZEOF([void*])
14901490

1491-
if test "x$enable_hdf5" = xyes || test "x$enable_dap" = xyes ; then
1492-
AC_SEARCH_LIBS([deflate], [zlibwapi zlibstat zlib zlib1 z], [], [
1493-
AC_MSG_ERROR([Can't find or link to the z library. Turn off netCDF-4 and \
1494-
DAP clients with --disable-hdf5 --disable-dap, or see config.log for errors.])])
1491+
# Check for deflate library
1492+
AC_SEARCH_LIBS([deflate], [zlibwapi zlibstat zlib zlib1 z], [have_deflate=yes], [have_deflate=no])
1493+
AC_MSG_CHECKING([whether deflate library is available])
1494+
AC_MSG_RESULT([${have_deflate}])
1495+
1496+
if test "x$have_deflate" = xno ; then
1497+
AC_MSG_ERROR([Can't find or link to the z library. Turn off netCDF-4 and \
1498+
DAP and NCZarr clients with --disable-hdf5 --disable-dap --disable-nczarr, or see config.log for errors.])
14951499
fi
14961500

14971501
# We need the math library
@@ -1857,6 +1861,7 @@ AM_CONDITIONAL(ENABLE_S3_TESTPUB, [test "x$with_s3_testing" != xno]) # all => pu
18571861
AM_CONDITIONAL(ENABLE_S3_TESTALL, [test "x$with_s3_testing" = xyes])
18581862
AM_CONDITIONAL(ENABLE_NCZARR_ZIP, [test "x$enable_nczarr_zip" = xyes])
18591863
AM_CONDITIONAL(HAS_MULTIFILTERS, [test "x$has_multifilters" = xyes])
1864+
AM_CONDITIONAL(HAVE_DEFLATE, [test "x$have_deflate" = xyes])
18601865
AM_CONDITIONAL(HAVE_SZ, [test "x$have_sz" = xyes])
18611866
AM_CONDITIONAL(HAVE_H5Z_SZIP, [test "x$enable_hdf5_szip" = xyes])
18621867
AM_CONDITIONAL(HAVE_BLOSC, [test "x$have_blosc" = xyes])
@@ -1971,6 +1976,7 @@ AC_SUBST(DO_NCZARR_ZIP_TESTS,[$enable_nczarr_zip])
19711976
AC_SUBST(HAS_QUANTIZE,[$enable_quantize])
19721977
AC_SUBST(HAS_LOGGING,[$enable_logging])
19731978
AC_SUBST(DO_FILTER_TESTS,[$enable_filter_testing])
1979+
AC_SUBST(HAS_DEFLATE,[$have_deflate])
19741980
AC_SUBST(HAS_SZLIB,[$have_sz])
19751981
AC_SUBST(HAS_SZLIB_WRITE, [$have_sz])
19761982
AC_SUBST(HAS_ZSTD,[$have_zstd])
@@ -1987,8 +1993,10 @@ AC_SUBST(WHICH_S3_SDK,[none])
19871993
fi
19881994

19891995
# Always available
1990-
std_filters="deflate bz2"
1991-
1996+
std_filters="bz2"
1997+
if test "x$have_deflate" = xyes ; then
1998+
std_filters="${std_filters} deflate"
1999+
fi
19922000
if test "x$have_sz" = xyes ; then
19932001
std_filters="${std_filters} szip"
19942002
fi

docs/cloud.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -258,7 +258,7 @@ This is an experimental SDK provided internally in the netcdf-c library.
258258

259259
* It is written in C
260260
* It provides the minimum functionality necessary to read/write/search an Amazon S3 bucket.
261-
* It was constructed by heavily modifying the HDF5 *H5FDros3* Virtual File Driver and combining it with crypto code wrappers provided by libcurl. The resulting file was then modified to fit into the netcdf coding style.
261+
* It was constructed by heavily modifying the HDF5 *H5Fs3commons.c* file and combining it with crypto code wrappers provided by libcurl. The resulting file was then modified to fit into the netcdf coding style.
262262
* The resulting code is rather ugly, but appears to work under at least Linux and under Windows (using Visual C++).
263263

264264
### Dependencies

docs/internal.md

Lines changed: 72 additions & 57 deletions
Original file line numberDiff line numberDiff line change
@@ -4,10 +4,14 @@ Notes On the Internals of the NetCDF-C Library
44

55
# Notes On the Internals of the NetCDF-C Library {#intern_head}
66

7-
\tableofcontents
8-
97
This document attempts to record important information about
108
the internal architecture and operation of the netcdf-c library.
9+
It covers the following issues.
10+
11+
* [Including C++ Code in the netcdf-c Library](#intern_c++)
12+
* [Managing instances of variable-length data types](#intern_vlens)
13+
* [Inferring File Types](#intern_infer)
14+
* [Adding a Standard Filter](#intern_filters)
1115

1216
# 1. Including C++ Code in the netcdf-c Library {#intern_c++}
1317

@@ -41,46 +45,71 @@ AM_CXXFLAGS = -std=c++11
4145
````
4246
It is possible that other values (e.g. *-std=c++14*) may also work.
4347

44-
# 2. Managing instances of complex data types
48+
# 2. Managing instances of variable-length data types {#intern_vlens}
4549

4650
For a long time, there have been known problems with the
4751
management of complex types containing VLENs. This also
48-
involves the string type because it is stored as a VLEN of
49-
chars.
52+
involves the string type because it is stored as a VLEN of chars.
5053

51-
The term complex type refers to any type that directly or
52-
recursively references a VLEN type. So an array of VLENS, a
53-
compound with a VLEN field, and so on.
54+
The term "variable-length" refers to any
55+
type that directly or recursively references a VLEN type. So an
56+
array of VLENS, a compound with a VLEN field, and so on.
5457

55-
In order to properly handle instances of these complex types, it
58+
In order to properly handle instances of these variable-length types, it
5659
is necessary to have function that can recursively walk
5760
instances of such types to perform various actions on them. The
5861
term "deep" is also used to mean recursive.
5962

60-
Two deep walking operations are provided by the netcdf-c library
61-
to aid in managing instances of complex structures.
62-
- free'ing an instance of the complex type
63-
- copying an instance of the complex type.
63+
Two primary deep walking operations are provided by the netcdf-c library
64+
to aid in managing instances of variable-length types.
65+
- free'ing an instance of the type
66+
- copying an instance of the type.
67+
68+
Note that the term "vector" is used in the following text to
69+
mean a contiguous (in memory) sequence of instances of some
70+
type. Given an array with, say, dimensions 2 X 3 X 4, this will
71+
be stored in memory as a vector of length 2\*3\*4=24 instances.
72+
73+
## Special case reclamation functions
74+
75+
Previously The netcdf-c library provided functions to reclaim an
76+
instance of a subset of the possible variable-length types. It
77+
provided no specialized support for copying instances of
78+
variable-length types.
79+
80+
These specialized functions are still available and can be used
81+
when their pre-conditions are met. They have the advantage of
82+
being faster than the general purpose functions described
83+
later in this document.
6484

65-
Previously The netcdf-c library only did shallow free and shallow copy of
66-
complex types. This meant that only the top level was properly
67-
free'd or copied, but deep internal blocks in the instance were
68-
not touched. This led to a host of memory leaks and failures
69-
when the deep data was effectively shared between the netcdf-c library
70-
internally and the user's data.
85+
These functions, and their pre and post conditions, are as follows.
86+
* *int nc_free_string(size_t len, char\* data[])*
87+
<u>Action</u>: reclaim a vector of variable length strings.
88+
<u>Pre-condition(s)</u>: the data argument is a vector containing *len* pointers to variable length, nul-terminated strings.
89+
<u>Post-condition(s)</u>: only the strings in the vector are reclaimed -- the vector itself is not reclaimed.
7190

72-
Note that the term "vector" is used to mean a contiguous (in
73-
memory) sequence of instances of some type. Given an array with,
74-
say, dimensions 2 X 3 X 4, this will be stored in memory as a
75-
vector of length 2\*3\*4=24 instances.
91+
* int nc_free_vlens(size_t len, nc_vlen_t vlens[])
92+
<u>Action</u>: reclaim a vector of VLEN instances.
93+
<u>Pre-condition(s)</u>: the data argument is a vector containing *len* pointers to variable length, nul-terminated strings.
94+
<u>Post-condition(s)</u>:
95+
* only the data pointed to by the VLEN instances (i.e. *nc_vlen_t.p* in the vector are reclaimed -- the vector itself is not reclaimed.
96+
* the base type of the VLEN must be a fixed-size type -- this means atomic types in the range NC_CHAR thru NC_UINT64, and compound types where the basetype of each field is itself fixed-size.
7697

77-
The use cases are primarily these.
98+
* *int nc_free_vlen(nc_vlen_t \*vl)*
99+
<u>Action</u>: this is equivalent to calling *nc_free_vlens(1,&vl)*.
78100

79-
## nc\_get\_vars
101+
If the pre and post conditions are not met, then using these
102+
functions can lead to a host of memory leaks and failures
103+
because the deep data variable-length data is in effect
104+
shared between the netcdf-c library internally and the user's data.
105+
106+
## Typical use cases
107+
108+
### *nc\_get\_vars*
80109
Suppose one is reading a vector of instances using nc\_get\_vars
81110
(or nc\_get\_vara or nc\_get\_var, etc.). These functions will
82111
return the vector in the top-level memory provided. All
83-
interior blocks (form nested VLEN or strings) will have been
112+
interior blocks (from nested VLEN or strings) will have been
84113
dynamically allocated. Note that computing the size of the vector
85114
may be tricky because the strides must be taken into account.
86115

@@ -90,14 +119,7 @@ memory leak occurs. So, the recursive reclaim function is used
90119
to walk the returned instance vector and do a deep reclaim of
91120
the data.
92121

93-
Currently functions are defined in netcdf.h that are supposed to
94-
handle this: nc\_free\_vlen(), nc\_free\_vlens(), and
95-
nc\_free\_string(). Unfortunately, these functions only do a
96-
shallow free, so deeply nested instances are not properly
97-
handled by them. They are marked in the description as
98-
deprecated in favor of the newer recursive function.
99-
100-
## nc\_put\_vars
122+
### *nc\_put\_vars*
101123

102124
Suppose one is writing a vector of instances using nc\_put\_vars
103125
(or nc\_put\_vara or nc\_put\_var, etc.). These functions will
@@ -113,7 +135,10 @@ So again, the recursive reclaim function can be used
113135
to walk the returned instance vector and do a deep reclaim of
114136
the data.
115137

116-
## nc\_put\_att
138+
WARNING: If the data passed into these functions contains statically
139+
allocated data, then using any of the reclaim functions will fail.
140+
141+
### *nc\_put\_att*
117142
Suppose one is writing a vector of instances as the data of an attribute
118143
using, say, nc\_put\_att.
119144

@@ -122,26 +147,18 @@ so that changes/reclamation of the input data will not affect
122147
the attribute. Note that this copying behavior is different from
123148
writing to a variable, where the data is written immediately.
124149

125-
Again, the code inside the netcdf library used to use only shallow copying
126-
rather than deep copy. As a result, one saw effects such as described
127-
in Github Issue https://github.com/Unidata/netcdf-c/issues/2143.
128-
129-
Also, after defining the attribute, it may be necessary for the user
150+
After defining the attribute, it may be necessary for the user
130151
to free the data that was provided as input to nc\_put\_att() as in the
131152
nc\_put\_xxx functions (previously described).
132153

133-
## nc\_get\_att
154+
### *nc\_get\_att*
134155
Suppose one is reading a vector of instances as the data of an attribute
135156
using, say, nc\_get\_att.
136157

137158
Internally, the existing attribute data must be copied and returned
138159
to the caller, and the caller is responsible for reclaiming
139160
the returned data.
140161

141-
Again, the code inside the netcdf library used to only do shallow copying
142-
rather than deep copy. So this could lead to memory leaks and errors
143-
because the deep data was shared between the library and the user.
144-
145162
## New Instance Walking API
146163

147164
Proper recursive functions were added to the netcdf-c library to
@@ -156,27 +173,25 @@ int nc_reclaim_data_all(int ncid, nc_type xtypeid, void* memory, size_t count);
156173
int nc_copy_data(int ncid, nc_type xtypeid, const void* memory, size_t count, void* copy);
157174
int nc_copy_data_all(int ncid, nc_type xtypeid, const void* memory, size_t count, void** copyp);
158175
````
159-
There are two variants. The first two, nc\_reclaim\_data() and
176+
There are two variants. The two functions, nc\_reclaim\_data() and
160177
nc\_copy\_data(), assume the top-level vector is managed by the
161178
caller. For reclaim, this is so the user can use, for example, a
162179
statically allocated vector. For copy, it assumes the user
163180
provides the space into which the copy is stored.
164181

165-
The second two, nc\_reclaim\_data\_all() and
166-
nc\_copy\_data\_all(), allows the functions to manage the
182+
The other two, nc\_reclaim\_data\_all() and
183+
nc\_copy\_data\_all(), allow the called functions to manage the
167184
top-level. So for nc\_reclaim\_data\_all, the top level is
168185
assumed to be dynamically allocated and will be free'd by
169186
nc\_reclaim\_data\_all(). The nc\_copy\_data\_all() function
170187
will allocate the top level and return a pointer to it to the
171188
user. The user can later pass that pointer to
172189
nc\_reclaim\_data\_all() to reclaim the instance(s).
173190

174-
# Internal Changes
191+
## Internal Changes
175192
The netcdf-c library internals are changed to use the proper reclaim
176193
and copy functions. This also allows some simplification of the code
177194
since the stdata and vldata fields of NC\_ATT\_INFO are no longer needed.
178-
Currently this is commented out using the SEPDATA \#define macro.
179-
When the bugs are found and fixed, all this code will be removed.
180195

181196
## Optimizations
182197

@@ -202,7 +217,7 @@ The classification of types can be made at the time the type is defined
202217
or is read in from some existing file. The reclaim and copy functions
203218
use this information to speed up the handling of fixed size types.
204219

205-
# Warnings
220+
## Warnings
206221

207222
1. The new API functions require that the type information be
208223
accessible. This means that you cannot use these functions
@@ -228,7 +243,7 @@ use this information to speed up the handling of fixed size types.
228243
}
229244
````
230245

231-
# 3. Inferring File Types
246+
# 3. Inferring File Types {#intern_infer}
232247

233248
As described in the companion document -- docs/dispatch.md --
234249
when nc\_create() or nc\_open() is called, it must figure out what
@@ -478,7 +493,7 @@ So for example, if DAP2 is the model, then all netcdf-4 mode flags
478493
and some netcdf-3 flags are removed from the set of mode flags
479494
because DAP2 provides only a standard netcdf-classic format.
480495

481-
# 4. Adding a Standard Filter
496+
# 4. Adding a Standard Filter {#intern_filters}
482497

483498
The standard filter system extends the netcdf-c library API to
484499
support a fixed set of "standard" filters. This is similar to the
@@ -534,7 +549,7 @@ and CMake (CMakeLists.txt)
534549
Configure.ac must have a block that similar to this that locates
535550
the implementing library.
536551
````
537-
# See if we have libzstd
552+
\# See if we have libzstd
538553
AC_CHECK_LIB([zstd],[ZSTD_compress],[have_zstd=yes],[have_zstd=no])
539554
if test "x$have_zstd" = "xyes" ; then
540555
AC_SEARCH_LIBS([ZSTD_compress],[zstd zstd.dll cygzstd.dll], [], [])
@@ -559,7 +574,7 @@ endif
559574
````
560575

561576
````
562-
# Need our version of szip if libsz available and we are not using HDF5
577+
\# Need our version of szip if libsz available and we are not using HDF5
563578
if HAVE_SZ
564579
noinst_LTLIBRARIES += libh5szip.la
565580
libh5szip_la_SOURCES = H5Zszip.c H5Zszip.h
@@ -637,4 +652,4 @@ done:
637652
*Author*: Dennis Heimbigner<br>
638653
*Email*: dmh at ucar dot edu<br>
639654
*Initial Version*: 12/22/2021<br>
640-
*Last Revised*: 01/25/2022
655+
*Last Revised*: 5/16/2023

include/nc.h

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -77,4 +77,64 @@ extern int iterate_NCList(int i,NC**); /* Walk from 0 ...; ERANGE return => stop
7777
extern void free_NC(NC*);
7878
extern int new_NC(const struct NC_Dispatch*, const char*, int, NC**);
7979

80+
/* Defined in dinstance_intern.c */
81+
82+
/**************************************************/
83+
/**
84+
Following are the internal implementation signatures upon which
85+
are built the API functions: nc_reclaim_data, nc_reclaim_data_all,
86+
nc_copy_data, and nc_copy_data_all.
87+
These functions access internal data structures instead of using e.g. ncid.
88+
This is to improve performance.
89+
*/
90+
91+
/*
92+
Reclaim a vector of instances of arbitrary type.
93+
This recursively walks the top-level instances to reclaim any
94+
nested data such as vlen or strings or such.
95+
96+
Assumes it is passed a pointer to count instances of xtype.
97+
Reclaims any nested data.
98+
99+
WARNING: nc_reclaim_data does not reclaim the top-level
100+
memory because we do not know how it was allocated. However
101+
nc_reclaim_data_all does reclaim top-level memory assuming that it
102+
was allocated using malloc().
103+
104+
WARNING: all data blocks below the top-level (e.g. string
105+
instances) will be reclaimed, so do not call if there is any
106+
static data in the instance.
107+
108+
Should work for any netcdf format.
109+
*/
110+
111+
EXTERNL int NC_reclaim_data(NC* nc, nc_type xtypeid, void* memory, size_t count);
112+
EXTERNL int NC_reclaim_data_all(NC* nc, nc_type xtypeid, void* memory, size_t count);
113+
114+
/**
115+
Copy vector of arbitrary type instances. This recursively walks
116+
the top-level instances to copy any nested data such as vlen or
117+
strings or such.
118+
119+
Assumes it is passed a pointer to count instances of xtype.
120+
WARNING: nc_copy_data does not copy the top-level memory, but
121+
assumes a block of proper size was passed in. However
122+
nc_copy_data_all does allocate top-level memory copy.
123+
124+
Should work for any netcdf format.
125+
*/
126+
127+
EXTERNL int NC_copy_data(NC* nc, nc_type xtypeid, const void* memory, size_t count, void* copy);
128+
EXTERNL int NC_copy_data_all(NC* nc, nc_type xtypeid, const void* memory, size_t count, void** copyp);
129+
130+
/* Macros to map NC_FORMAT_XX to metadata structure (e.g. NC_FILE_INFO_T) */
131+
/* Fast test for what file structure is used */
132+
#define NC3INFOFLAGS ((1<<NC_FORMATX_NC3)|(1<<NC_FORMATX_PNETCDF)|(1<<NC_FORMATX_DAP2))
133+
#define FILEINFOFLAGS ((1<<NC_FORMATX_NC_HDF5)|(1<<NC_FORMATX_NC_HDF4)|(1<<NC_FORMATX_DAP4)|(1<<NC_FORMATX_UDF1)|(1<<NC_FORMATX_UDF0)|(1<<NC_FORMATX_NCZARR))
134+
135+
#define USENC3INFO(nc) ((1<<(nc->dispatch->model)) & NC3INFOFLAGS)
136+
#define USEFILEINFO(nc) ((1<<(nc->dispatch->model)) & FILEINFOFLAGS)
137+
#define USED2INFO(nc) ((1<<(nc->dispatch->model)) & (1<<NC_FORMATX_DAP2))
138+
#define USED4INFO(nc) ((1<<(nc->dispatch->model)) & (1<<NC_FORMATX_DAP4))
139+
80140
#endif /* _NC_H_ */

0 commit comments

Comments
 (0)