HDF error on reading back NC_VLEN variable with fill value and chunking #2212

@krisfed

We are using netcdf-c 4.8.1 and seeing an HDF error when using nc_get_var on an NC_VLEN variable that has (1) a fill value set, (2) not all elements filled, and (3) some chunking applied.

I am not sure whether this is expected behavior, meaning the chunking or some other part of the process is set up incorrectly (but in that case, shouldn't the error occur on writing rather than on reading?), or whether this is a bug.

Here is a minimal reproduction. It defines an NC_VLEN (of NC_DOUBLE) variable with one dimension of size 4 and writes only the first 2 elements. A fill value is set ({0, 101}) and the chunk size is 1.

#include <iostream>
#include "netcdf.h"

void checkErrorCode(int status, const char* message){
    if (status != NC_NOERR){
        std::cout << "Error code: " << status << " from " << message << std::endl;
        std::cout << nc_strerror(status) << std::endl << std::endl;
    }
}

int main(int argc, const char * argv[]) {
    
    // ================ WRITE ==================
    
    // Setup data
    const size_t DATA_LENGTH = 4;
    nc_vlen_t data[DATA_LENGTH];
    
    const int first_size = 2;
    double first[first_size] = {2, 5};
    data[0].p = first;
    data[0].len = first_size;
    
    const int second_size = 3;
    double second[second_size] = {88, 96, 42};
    data[1].p = second;
    data[1].len = second_size;

    // Open file
    int ncid;
    int retval;
    
    retval = nc_create("vlenFillValue.nc", NC_NETCDF4, &ncid);
    checkErrorCode(retval, "nc_create");
    
    // Define vlen type named RAGGED_DOUBLE
    nc_type vlen_typeID;
    retval = nc_def_vlen(ncid, "RAGGED_DOUBLE", NC_DOUBLE, &vlen_typeID);
    checkErrorCode(retval, "nc_def_vlen");
    
    // Define dimension
    int dimid;
    retval = nc_def_dim(ncid, "xdim", DATA_LENGTH, &dimid);
    checkErrorCode(retval, "nc_def_dim");
    
    // Define vlen variable
    int varid;
    retval = nc_def_var(ncid, "var", vlen_typeID, 1, &dimid, &varid);
    checkErrorCode(retval, "nc_def_var");
    
    // Define chunking
    const size_t chunk = 1; //error also with 3
    retval = nc_def_var_chunking(ncid, varid, NC_CHUNKED, &chunk);
    checkErrorCode(retval, "nc_def_var_chunking");
    
    // Define fill value
    nc_vlen_t fillValue;
    double fv[2] = {0, 101};
    fillValue.p = fv;
    fillValue.len = 2;
    retval = nc_def_var_fill(ncid, varid, NC_FILL, &fillValue);
    checkErrorCode(retval, "nc_def_var_fill");
    
    // Write vlen variable
    size_t start = 0;
    size_t count = 2;
    retval = nc_put_vara(ncid, varid, &start, &count, data);
    checkErrorCode(retval, "nc_put_vara");
    
    retval = nc_close(ncid);
    checkErrorCode(retval, "nc_close (1)");
    
    
    // ================ READ ==================
    
    // open file
    retval = nc_open("vlenFillValue.nc", NC_NOWRITE, &ncid);
    checkErrorCode(retval, "nc_open");
    
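    nc_vlen_t* data_read = new nc_vlen_t[DATA_LENGTH];
    
    retval = nc_get_var(ncid, varid, data_read);
    checkErrorCode(retval, "nc_get_var");
    
    // Release the library-allocated vlen payloads only if the read succeeded
    if (retval == NC_NOERR) {
        nc_free_vlens(DATA_LENGTH, data_read);
    }
    delete[] data_read;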
    
    retval = nc_close(ncid);
    checkErrorCode(retval, "nc_close (2)");
    
    return retval;
}

Here is the output (this was run on macOS 11.2.3, but we see the issue on other operating systems too):

$ ./a.out 
Error code: -101 from nc_get_var
NetCDF: HDF error

I see that ncdump also errors out on the produced file:

$ ncdump vlenFillValue.nc 
netcdf vlenFillValue {
types:
  double(*) RAGGED_DOUBLE ;
dimensions:
	xdim = 4 ;
variables:
	RAGGED_DOUBLE var(xdim) ;
		RAGGED_DOUBLE var:_FillValue = {0, 101} ;
data:

NetCDF: HDF error
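
For reference, the chunk layout implied by the reproduction can be worked out with a little arithmetic. The sketch below is only an illustration: classifyChunks and ChunkState are hypothetical names, and whether never-written chunks are what triggers the error is not established in this report. It only shows that both chunk sizes mentioned above (1 and 3) leave at least one chunk that receives no data when 2 of 4 elements are written.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical classification of a chunk relative to the written range.
enum class ChunkState { FullyWritten, PartiallyWritten, NeverWritten };

// For a 1-D variable of dimLength elements stored in chunks of chunkSize,
// where only the first `written` elements were written, classify each chunk.
std::vector<ChunkState> classifyChunks(size_t dimLength, size_t chunkSize,
                                       size_t written) {
    std::vector<ChunkState> states;
    if (chunkSize == 0) {
        return states;  // a zero chunk size is invalid; nothing to classify
    }
    for (size_t begin = 0; begin < dimLength; begin += chunkSize) {
        // The last chunk may extend past the dimension; clamp its end.
        const size_t end = std::min(begin + chunkSize, dimLength);
        if (written >= end) {
            states.push_back(ChunkState::FullyWritten);
        } else if (written > begin) {
            states.push_back(ChunkState::PartiallyWritten);
        } else {
            states.push_back(ChunkState::NeverWritten);
        }
    }
    return states;
}
```

With classifyChunks(4, 1, 2) the four chunks come out as fully written, fully written, never written, never written. With classifyChunks(4, 3, 2) the first chunk is partially written and the second is never written.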
