Problems with reading "big" arrays (>8.1Gb) #383

@durack1

Description

Describe the bug
I have hit a reproducible error where big arrays (>8.1 GB) are not read correctly; an all-zero array (rather than the real values) is returned instead. I was a little puzzled by this error and got talking with @painter1, who also hit this problem and reported it by email in May 2019. It turns out the issue affects arrays greater than 8.1 GB, the original cause being a bug in libnetcdf versions for big variables (from @painter1's notes/emails). @dnadeau4 and @doutriaux1 may recall some of the specific details. Note that I may not be using the latest versions of the libraries below.

To Reproduce
Steps to reproduce the behavior:

  1. Install CDAT with: cdms2-3.1.4-py37ha6f5e91_3, libnetcdf-4.6.2-h303dfb8_1003, netcdf-fortran-4.4.5-h0789656_1004
  2. Execute the code attached (which reads larger and larger arrays)
  3. Watch as the summary stats go from real numbers to 0's once the array being read exceeds 8 GB. For the demo below this happens at year 1989 (3rd step of the loop), when 26 years of data are read (the model grid has 60 vertical levels, 384 lats, 320 lons).
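For context, a back-of-envelope calculation (assuming float32 values, 12 monthly fields per year, and the 60 × 384 × 320 grid above) shows the failing reads sit right around 2**31 elements, i.e. the ~8 GiB mark where a 32-bit size counter in an older libnetcdf could overflow. This is a hypothesis consistent with @painter1's notes, not a confirmed diagnosis:

```python
# Rough size arithmetic for the 'so' reads above (assumption: float32,
# i.e. 4 bytes per value, 12 monthly fields per year, 60 x 384 x 320 grid).
LEVELS, LAT, LON = 60, 384, 320
BYTES_PER_VALUE = 4  # float32
elems_per_year = 12 * LEVELS * LAT * LON  # 88,473,600 values per year

for years in (24, 25, 26):  # 1991-2014, 1990-2014, 1989-2014
    n = years * elems_per_year
    gib = n * BYTES_PER_VALUE / 2**30
    print('%d years: %d elements (%.2f GiB), past 2**31 elements: %s'
          % (years, n, gib, n > 2**31))
```

The exact cutoff observed in the demo may also depend on how cdms2 rounds the string time bounds to months, but the reported onset is in this neighbourhood.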

Expected behavior
Big arrays should be read correctly, returning non-zero values.

Desktop (please complete the following information):

  • OS: RHEL7.x

The code to reproduce this:

# imports
import sys
import cdat_info
import cdms2 as cdm
import numpy as np
from socket import gethostname

#%% Define function
def calcAve(var):
    print('type(var):', type(var), '; var.shape:', var.shape)
    # Start querying stat functions
    print('var.min():'.ljust(21), var.min())
    print('var.max():'.ljust(21), var.max())
    print('np.ma.mean(var.data):', np.ma.mean(var.data))  # not mask aware
    # Problem transientVariable.mean() function
    # print('var.mean():'.ljust(21), var.mean())
    print('-----')

#%% Load subset of variable
f = ['/p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/so/gn/v20190308/so_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc']
# Build up progressively larger arrays, stepping back one year at a time
times = np.arange(1991,1984,-1)
print('host:',gethostname())
print('Python version:',sys.version)
print('cdat env:',sys.executable.split('/')[5])
print('cdat version:',cdat_info.version()[0])
print('*****')
for timeSlot in times:
    for filePath in f:
        fH = cdm.open(filePath)
        print('filePath:',filePath.split('/')[-1])
        # Loop through single years
        start, end = timeSlot, 2014
        print('times:',start,end,'; total years:',(end-start)+1)
        d1 = fH('so',time=(str(start),str(end)))
        print("Array size: %d Mb" % ( (d1.size * d1.itemsize) / (1024*1024) ) )
        calcAve(d1)
        del d1
        fH.close()
    print('----- -----')
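As a stopgap until a library fix lands, each individual read can be kept under the threshold by fetching the variable in smaller year spans and assembling the result afterwards. A minimal sketch of the chunking part (the `year_chunks` helper is hypothetical, not a cdms2 API; each (start, end) span would be passed to fH('so', time=(str(start), str(end))) and the pieces concatenated along the time axis):

```python
def year_chunks(start, end, max_years):
    """Split the inclusive year range [start, end] into consecutive
    spans of at most max_years years each."""
    s = start
    while s <= end:
        e = min(s + max_years - 1, end)
        yield (s, e)
        s = e + 1

# 26 years split into spans that each stay well under the ~8 GiB mark
print(list(year_chunks(1989, 2014, 10)))
# → [(1989, 1998), (1999, 2008), (2009, 2014)]
```

This only works around the symptom; the zero-read bug itself still needs fixing in the underlying library.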

@pochedls @muryanto1 @downiec @jasonb5 @gabdulla @gleckler1 @lee1043 ping

Metadata



Labels

pending-release: Fix is included in a pending release of the CDAT metapackage.
