Overview
Our good friends at NASA's GSFC are having trouble accessing data that they serve (through Hyrax-1.16.3) with ncdump. The problem is that the web machinery rejects ncdump's data retrieval request because the HTTP request is too large: well in excess of the 8k limit found in web servers like Apache httpd and Tomcat. While Tomcat exposes this setting and it is easy to change, it would appear that Apache httpd does not, particularly with regard to the crucial Apache module mod_proxy_ajp, which seems to be locked in at 8k.
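For reference, on the Tomcat side the limit can be raised on the HTTP connector via its `maxHttpHeaderSize` attribute. This is a sketch of a `conf/server.xml` fragment; the port and the 64k value here are examples, not a recommendation:

```xml
<!-- conf/server.xml: raise the request/response header limit
     from the 8 KiB default to 64 KiB (example value) -->
<Connector port="8080" protocol="HTTP/1.1"
           maxHttpHeaderSize="65536" />
```

No equivalent knob appears to exist for mod_proxy_ajp, which is why the limit bites even when Tomcat itself is configured generously.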
Example
Here's a working example from the same collection (granules contain ~300 variables):
ncdump -v 'RetrievalGeometry_retrieval_longitude' https://oco2.gesdisc.eosdis.nasa.gov/opendap/OCO2_L2_Standard.11.2r/2024/221/oco2_L2StdND_53744a_240808_B11205r_240921171946.h5
This dataset, from the same collection, fails during ncdump's data retrieval phase:
ncdump -v "RetrievalGeometry_retrieval_longitude" https://oco2.gesdisc.eosdis.nasa.gov/opendap/OCO2_L2_Standard.11.2r/2024/107/oco2_L2StdND_52084a_240416_B11205r_240610060300.h5
Here's the output:
data:
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <html><head><title>414 Request-URI Too Large</title></head><body><center><h1>414 Request-URI Too Large</h1></center><hr><center>nginx/1.22.1</center></body></html>
NetCDF: Access failure
Location: file vardata.c; line 478
Users at NASA report that
The ncdump utility creates a resultant URL that is 13837 characters long beginning with “/opendap”
And that's a problem because web machinery like Apache httpd and mod_proxy_ajp limits the request header size to 8k.
I think it may be an upstream swim to get the various web services configured to accept this behavior.
It would, imho, be better if ncdump would detect that the request URL path is too big and make multiple requests, each smaller than 8k.
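The batching idea above can be sketched as follows. This is a hypothetical illustration, not ncdump's actual code: the base URL and variable names are made up, the `.dods?` constraint-expression form is the DAP2 convention, and a real client would also have to merge the per-batch responses back into one dataset.

```python
# Sketch: split a long list of variables into multiple DAP requests so that
# each constraint-expression URL stays under a server's header limit.
# (Hypothetical helper; not part of ncdump or netCDF-C.)

MAX_URL_BYTES = 8 * 1024  # common default in Apache httpd / mod_proxy_ajp


def batch_variables(base_url, variables, limit=MAX_URL_BYTES):
    """Yield lists of variable names whose combined request URL fits `limit`."""
    batch = []
    for var in variables:
        candidate = batch + [var]
        url = f"{base_url}.dods?{','.join(candidate)}"
        if len(url.encode("utf-8")) > limit and batch:
            yield batch          # current batch is full; start a new one
            batch = [var]
        else:
            batch = candidate
    if batch:
        yield batch


# Usage with made-up names: ~300 variables, as in the granules above.
base = "https://example.gov/opendap/granule.h5"
vars_ = [f"RetrievalGeometry_field_{i:03d}" for i in range(300)]
batches = list(batch_variables(base, vars_))

# Every per-batch URL now fits under the limit.
assert all(
    len(f"{base}.dods?{','.join(b)}".encode("utf-8")) <= MAX_URL_BYTES
    for b in batches
)
```

The same trick (issue N requests instead of one, each under the limit, then stitch the results together) is what ncdump would need internally; the hard part is the response merging, not the URL splitting.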