Convention for reduced Gaussian grids
Moderator
Moderator Status Review [last updated: YYYY-MM-DD]
Requirement Summary
The reduced Gaussian grid is a widely used format in atmospheric and climate data, yet it is not currently represented by any of the grid types defined in the CF Conventions. Unlike regular grids, the reduced Gaussian grid has a varying number of longitude points per latitude, decreasing toward the poles. This irregularity makes it both more efficient for spectral transforms and more complex to describe with existing CF structures. Because no existing CF grid mapping can capture these characteristics, a new convention is proposed to ensure consistent representation, efficient storage, and interoperability across software and datasets that rely on this grid type.
Technical Proposal Summary
Following discussions between NCAS, University of Reading (@JonathanGregory, @davidhassell and @sadielbartholomew) and ECMWF (@sebvi, @LoreaGSM), we propose the following approach for incorporating reduced Gaussian grids into the CF Conventions:
Because the grid is not a regular 2D structure, storing full latitude and longitude vectors for every grid point is inefficient. Instead, the grid can be defined by:
• The latitudes, which correspond to the zeros of the Legendre polynomial of order 2N, where N is the number of latitude lines from pole to equator. These grids exclude points at the poles and equator.
• The number of longitude points on each latitude line, stored in a points-per-latitude vector. Longitudes can then be reconstructed with a known formula, because the first longitude on every latitude line is always 0.
• The grid subtype (e.g., Octahedral, Normal) and the number of latitudes from pole to equator (the “resolution” of the grid).
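The first two items can be sketched in Python. This is only an illustration, not part of the proposed convention: the function names are hypothetical, the north-to-south ordering is an assumption, and the latitudes are obtained from NumPy's Gauss-Legendre routine, whose sample points are the zeros of the Legendre polynomial and equal the sine of each latitude.

```python
import numpy as np


def gaussian_latitudes(n):
    """Return the 2*n Gaussian latitudes in degrees.

    n is the number of latitude lines between a pole and the equator.
    The north-to-south ordering is an assumption of this sketch.
    """
    # Zeros of the Legendre polynomial of degree 2*n, ascending in (-1, 1).
    roots, _weights = np.polynomial.legendre.leggauss(2 * n)
    # Each zero is the sine of a latitude; reverse to go north to south.
    return np.degrees(np.arcsin(roots))[::-1]


def longitudes_for_row(n_points):
    """Return n_points equally spaced longitudes in degrees, starting at 0."""
    return np.arange(n_points) * (360.0 / n_points)
```

The latitudes returned are symmetric about the equator and never reach 0 or ±90 degrees, which matches the statement that these grids exclude points at the poles and equator.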
A reduced Gaussian grid should also include an indexing vector (reduced_gaussian_index), for both global and non-global domains. This indexing vector can be stored at reduced size using compression by coordinate subsampling (CF chapter 8.3). In the particular case of a subdomain or a masked domain, data are given only for the non-missing points. The indices of these grid cells refer to the global index (reduced_gaussian_index), and compression by gathering (CF chapter 8.2) can be used. The pl and lat vectors are always global.
Examples
Example for the global (most general) case:
dimensions:
reduced_gaussian_index = 6599680 ;
n_lats = 2560 ;
variables:
char reduced_gaussian ;
reduced_gaussian:grid_mapping_name = "reduced_gaussian" ;
reduced_gaussian:grid_subtype = "octahedral" ;
reduced_gaussian:grid_resolution = 1280 ;
reduced_gaussian:points_per_latitude = "pl" ;
reduced_gaussian:latitudes = "lat" ;
float lat(n_lats) ;
lat:units = "degrees_north" ;
lat:long_name = "latitude" ;
int pl(n_lats) ;
pl:long_name = "number of points per latitude" ;
pl:units = "1" ;
int reduced_gaussian_index(reduced_gaussian_index) ;
reduced_gaussian_index:long_name = "Reduced Gaussian grid point index" ;
reduced_gaussian_index:units = "1" ;
float data(reduced_gaussian_index) ;
data:grid_mapping = "reduced_gaussian" ;
data:coordinates = "reduced_gaussian_index" ;
data:
lat = … ; // vector of 2560 latitude values
pl = … ; // vector of 2560 values, the number of points on each latitude
data = … ; // variable with 6599680 values
reduced_gaussian_index = … ; // vector with values [0, …, 6599679]
Either the points_per_latitude (pl) vector or the accumulated_points_per_latitude (accum_pl) vector can be provided interchangeably, the latter being the cumulative sum of the former. Either vector defines the grid, but the accumulated vector may ease user computations of latitudes and longitudes. The points-per-latitude vectors and the latitude vector all have a length of twice the grid_resolution (2560 for grid_resolution=1280).
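As a rough check of the sizes used in the example, the two vectors can be built in Python. The sketch assumes the usual description of the octahedral subtype (20 points on the latitude line nearest each pole and 4 more on each successive line towards the equator); that rule and the function name octahedral_pl are assumptions of this illustration, not part of the proposal text.

```python
import numpy as np


def octahedral_pl(n):
    """Points per latitude for an octahedral grid, pole to pole (length 2*n).

    Assumes 20 points on the row nearest each pole and 4 more on each
    successive row towards the equator.
    """
    hemisphere = 4 * np.arange(1, n + 1) + 16
    return np.concatenate([hemisphere, hemisphere[::-1]])


pl = octahedral_pl(1280)
accum_pl = np.cumsum(pl)       # accumulated points per latitude
print(len(pl), accum_pl[-1])   # number of rows, total number of grid points
```

Under that assumption the vectors have 2560 elements and the last accumulated value is 6599680, the size of the reduced_gaussian_index dimension in the example.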
The reduced_gaussian_index vector can be stored at reduced size using compression by coordinate subsampling (CF chapter 8.3).
Example for the regional case or a masked global domain (a particular case of the global example):
dimensions:
cells = 3 ;
n_lats = 2560 ;
variables:
char reduced_gaussian ;
reduced_gaussian:grid_mapping_name = "reduced_gaussian" ;
reduced_gaussian:grid_subtype = "octahedral" ;
reduced_gaussian:grid_resolution = 1280 ;
reduced_gaussian:points_per_latitude = "pl" ;
reduced_gaussian:latitudes = "lat" ;
float lat(n_lats) ;
lat:units = "degrees_north" ;
lat:long_name = "latitude" ;
int pl(n_lats) ;
pl:long_name = "number of points per latitude" ;
pl:units = "1" ;
int cells(cells) ;
cells:compress = "reduced_gaussian_index" ;
cells:long_name = "indices of reduced_gaussian_index for gathered data" ;
float data(cells) ;
data:grid_mapping = "reduced_gaussian" ;
data:
cells = 3507, 6789, 10689 ;
lat = … ; // vector of 2560 latitude values
pl = … ; // vector of 2560 values, the number of points on each latitude
data = 10, 20, 30 ; // variable with 3 non-missing values
In this case, only 3 of the 6599680 points of the global case have valid data, and the cells vector gives their positions in the global index vector. The pl and lat vectors in the grid definition still describe the full global grid.
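A reader of such a file could expand the gathered values back onto the global index vector. The following Python sketch shows one way to do so with a NumPy masked array; the function name scatter_to_global is hypothetical, and the values are those of the example above.

```python
import numpy as np


def scatter_to_global(values, cells, n_global):
    """Place gathered values at their positions in the global index vector.

    Points without data are returned as masked elements.
    """
    values = np.asarray(values)
    full = np.ma.masked_all(n_global, dtype=values.dtype)
    # Assigning through the index list also unmasks those elements.
    full[np.asarray(cells)] = values
    return full


cells = np.array([3507, 6789, 10689])
data = np.array([10.0, 20.0, 30.0], dtype=np.float32)
field = scatter_to_global(data, cells, 6599680)
print(field.count())   # number of unmasked (valid) points
```

A masked array is used here so that missing points stay distinguishable from real values; filling with a _FillValue would be an equally reasonable choice.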
Calculation of the latitude and longitude of a particular grid cell from the points_per_latitude vector:
For a grid point with global index i (counted from 0), the latitude index k is the smallest index for which the accumulated points_per_latitude vector (accum_pl) exceeds i. The longitude index m is the offset of i from the first point of latitude line k, that is, i minus the number of points on all preceding latitude lines.
$$
{accum}\_{pl}[0] = pl[0]
$$
$$
{accum}\_{pl}[n] = {accum}\_{pl}[n-1] + pl[n] , \quad n \in \{1, \dots, n\_{lats}-1\}
$$
Then:
$$
k = \min \{ s \mid {accum}\_{pl}[s] > i \}
$$
$$
m = i - {accum}\_{pl}[k-1], \quad \text{with } {accum}\_{pl}[-1] = 0
$$
The latitude for that grid point would be the k-th element of the latitude vector, and the longitude would be the longitude index times the increment of longitude for that latitude line.
$$
lat_i = lat[k], \quad lon_i = 0^\circ + m \cdot \frac{360^\circ}{pl[k]}
$$
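The lookup can be sketched in Python as follows. The function name locate and the toy four-row grid are assumptions made for illustration only. The index i is taken to be 0-based, as in the reduced_gaussian_index values of the global example, so the first point of every latitude line has m = 0 and therefore longitude 0.

```python
import numpy as np


def locate(i, pl, lat):
    """Return (latitude, longitude) in degrees for 0-based global index i."""
    pl = np.asarray(pl)
    accum_pl = np.cumsum(pl)
    if not 0 <= i < accum_pl[-1]:
        raise IndexError("index outside the global grid")
    # Smallest row index whose accumulated count exceeds i.
    k = int(np.searchsorted(accum_pl, i, side="right"))
    # Offset within row k: subtract the points on all preceding rows.
    m = i - (int(accum_pl[k - 1]) if k > 0 else 0)
    return float(lat[k]), m * 360.0 / float(pl[k])


# Toy grid, not a real reduced Gaussian grid: 4 rows with 4, 8, 8, 4 points.
pl = [4, 8, 8, 4]
lat = [60.0, 20.0, -20.0, -60.0]
print(locate(0, pl, lat))    # first point of row 0
print(locate(4, pl, lat))    # first point of row 1
print(locate(13, pl, lat))   # second point of row 2
```

np.searchsorted with side="right" returns the first position whose accumulated value is strictly greater than i, which is the definition of k given above.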
Longitude and latitude bounds lie midway between consecutive longitudes on a latitude line and midway between consecutive latitudes.
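A minimal Python sketch of these bounds follows. The function names are hypothetical, the latitudes are assumed to be ordered north to south, and placing the outermost latitude bounds at the poles is an assumption of this sketch, since the midpoint rule alone does not fix them.

```python
import numpy as np


def latitude_bounds(lat):
    """Return an (n, 2) array of latitude bounds for rows ordered north to south."""
    lat = np.asarray(lat, dtype=float)
    mid = 0.5 * (lat[:-1] + lat[1:])
    # Assumption: the outermost bounds are placed at the poles.
    edges = np.concatenate([[90.0], mid, [-90.0]])
    return np.column_stack([edges[:-1], edges[1:]])


def longitude_bounds(m, n_points):
    """Return (west, east) bounds in degrees for point m of a row with n_points."""
    step = 360.0 / n_points
    return (m - 0.5) * step, (m + 0.5) * step
```

Note that the western bound of the first cell on each latitude line is negative, because that cell is centred on longitude 0 and straddles the meridian.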
Benefits
The reduced Gaussian grid is the native format for most ECMWF data products. With the upcoming ERA6 reanalysis release (expected late 2026 or early 2027), a substantial volume of data will be disseminated on this type of grid in GRIB2 and also in NetCDF. Users of these data would benefit directly from this proposal.
Status Quo
At present, the reduced Gaussian grid is not supported by CF, so a large amount of data produced on this type of grid (ERA6 and ECMWF products) cannot be described by the convention.
Associated pull request
A pull request has not yet been created.
Detailed Proposal
All covered above.