This repository began as prototype code for visualising Bureau of Meteorology (BOM) Grid 2 Grid (G2G) land runoff modelling data alongside the eReefs Hydrodynamic model (CSIRO GBR4 Hydro v4).
The same codebase was then refined and used to generate the final published data and visualisation products.
The repository name, ereefs-g2g-prototype-visualisation, has been retained for continuity with earlier development
and internal references.
For the published dataset record and citation details, refer to: https://doi.org/10.26274/0K5S-5E13
These scripts use draft G2G model data that has a limited geographic scope. The G2G modelling was developed primarily to generate correct river flows along the boundary to the GBR, as a driving input to the eReefs Hydrodynamic model. It was not originally intended to accurately characterise the transient flows far up the catchments. That is, the flows along the coastline are calibrated, but the flows far up the catchments are not. The flows across the landscape are driven by rain events but are also constrained by gauged flows in the rivers. This correction can occasionally be seen in the visualisations, where part of a catchment shows a small flow that seems to stop and go nowhere. This is a result of the correction being applied at the gauged flow location. It is more obvious for low flow events, where the G2G slightly overestimates the overland flow in some sub-catchments. These overestimates are corrected further down the catchment at the gauged stations.
It should also be noted that the G2G flows being visualised are not the river flows that were used in the GBR4 Hydro v4 modelling. The river flows in the Hydro modelling are based on gauge station measurements, scaled to compensate for the approximate proportion of each catchment that is not directly measured.
The final published products generated by these scripts are available at: QLD_BOM_eReefs-g2gflow_2011-2023
This web storage location includes both the visualisation products and the daily G2G data files.
If you want end-to-end automation, use the batch scripts:
- generate-and-publish-daily-g2g.sh for daily G2G generation + S3 publishing
- generate-videos.sh for annual video generation
For required environment variables and .env setup, jump to Batch helper scripts below.
The basemap data used in these plots is available from eReefs Basemap - GIS layers Reefs, Rivers, Cities, Basins, Countries (AIMS).
The eReefs data used to generate these plots comes from the GBR4 v4 daily OpenDAP endpoint: https://thredds.ereefs.aims.gov.au/thredds/dodsC/gbr4_v4/daily.nc
This code is made available under an MIT license and the plots are made available under a Creative Commons Attribution 4.0 license, with attribution: Eric Lawrey, AIMS.
The following are instructions for reproducing these plots.
Create a Python virtual environment. I would recommend using Anaconda on Windows.
The following step-by-step guide sets up a virtual environment in Anaconda on Windows and installs the required libraries:
- Install Anaconda (if you haven't already):
  - Download the Anaconda installer for Windows from the Anaconda website.
  - Follow the on-screen prompts to install Anaconda.
  - After installation, ensure that the Anaconda binaries are in your system's PATH or use the Anaconda Prompt for all the commands below.
- Open Anaconda Prompt:
  - Search for "Anaconda Prompt" in your Windows search bar and open it. This command prompt has all the necessary configurations set for Anaconda.
- Create a new virtual environment:
  - Use the following command to create a new virtual environment. Here it is named ereefs_maps, but you can give it any name you prefer.
    conda create --name ereefs_maps python=3.10
  - You can replace 3.10 with your preferred Python version, but make sure the libraries you want to install are compatible with that version.
- Activate the virtual environment:
  - Once the environment is created, activate it using the following command:
    conda activate ereefs_maps
- Install libraries from the requirements.txt file:
  - Navigate to the directory where your requirements.txt is located (using the cd command). Then use the following command to install the packages:
    pip install -r requirements.txt
  - This will install all the libraries specified in the requirements.txt file into your ereefs_maps environment.
- Verify the installations:
  - You can check that the libraries have been installed correctly by activating the environment and then launching Python:
    python
  - Then try importing each library listed in the requirements:
    import xarray
    import geopandas
    import matplotlib
    import cartopy
    ...
  - If no errors appear after these import statements, the libraries are correctly installed.
- Run the scripts in order: First run the one-time preprocessing script 00-generate-daily-g2g-aggregates.py to generate daily aggregate G2G files for public hosting. Then run 01-download-base-map-data.py, 02-get-daily-ereefs-hydro-data.py, 03-download-daily-g2g-data.py, and 04-animate-G2G-and-salinity.py.
- Deactivate the environment when done:
  - When you're done working in the ereefs_maps environment, deactivate it with:
    conda deactivate
That's it! You now have a virtual environment set up in Anaconda on Windows.
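The import check described in the steps above can also be run as a small script. The following is a minimal sketch, not part of the repository; the function name `check_imports` and the module list are assumptions taken from the libraries named in this README, so adjust the list to match requirements.txt.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return {name: True/False}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Covers ModuleNotFoundError as well as broken installs
            results[name] = False
    return results


if __name__ == "__main__":
    # Assumed list of libraries; edit to match requirements.txt
    libraries = ["xarray", "geopandas", "matplotlib", "cartopy", "netCDF4"]
    for name, ok in check_imports(libraries).items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Run it from the activated ereefs_maps environment; any library reported as MISSING needs to be installed before running the numbered scripts.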
Prior to reproducing the plots you will need access to daily aggregated G2G NetCDF files. This repository now includes
an automated download step for those files using 03-download-daily-g2g-data.py. The base map data and the eReefs
Hydro v4 data can be obtained by running
the scripts 01-download-base-map-data.py and 02-get-daily-ereefs-hydro-data.py.
The 00-generate-daily-g2g-aggregates.py script is a one-time preprocessing step that converts the source hourly G2G files
(src-data/g2g-data/extracted_files/<year>/sidb2netcdf_g2gflow_YYYY-MM-DD.nc) into
daily mean NetCDF files. It writes one output file per day to:
src-data/g2g-data/daily-aggregated/<year>/BOM_eReefs-g2gflow_daily_YYYY-MM-DD.nc
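As an illustration of the output naming pattern above, the path for a given day could be built as follows. This is a sketch only; `daily_output_path` is a hypothetical helper and not necessarily how the script constructs its paths.

```python
from datetime import date
from pathlib import PurePosixPath


def daily_output_path(day, root="src-data/g2g-data/daily-aggregated"):
    """Build <root>/<year>/BOM_eReefs-g2gflow_daily_YYYY-MM-DD.nc for one day."""
    filename = f"BOM_eReefs-g2gflow_daily_{day.isoformat()}.nc"
    return PurePosixPath(root) / str(day.year) / filename


if __name__ == "__main__":
    print(daily_output_path(date(2019, 1, 5)))
```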
Features:
- processes all available years by default (or selected years)
- supports --start-date and --end-date filters
- skips files that already exist (safe restart behaviour)
- uses temporary output files (.tmp.nc) and atomic rename for robustness
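The skip-if-exists and temporary-file behaviour listed above follows a common pattern, sketched below using only the standard library. The function name `write_if_missing` and the `write_func` callback are hypothetical; the real script may be organised differently.

```python
import os
from pathlib import Path


def write_if_missing(final_path, write_func):
    """Write a file via a temporary name, then rename it into place.

    Returns False (and does nothing) if final_path already exists,
    which makes it safe to restart an interrupted run.
    """
    final_path = Path(final_path)
    if final_path.exists():
        return False
    final_path.parent.mkdir(parents=True, exist_ok=True)
    # e.g. day.nc -> day.tmp.nc in the same folder, so the rename stays on one filesystem
    tmp_path = final_path.with_name(final_path.stem + ".tmp.nc")
    write_func(tmp_path)
    # os.replace swaps the file in as a single step, so readers never see a partial file
    os.replace(tmp_path, final_path)
    return True
```

Because the final name only appears once the write has completed, a cancelled run leaves at most a stray .tmp.nc file rather than a truncated output file.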
Example usage:
# Process all years found in src-data/g2g-data/extracted_files
python 00-generate-daily-g2g-aggregates.py
# Process selected years only
python 00-generate-daily-g2g-aggregates.py 2019 2020
# Process a date range
python 00-generate-daily-g2g-aggregates.py --start-date 2019-01-01 --end-date 2019-12-31
The 01-download-base-map-data.py script downloads the shapefiles needed to make the basemap in the plots. The data can also be downloaded manually in a
browser from: https://nextcloud.eatlas.org.au/s/RGwTFcLtmPApEcQ/download and extracted into
src-data/GBR_AIMS_eReefs-basemap.
The 02-get-daily-ereefs-hydro-data.py script downloads the eReefs Hydro data from the AIMS THREDDS data service using OpenDAP. It currently downloads
salinity (salt) from the GBR4 v4 endpoint at depth -1.5 m (mapped to the model k index through an internal lookup
table verified against the remote zc depth coordinate).
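Mapping a target depth to a k index amounts to finding the nearest entry in the depth coordinate. The sketch below shows that idea; the depth values are illustrative placeholders only and were not read from the endpoint, and `nearest_depth_index` is a hypothetical helper rather than the script's own lookup table.

```python
def nearest_depth_index(depths, target_depth):
    """Return the index of the depth value closest to target_depth."""
    return min(range(len(depths)), key=lambda k: abs(depths[k] - target_depth))


if __name__ == "__main__":
    # Illustrative depths in metres (negative is below the surface); NOT the real zc values
    example_zc = [-10.0, -5.0, -3.0, -1.5, -0.5]
    print(nearest_depth_index(example_zc, -1.5))
```

In practice the depth list would come from the zc coordinate of the remote dataset, which is what the script's internal lookup table is verified against.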
It supports:
- year positional argument
- --start-date and --end-date (YYYY-MM-DD) for partial-year downloads
- automatic restart support via temporary files (.tmp) and skip-if-exists logic
The download is cropped to a fixed bounding box aligned with the current G2G test extent:
- North: -10.65
- South: -29.30
- West: 141.8
- East: 155.8
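The bounding box above can be applied by keeping only the coordinate indices that fall between each pair of bounds. This is a minimal sketch in plain Python; `indices_in_range` is a hypothetical helper, and sorting the two bounds first is a design choice here so that passing north and south in the wrong order cannot produce an empty selection.

```python
# Bounding box from this README (degrees)
BBOX = {"north": -10.65, "south": -29.30, "west": 141.8, "east": 155.8}


def indices_in_range(coords, bound_a, bound_b):
    """Return indices of coords lying between the two bounds (order-insensitive)."""
    lo, hi = sorted((bound_a, bound_b))
    return [i for i, value in enumerate(coords) if lo <= value <= hi]


if __name__ == "__main__":
    # Illustrative latitudes only
    latitudes = [-30.0, -29.0, -20.0, -10.0]
    print(indices_in_range(latitudes, BBOX["north"], BBOX["south"]))
```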
This script is designed to cope with cancellation and resumption. If the script is cancelled midway through processing and then restarted, any data file that has already been downloaded will not be downloaded again, which speeds up the resumption.
Each day of data takes about 1-2 seconds to download and occupies about 6.2 MB.
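From these per-day figures, a full year works out to roughly 2.3 GB and 6 to 12 minutes of download time. The sketch below shows the arithmetic; `estimate_download` is a hypothetical helper and the figures are the approximate ones quoted above, so actual times will vary with the connection.

```python
def estimate_download(days, mb_per_day=6.2, sec_per_day=(1, 2)):
    """Return (total size in MB, minimum minutes, maximum minutes)."""
    total_mb = days * mb_per_day
    min_minutes = days * sec_per_day[0] / 60
    max_minutes = days * sec_per_day[1] / 60
    return total_mb, min_minutes, max_minutes


if __name__ == "__main__":
    size_mb, fastest, slowest = estimate_download(365)
    print(f"{size_mb / 1000:.2f} GB, {fastest:.0f}-{slowest:.0f} minutes")
```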
The 03-download-daily-g2g-data.py script downloads daily aggregated G2G NetCDF files for a selected year from:
https://nextcloud.eatlas.org.au/public.php/dav/files/LiRXpzLFBCWPf4f/daily/g2gflow-data/{year}/?accept=zip
It extracts only files matching:
BOM_eReefs-g2gflow_daily_*.nc
to:
src-data/g2g-data/daily-aggregated/<year>/
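Extracting only the matching files from a downloaded zip archive can be done with the standard library alone. The following is a sketch under assumptions: `extract_matching` is a hypothetical function, and it flattens any folder structure inside the archive, which may differ from what the script does.

```python
import fnmatch
import zipfile
from pathlib import Path, PurePosixPath

PATTERN = "BOM_eReefs-g2gflow_daily_*.nc"


def extract_matching(zip_path, dest_dir, pattern=PATTERN):
    """Extract archive members whose file name matches pattern into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Match on the file name only, ignoring folders inside the zip
            name = PurePosixPath(member).name
            if not fnmatch.fnmatchcase(name, pattern):
                continue
            (dest_dir / name).write_bytes(archive.read(member))
            extracted.append(name)
    return sorted(extracted)
```

Reading each member fully into memory is acceptable for files of a few megabytes; for much larger members, streaming with `archive.open` would be the safer choice.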
Example usage:
python 03-download-daily-g2g-data.py 2019
The 04-animate-G2G-and-salinity.py script combines the G2G data with eReefs Hydro salinity into a single visualisation. It now supports multiple regions from one entry point using:
- --regions queensland,north,central,south
- --preview-image to export a PNG preview instead of an MP4
Additional behaviour now implemented in the script:
- data axes are normalized (sorted) before plotting to avoid raster orientation issues
- each raster layer uses its own geospatial extent, fixing cross-layer stretching/misalignment
- map extent can exceed data extent while data slicing remains clamped to available downloaded bounds
- salt layer is rendered underneath river flow
- salinity display range is fixed to 24 to 37 PSU with even-number ticks (24, 26, ..., 36)
- optional river-flow line thickening is applied at render time using a local max filter and supports fractional thickness values (for example 0.5)
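The region list and the salinity tick values described above could be handled along the following lines. This is a hedged sketch, not the script's actual argument parser: the default region, the treatment of --preview-image as a boolean flag, and the names `parse_args` and `SALINITY_TICKS` are all assumptions.

```python
import argparse

VALID_REGIONS = ("queensland", "north", "central", "south")

# Fixed display range in PSU, with ticks on even numbers only
SALINITY_MIN, SALINITY_MAX = 24, 37
SALINITY_TICKS = list(range(SALINITY_MIN, SALINITY_MAX, 2))


def parse_args(argv=None):
    """Parse a comma-separated region list and a preview flag."""
    parser = argparse.ArgumentParser(description="Sketch of region selection")
    # Default value is an assumption for this example
    parser.add_argument("--regions", default="queensland",
                        help="comma-separated list, e.g. north,central")
    # Assumed to be a simple on/off flag
    parser.add_argument("--preview-image", action="store_true",
                        help="export a PNG preview instead of an MP4")
    args = parser.parse_args(argv)
    args.regions = [r.strip() for r in args.regions.split(",") if r.strip()]
    unknown = [r for r in args.regions if r not in VALID_REGIONS]
    if unknown:
        parser.error(f"unknown region(s): {', '.join(unknown)}")
    return args
```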
This script is designed to generate animations for a selected year, noting that matching salinity data must first be
downloaded using 02-get-daily-ereefs-hydro-data.py and daily G2G NetCDF files must be downloaded using
03-download-daily-g2g-data.py.
The repository also includes two shell scripts for end-to-end batch processing.
An example environment file is provided as .env.example.
Recommended setup:
cp .env.example .env
# Edit .env with your local/project values
Before running either helper script, load variables from .env into your shell:
set -a
source .env
set +a
The generate-and-publish-daily-g2g.sh script builds daily G2G NetCDF aggregates from source tar archives in S3 and uploads the generated daily files to a public S3 prefix.
Defaults:
- processes all years from 2011 to 2023 when no year arguments are provided
- removes temporary/downloaded local files after each year (KEEP_LOCAL=false by default)
Prerequisites:
- AWS CLI installed and authenticated
- permission to read source bucket and write destination prefix
- required environment variables:
  - SOURCE_S3_PREFIX
  - DEST_S3_PREFIX
  - SOURCE_AWS_PROFILE
  - DEST_AWS_PROFILE
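Before launching a long batch run it can help to confirm that the required variables are actually set. The sketch below checks the four names listed above; `missing_vars` is a hypothetical helper and is not part of the shell scripts.

```python
import os

REQUIRED_VARS = (
    "SOURCE_S3_PREFIX",
    "DEST_S3_PREFIX",
    "SOURCE_AWS_PROFILE",
    "DEST_AWS_PROFILE",
)


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```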
Usage examples:
# Process all years (2011-2023)
bash generate-and-publish-daily-g2g.sh
# Process selected years only
bash generate-and-publish-daily-g2g.sh 2019 2021
# Keep local extracted and generated files after upload
KEEP_LOCAL=true bash generate-and-publish-daily-g2g.sh 2019
The generate-videos.sh script runs the full video-generation workflow for a configured year range.
For each included year it:
- downloads eReefs Hydro salinity
- downloads daily G2G data
- renders animations
- deletes downloaded per-year data afterward to save disk space
Usage:
# Process all years in the configured range (YEAR_START to YEAR_END)
bash generate-videos.sh
# Process only selected years (e.g. 2019 and 2021)
bash generate-videos.sh 2019 2021
The final videos can be found in the ./export folder.
Some assistance for the creation of these scripts was provided by GPT-4 using the Code interpreter and the normal GPT-4 chat.
I have recorded a summary of the key prompts that summarize the information that was requested from GPT-4.
I want to setup a virtual environment in anaconda on windows with the following libraries installed: xarray geopandas matplotlib cartopy netCDF4. Can you generate a set of instructions to do this. How would us do this with a requirements.txt document?
I want a Python script that will download and unzip a file from a specified URL. Can this be done with no additional libraries, just the ones already in Python 3.10. Can it also provide some feedback during the download and specify a user agent in the request.
I want to use OpenDAP to download a time series of a particular variable at a particular depth. The opendap service URL is https://thredds.ereefs.aims.gov.au/thredds/dodsC/gbr1_2.0/daily.nc.html. This service contains many variables, but I only want to download the Salinity variable 'salt' : Array of 32 bit Reals [time = 0..3181][k = 0..15][latitude = 0..4236][longitude = 0..2670]). I want to download and save the salt variable for a specific depth (k=14) as a single NetCDF file locally. I want it to provide some feedback during the downloading, by downloading one month at a time and saving to NetCDF, printing an update. I want to be able to specify the start and end date to be downloaded. Can you use xarray.
Follow up: I want to limit the spatial extend of the data to download to a bounding box (North:-20.75, South:-28.3, West:148.8, East:154)
Changes: I added code for selecting the depth index based on depth value. Added folder path for data in src-data. There was a lot of debugging caused by the selection of the dates. The fix seemed to be to add +14hours to align with the dates in the file. The other problem was the bounding box selection. GPT4 generated code with north and south swapped, causing empty files. I added code to allow better restarting of the download, by first downloading to a temporary file, then moving as a last step. It also skips files that are already downloaded.
This script was mainly based on an extension of a script originally developed by Ben Farmer. Various adjustments were made with the help of GPT-4; however, this code was a bit big for it to handle.
