Move ESGF scrape to GitHub actions #276

Merged
znichollscr merged 14 commits into main from auto-scrape-esgf
Jul 10, 2025

@znichollscr
Collaborator

@znichollscr znichollscr commented Jul 10, 2025

Description

Move ESGF scrape to GitHub actions

An example of the PRs it creates: #267

Fix #115

Checklist

Please confirm that this pull request has done the following:

  • Data released on ESGF
  • ESGF update pulled in here
  • Documentation added (where applicable)
  • Changelog item added to changelog/
  • Did a new release after merging

@github-actions

No changes to the database between 'main' branch and 7979031

@github-actions

No changes to the database between 'main' branch and e7c5436

Collaborator

@durack1 durack1 left a comment

@znichollscr nice, if this can be enabled we can monitor to see how it evolves.

Just thinking aloud: as we are the only ones publishing to the project, it's a little overkill to have this triggering regularly. We only need this to happen when some input4MIPs status changes, and currently none of this is automatic without our explicit involvement.

@github-actions

No changes to the database between 'main' branch and 59d68ae

@znichollscr
Collaborator Author

Just thinking aloud: as we are the only ones publishing to the project, it's a little overkill to have this triggering regularly. We only need this to happen when some input4MIPs status changes, and currently none of this is automatic without our explicit involvement.

I agree. I've got it happening once per day at the moment. We can also make it less frequent, e.g. once per week, or run it only on manual trigger. Let me know if you have a strong feeling.

@durack1
Collaborator

durack1 commented Jul 10, 2025

Let me know if you have a strong feeling.

My strong feeling would be to automate this at a high frequency now and see how often it breaks. We need to start testing the system before CMIP7 pressure starts to accumulate, and we're in a perfect position to do this on a comparatively tiny live project. My NERSC/perlmutter cron is 6-hourly, so why not do the same here and start logging successes/failures?

@github-actions

No changes to the database between 'main' branch and 3252ddc

@znichollscr znichollscr merged commit aa0ebde into main Jul 10, 2025
7 of 8 checks passed
@durack1
Collaborator

durack1 commented Jul 10, 2025

@znichollscr
Collaborator Author

znichollscr commented Jul 10, 2025 via email

@durack1
Collaborator

durack1 commented Jul 10, 2025

My NERSC scrape is firing off as expected, but the time taken has been interesting to watch, and it has also had a couple of outages/failures. Definitely something to keep an eye on over an extended period.

perlmutter:login31:~> tail -n 20 /global/cfs/projectdirs/m4931/gsharing/user_pub_work/input4MIPs/esgf-input4MIPs.log
...
Sat 05 Jul 2025 11:04:27 AM PDT
Sat 05 Jul 2025 05:00:10 PM PDT
Sat 05 Jul 2025 11:00:10 PM PDT
Sun 06 Jul 2025 05:00:17 AM PDT
Sun 06 Jul 2025 11:00:17 AM PDT
Sun 06 Jul 2025 05:00:18 PM PDT
Sun 06 Jul 2025 11:00:17 PM PDT
Mon 07 Jul 2025 05:00:18 AM PDT
Mon 07 Jul 2025 11:02:01 AM PDT
Mon 07 Jul 2025 05:01:57 PM PDT
Mon 07 Jul 2025 11:00:15 PM PDT
Tue 08 Jul 2025 05:00:24 AM PDT
Tue 08 Jul 2025 11:00:11 AM PDT
Tue 08 Jul 2025 05:00:19 PM PDT
Tue 08 Jul 2025 11:05:47 PM PDT
Wed 09 Jul 2025 05:07:21 AM PDT
Wed 09 Jul 2025 11:05:27 AM PDT
Wed 09 Jul 2025 05:06:10 PM PDT
Wed 09 Jul 2025 11:07:26 PM PDT
Thu 10 Jul 2025 05:09:42 AM PDT
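
As a rough way to start logging successes and failures from a timestamp log like the one above, the sketch below parses each line, reports how many seconds past the hour it was written, and flags consecutive entries spaced more than six hours (plus a tolerance) apart. It assumes one timestamp per run in the format shown, an English locale, and a six-hourly schedule. The function names and the 30-minute tolerance are hypothetical and are not part of the NERSC cron job or this repository.

```python
from datetime import datetime, timedelta

# Format of the log lines once the weekday and timezone tokens are dropped.
LOG_FORMAT = "%d %b %Y %I:%M:%S %p"


def parse_log_line(line):
    """Parse e.g. 'Sat 05 Jul 2025 11:04:27 AM PDT' into a naive datetime."""
    parts = line.split()
    # Drop the weekday (first token) and the timezone name (last token);
    # this assumes every entry is in the same timezone.
    return datetime.strptime(" ".join(parts[1:-1]), LOG_FORMAT)


def seconds_past_hour(stamp):
    """Seconds elapsed since the top of the hour in which the line was written."""
    return stamp.minute * 60 + stamp.second


def missed_runs(stamps, expected=timedelta(hours=6), tolerance=timedelta(minutes=30)):
    """Return (previous, current) pairs spaced more than expected + tolerance apart."""
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > expected + tolerance]


# The last four entries of the log excerpt above.
sample = [
    "Wed 09 Jul 2025 11:05:27 AM PDT",
    "Wed 09 Jul 2025 05:06:10 PM PDT",
    "Wed 09 Jul 2025 11:07:26 PM PDT",
    "Thu 10 Jul 2025 05:09:42 AM PDT",
]

stamps = [parse_log_line(line) for line in sample]
for stamp in stamps:
    print(stamp.isoformat(), seconds_past_hour(stamp))
print("gaps longer than expected:", missed_runs(stamps))
```

Whether the offset past the hour reflects run duration or start-up delay depends on when the cron job writes its timestamp, which the log itself does not say.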

@durack1 durack1 deleted the auto-scrape-esgf branch July 10, 2025 13:58
@durack1
Collaborator

durack1 commented Jul 10, 2025

Looks like it triggered as expected. Without any changes, it's a little hard to ascertain whether it worked. Are there logs somewhere? All I could see is the job info at https://github.com/PCMDI/input4MIPs_CVs/actions/workflows/update-esgf-scrape.yaml

@znichollscr
Collaborator Author

znichollscr commented Jul 10, 2025 via email
