
Remove CMOR export #222

Merged
ltroussellier merged 1 commit into ESGF:integration from znichollscr:remove-cmor
Apr 13, 2026

Conversation

@znichollscr
Contributor

It's too project specific to be useful as a general tool so I'd recommend pulling it out.

Straw that broke the camel's back: WCRP-CMIP/cmip7-cmor-tables#66 (comment)

@znichollscr
Contributor Author

Continuing the conversation from WCRP-CMIP/cmip7-cmor-tables#66 (comment) (relevant part copied below). glevava said:

I find it a bit unfortunate to completely move an “application” out of esgvoc on the grounds that such a small detail would be too project-specific. This suggests to me that the issue is more about configuration than about the application being inherently too CMIP7-specific. But I may be wrong.

Fair comment. It's actually not the small detail so much. It's more that this small detail along with other learnings has led me to realise two key things:

  1. The CMOR CVs table can/does require information that is beyond what esgvoc should capture. The choice of what level of difference from the frequency interval is tolerated is not controlled vocabulary, it's a project-specific choice (the only general CMIP home I can think of would be QA/QC). So esgvoc by itself should never be able to create a complete CMOR table: it doesn't have all the information required.
  2. The CMOR CVs table is not just about exporting CVs for CMOR. It also defines CMOR's behaviour. If a given key is a single value, CMOR will always enforce that value in the files it writes. If it is a list, CMOR just checks that the value in the file is from that list. If it is a dictionary, it will sometimes just use the keys as a list from which to check (ignoring the values); sometimes it expects specific keys to be in that dictionary (e.g. for frequency); and sometimes the keys can be there or not, it's up to the user. That last case is what happens with source ID: if source ID has an institution ID key, then CMOR checks the institution ID, given the source ID, but if institution ID isn't there, it just does nothing. So this CMOR CVs table is a combination of a) CVs values, b) configuration for writing attributes in the file based on the CVs and c) configuration for the checks that CMOR does. Given this multi-purpose nature of the CVs table (the design validity of which is a different question, let's not go there), esgvoc cannot write it with confidence. It will always require user input to actually specify the output, because the user has some choice about the content (particularly structure) of the data, and it is this structure which defines how CMOR will behave (and that's something the user should control).
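To make point 2 concrete, here is a rough Python sketch of the type-driven behaviour described above. It is an illustration of the description only, not CMOR's actual code: the function names `check_attribute` and `check_institution`, the `institution_id` key name, and all example values are hypothetical.

```python
from typing import Any


def check_attribute(cv_entry: Any, file_value: str) -> tuple[str, bool]:
    """Return (value_written, passes_check) for one CV key.

    Hypothetical model of the behaviour described in the comment above.
    """
    if isinstance(cv_entry, str):
        # Single value: always enforced, whatever the user supplied.
        return cv_entry, True
    if isinstance(cv_entry, list):
        # List: the user's value only has to be a member of the list.
        return file_value, file_value in cv_entry
    if isinstance(cv_entry, dict):
        # Simplest dictionary case: keys act as the allowed list,
        # values are ignored.
        return file_value, file_value in cv_entry
    raise TypeError(f"unsupported CV entry type: {type(cv_entry).__name__}")


def check_institution(source_table: dict, source_id: str, institution_id: str) -> bool:
    """Optional nested check: only runs if the source entry carries the key."""
    entry = source_table.get(source_id, {})
    if "institution_id" not in entry:
        # Key absent: nothing is checked.
        return True
    allowed = entry["institution_id"]
    if isinstance(allowed, str):
        allowed = [allowed]
    return institution_id in allowed
```

The point of the sketch is that the same vocabulary produces different behaviour depending only on the container chosen, which is why the structure has to be a user decision.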

Now, as you say, we could just add configuration so that esgvoc can take in this user spec, combine it with the CVs and generate CMOR tables. My main issue with this is that I don't know what features CMOR tables are meant to support (and, despite repeated asking, have never been given a straight answer) so I couldn't build such a workflow. If someone else wants to, I won't object (but I also won't help, as I think this behaviour should be in the specific CMOR tables repository, as that is where it belongs while the use case is still being figured out). My second issue is that I think this expands esgvoc's remit in a problematic way: essentially esgvoc would be committing to supporting CMOR table export, which means keeping up with changes to CMOR, which means coupling yourselves to CMOR. I wouldn't do this, but again, if you want to keep the behaviour, I won't object. I just also won't help: in my opinion this coupling is asking for trouble for a variety of reasons (moving target and bad scoping being the two major ones).

@znichollscr
Contributor Author

Also, the current implementation is entirely CMIP7 specific. It would break on any other project (and already has, I think). I think it's better to remove this to avoid giving the impression that we have general CMOR CVs table export capability than leave it in there and have it just not work when people actually try to use it with CMOR.

@glevava
Member

glevava commented Apr 13, 2026

Fair answer 🙂

I fully agree with all your points. My question was really aimed at improving my understanding of CMOR (which is still somewhat limited) and the CMOR tables that drive its behavior.

So I agree with your conclusion that the needs of CMOR tables are quite far from the scope of esgvoc, and therefore do not really fit as an “application” within it. As you rightly pointed out, the key principle is whether esgvoc has all the required information for a given use case through the CVs. If it does, then it makes sense as an application; if not, then esgvoc should instead be used as a dependency or API within a higher-level tool that can also incorporate the additional information required by CMOR.
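As a minimal sketch of that "higher-level tool" idea: the function below takes frequency names (standing in for what the CVs provide) and merges them with per-frequency values that only the project can supply. Everything here is assumed for illustration: `build_frequency_entry`, the plain-list input, and the numbers are hypothetical, no esgvoc call is shown, and the real structure CMOR expects may differ.

```python
def build_frequency_entry(
    cv_frequencies: list[str],
    project_choices: dict[str, float],
) -> dict[str, float]:
    """Combine CV values with project-specific settings.

    cv_frequencies: names taken from the controlled vocabulary.
    project_choices: per-frequency values chosen by the project
        (hypothetical; these are not part of the CVs).
    """
    missing = [name for name in cv_frequencies if name not in project_choices]
    if missing:
        # The CVs alone cannot fill this gap, so fail loudly.
        raise ValueError(f"no project choice given for: {missing}")
    return {name: project_choices[name] for name in cv_frequencies}


# Hypothetical usage: names from the CVs, numbers from the project.
entry = build_frequency_entry(["mon", "day"], {"mon": 30.0, "day": 1.0})
```

The error path is the design point: when the project input is missing, the tool stops instead of guessing, since that information does not live in the CVs.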

I tend to see esgvoc “apps” as a way to provide higher-level commands that simplify recurring use cases. But indeed, a key constraint is that these should rely solely on the CVs. This is already the case, for example, when validating filenames, directory structures, global attributes, or when generating the ESGF JSON schema.

@ltroussellier ltroussellier merged commit ccdc84f into ESGF:integration Apr 13, 2026
@ltroussellier
Collaborator

I'm not a fan of the idea, and it will probably be harder for other projects. But the arguments were convincing:
the need for sources other than esgvoc calls for removing the generation from esgvoc.

@znichollscr znichollscr deleted the remove-cmor branch April 14, 2026 02:14
@znichollscr
Contributor Author

Hopefully this makes you a bit happier @ltroussellier: #224

@ltroussellier ltroussellier mentioned this pull request Apr 14, 2026
