Switching to the DataCite REST API for retrieving registration metadata #12270
Merged
stevenwinship merged 7 commits into develop on Apr 6, 2026
Conversation
DOI metadata from DataCite. This addresses an apparent issue with UTF-8 characters when relying on the traditionally used MDS API. (#12070)
Removing the old-style constructors that are no longer needed. #12070
Added a detailed "how to test" in the PR description.
qqmyers approved these changes on Apr 6, 2026
(Note that it says that the last Jenkins run failed, as of writing this, Apr. 6, 12:18 PM, but that's because I killed that build as unnecessary, since it was triggered by a cosmetic comment change. The last Jenkins test that actually ran, number 3, did pass.)
What this PR does / why we need it:
This is to address an issue with UTF-8 characters when relying on the traditionally used MDS API, which results in unnecessary registration updates. See #12070.
This is a very minimal, proof-of-concept implementation. It appears to be working for its intended purpose.
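To make the symptom concrete, here is a minimal Python sketch of how a character-set mismatch could cause a "metadata differs" result even when nothing has changed. It is an illustration only: the PR does not state the actual mechanism, so the wrong-charset decoding below is an assumption, and `needs_update` is a hypothetical helper, not a Dataverse method.

```python
# Illustration only: ASSUMES the mismatch comes from decoding UTF-8 bytes
# with the wrong charset. The PR does not state the actual mechanism.


def needs_update(local: str, registered: str) -> bool:
    """Hypothetical check: re-register only when the two strings differ."""
    return local != registered


local_value = "Universidade de Brasília"

# Bytes as they would travel over the wire when the value is encoded as UTF-8.
wire_bytes = local_value.encode("utf-8")

# Decoding with the matching charset round-trips the value unchanged.
decoded_ok = wire_bytes.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 turns the two-byte "í" into two
# separate characters, so the string no longer equals the local value.
decoded_wrong = wire_bytes.decode("iso-8859-1")

print(needs_update(local_value, decoded_ok))
print(needs_update(local_value, decoded_wrong))
```

Under that assumption, a comparison against the mis-decoded copy always reports a difference, which would match the unnecessary re-registrations described above.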
Which issue(s) this PR closes:
Special notes for your reviewer:
As discussed during review, a release note has been added telling instances that having a valid REST API URL configured is now a requirement. As an extra fail-safe, getMetadata() will fall back to MDS if it cannot obtain the metadata via the REST API.
We do not have any tests covering DataCiteRESTfullClient and none are added in this PR. The only practical/useful way of testing this functionality I can think of is to make the RestAssured tests rely on a "real" (non-"fake") DataCite authority and test registering real DOIs. I am hesitant, however, to introduce another dependency on an external service in that suite.
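The fail-safe described above can be sketched roughly as follows. This is a Python illustration of the "REST first, MDS as fallback" pattern, not the PR's Java code: `get_metadata`, `fetch_rest`, `fetch_mds`, and the logger name are all hypothetical, and what counts as "cannot obtain" (here: an exception or an empty result) is an assumption.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("datacite_sketch")  # hypothetical logger name


def get_metadata(
    doi: str,
    fetch_rest: Callable[[str], Optional[str]],
    fetch_mds: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try the REST fetcher first; fall back to MDS if it fails or returns nothing."""
    try:
        metadata = fetch_rest(doi)
        if metadata:
            return metadata
        logger.warning("REST API returned no metadata for %s; falling back to MDS", doi)
    except Exception as exc:  # broad on purpose: any REST failure triggers the fallback
        logger.warning("REST lookup failed for %s (%s); falling back to MDS", doi, exc)
    return fetch_mds(doi)
```

Passing the two fetchers in as callables also means the fallback logic can be exercised with stubs, without depending on an external service.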
Suggestions on how to test this:
Can be easily tested on any dev. instance. But a real test DataCite authority must be used ("fake" will not do, in other words), as in:
Reach out directly on Slack if you don't have the username/password.
Create a dataset; put something/anything in the description that has UTF-8 characters in it, like the Universidade de Brasília etc. in the issue description.
Publish the dataset. Check on https://doi.test.datacite.org/repositories/gdcc.harvard-test and confirm that the DataCite registration has worked. (Keep in mind that these test DOIs do not redirect using the normal DOI resolver shown on the dataset page.)
Enable FINE logging: edu.harvard.iq.dataverse.pidproviders.doi.datacite.level=FINE
Set <jvm-options>-Ddataverse.feature.only-update-datacite-when-needed=true</jvm-options>, if not present, and restart Payara.
Testing the "before" case, i.e. the develop branch or 6.10:
Run the /modifyRegistrationMetadata API on the dataset. You will see messages in the log indicating that the metadata needed to be updated (not true!) and that the DOI has been re-registered. There should be messages in the log indicating that the differences between the local metadata and what it thinks is registered with DataCite are due to the UTF-8 characters in the fields.
Testing "after", with this PR deployed:
Run /modifyRegistrationMetadata. You will see a confirmation in the log that the metadata registered with DataCite is already up to date, so there was no need to re-register.
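If you want to look at what DataCite actually holds for the test DOI, independently of the Dataverse logs, something like the following Python sketch may help. It rests on assumptions that are not taken from this PR: that the test REST API lives at https://api.test.datacite.org, and that a single-DOI response carries the registered XML Base64-encoded under data.attributes.xml, as I understand the DataCite REST API to work. The helper names are hypothetical, and this is not necessarily how the PR retrieves the metadata.

```python
import base64
import json
from urllib.parse import quote

# ASSUMPTION: host of the DataCite test REST API; use your configured REST API URL.
REST_BASE = "https://api.test.datacite.org"


def doi_url(doi: str) -> str:
    """Build the REST URL for a single DOI record (hypothetical helper)."""
    return f"{REST_BASE}/dois/{quote(doi, safe='/')}"


def registered_xml(response_body: bytes) -> str:
    """Extract the registered metadata XML from a single-DOI JSON response.

    ASSUMPTION: the XML is Base64-encoded under data.attributes.xml.
    """
    payload = json.loads(response_body.decode("utf-8"))
    xml_b64 = payload["data"]["attributes"]["xml"]
    # Decode the bytes explicitly as UTF-8 so non-ASCII characters survive intact.
    return base64.b64decode(xml_b64).decode("utf-8")
```

The response body could be fetched with urllib.request.urlopen(doi_url(doi)).read(); whether credentials are required depends on the state of the DOI and on the repository account.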
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: