Commit f02ad47

Merge pull request #9568 from IQSS/9148-license-via-api
9148 Ensure that license/terms are set via API (create/update/publish)
2 parents b3a362a + 0bf3667 commit f02ad47

18 files changed

Lines changed: 262 additions & 12 deletions

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# License management via API
+
+See https://github.com/IQSS/dataverse/issues/9148.
+
+Publishing a dataset via the API now requires the dataset to have either a standard license configured or valid Custom Terms of Use (if allowed by the instance). Attempting to publish a dataset that has neither **will fail with an error message**. This introduces a backward incompatibility: if you have scripts that automatically create, update and publish datasets, the publish step may start failing. Unfortunately, problems with the dataset APIs previously made it difficult to manage licenses, so an API user was likely to end up with a dataset that had neither a license nor Custom Terms of Use. In this release we have addressed this with the following fixes:
+
+We fixed the incompatibility between the format in which license information was *exported* in JSON and the format in which the create and update APIs expected it for *import* (https://github.com/IQSS/dataverse/issues/9155). This means that the following JSON format can now be imported:
+```
+"license": {
+  "name": "CC0 1.0",
+  "uri": "http://creativecommons.org/publicdomain/zero/1.0"
+}
+```
+However, for the sake of backward compatibility, the old format
+```
+"license" : "CC0 1.0"
+```
+will be accepted as well.
+
+We have added the default license (CC0) to the model JSON file that the Native API Guide provides and recommends as a starting point (https://github.com/IQSS/dataverse/issues/9364).
+
+We have also corrected the misleading language in the same guide, which used to recommend that users select, edit and re-import only the `.metadataBlocks` fragment of the JSON metadata representing the latest version. There are in fact other useful pieces of information that need to be preserved in the update (such as the `"license"` section above). The recommended way of creating the base JSON for updates via the API is therefore to select *everything but* the `"files"` section, with (for example) the following `jq` command:
+
+```
+jq '.data | del(.files)'
+```
+
+Please see the [Update Metadata For a Dataset](https://guides.dataverse.org/en/latest/api/native-api.html#update-metadata-for-a-dataset) section of our Native API Guide for more information.
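
The release note above says that both the object form and the older string form of `license` are accepted on import. As a minimal client-side sketch of what that means for a script that reads either shape, the Python helper below extracts the license name from both. The function name `license_name` is hypothetical and is not part of Dataverse; the server's own parsing may differ.

```python
from typing import Any, Optional


def license_name(license_value: Any) -> Optional[str]:
    """Return the license name from either JSON shape, or None if absent."""
    if isinstance(license_value, dict):
        # Newer export/import format: {"name": "CC0 1.0", "uri": "..."}
        return license_value.get("name")
    if isinstance(license_value, str):
        # Older format kept for backward compatibility: "CC0 1.0"
        return license_value
    return None


new_style = {
    "name": "CC0 1.0",
    "uri": "http://creativecommons.org/publicdomain/zero/1.0",
}
old_style = "CC0 1.0"

print(license_name(new_style))
print(license_name(old_style))
```

A script that normalizes the value this way can compare licenses across datasets without caring which of the two formats a given JSON file uses.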

doc/sphinx-guides/source/_static/api/dataset-update-metadata.json

Lines changed: 4 additions & 0 deletions
@@ -1,4 +1,8 @@
 {
+  "license": {
+    "name": "CC0 1.0",
+    "uri": "http://creativecommons.org/publicdomain/zero/1.0"
+  },
   "metadataBlocks": {
     "citation": {
       "displayName": "Citation Metadata",

doc/sphinx-guides/source/api/native-api.rst

Lines changed: 7 additions & 4 deletions
@@ -1049,23 +1049,26 @@ The fully expanded example above (without environment variables) looks like this
 
   curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT https://demo.dataverse.org/api/datasets/:persistentId/versions/:draft?persistentId=doi:10.5072/FK2/BCCP9Z --upload-file dataset-update-metadata.json
 
-Note that in the example JSON file above, there is a single JSON object with ``metadataBlocks`` as a key. When you download a representation of your dataset in JSON format, the ``metadataBlocks`` object you need is nested inside another object called ``datasetVersion``. To extract just the ``metadataBlocks`` key when downloading a JSON representation, you can use a tool such as ``jq`` like this:
+Note that in the example JSON file above, there are only two JSON objects, with the ``license`` and ``metadataBlocks`` keys respectively. When you download a representation of your latest dataset version in JSON format, these objects will be nested inside another object called ``data`` in the API response. There may be more objects in there, in addition to ``license`` and ``metadataBlocks``, that you need to preserve and re-import as well. Basically, you need everything in there except for ``files``. This can be achieved by downloading the metadata and selecting the sections you need with a JSON tool such as ``jq``, like this:
 
 .. code-block:: bash
 
   export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
   export SERVER_URL=https://demo.dataverse.org
   export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/BCCP9Z
 
-  curl -H "X-Dataverse-key: $API_TOKEN" $SERVER_URL/api/datasets/:persistentId/versions/:latest?persistentId=$PERSISTENT_IDENTIFIER | jq '.data | {metadataBlocks: .metadataBlocks}' > dataset-update-metadata.json
-
+  curl -H "X-Dataverse-key: $API_TOKEN" $SERVER_URL/api/datasets/:persistentId/versions/:latest?persistentId=$PERSISTENT_IDENTIFIER | jq '.data | del(.files)' > dataset-update-metadata.json
+
 The fully expanded example above (without environment variables) looks like this:
 
 .. code-block:: bash
 
   curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" https://demo.dataverse.org/api/datasets/:persistentId/versions/:latest?persistentId=doi:10.5072/FK2/BCCP9Z | jq '.data | del(.files)' > dataset-update-metadata.json
 
-Now that the resulting JSON file only contains the ``metadataBlocks`` key, you can edit the JSON such as with ``vi`` in the example below::
+
+Now you can edit the JSON produced by the command above with a text editor of your choice, for example with ``vi`` as in the example below.
+
+Note that you don't need to edit the top-level fields such as ``versionNumber``, ``versionMinorNumber``, ``versionState`` or any of the timestamps - these will be automatically updated as needed by the API::
 
   vi dataset-update-metadata.json
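
For scripts that do not shell out to `jq`, the filter `.data | del(.files)` from the hunk above can be approximated in Python. This is a minimal sketch under stated assumptions: `build_update_payload` is a hypothetical helper name, and the sample response is a heavily trimmed, made-up stand-in for what the API returns, used only to show the shape of the transformation.

```python
import copy
import json


def build_update_payload(api_response: dict) -> dict:
    """Mimic jq '.data | del(.files)' on an already parsed API response."""
    # Copy first so the caller's parsed response is left untouched.
    payload = copy.deepcopy(api_response["data"])
    # Like jq's del(), removing a missing key is a no-op.
    payload.pop("files", None)
    return payload


# Hypothetical, trimmed response for illustration only.
sample_response = {
    "status": "OK",
    "data": {
        "versionState": "DRAFT",
        "license": {
            "name": "CC0 1.0",
            "uri": "http://creativecommons.org/publicdomain/zero/1.0",
        },
        "metadataBlocks": {"citation": {"fields": []}},
        "files": [{"label": "data.tab"}],
    },
}

payload = build_update_payload(sample_response)
print(json.dumps(payload, indent=2))
```

Keeping everything except `files` means the `license` section survives the round trip, which is the point of the corrected guidance.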
scripts/api/data/dataset-create-new.json

Lines changed: 5 additions & 1 deletion
@@ -4,6 +4,10 @@
   "persistentUrl": "http://dx.doi.org/10.5072/FK2/9",
   "protocol": "chadham-house-rule",
   "datasetVersion": {
+    "license": {
+      "name": "CC0 1.0",
+      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
+    },
     "metadataBlocks": {
       "citation": {
         "displayName": "Citation Metadata",
@@ -121,4 +125,4 @@
       }
     }
   }
-}
+}

scripts/api/data/dataset-finch1_fr.json

Lines changed: 4 additions & 0 deletions
@@ -1,6 +1,10 @@
 {
   "metadataLanguage": "fr",
   "datasetVersion": {
+    "license": {
+      "name": "CC0 1.0",
+      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
+    },
     "metadataBlocks": {
       "citation": {
         "fields": [
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+{
+  "datasetVersion": {
+    "metadataBlocks": {
+      "citation": {
+        "fields": [
+          {
+            "value": "Darwin's Finches",
+            "typeClass": "primitive",
+            "multiple": false,
+            "typeName": "title"
+          },
+          {
+            "value": [
+              {
+                "authorName": {
+                  "value": "Finch, Fiona",
+                  "typeClass": "primitive",
+                  "multiple": false,
+                  "typeName": "authorName"
+                },
+                "authorAffiliation": {
+                  "value": "Birds Inc.",
+                  "typeClass": "primitive",
+                  "multiple": false,
+                  "typeName": "authorAffiliation"
+                }
+              }
+            ],
+            "typeClass": "compound",
+            "multiple": true,
+            "typeName": "author"
+          },
+          {
+            "value": [
+              { "datasetContactEmail" : {
+                  "typeClass": "primitive",
+                  "multiple": false,
+                  "typeName": "datasetContactEmail",
+                  "value" : "finch@mailinator.com"
+                },
+                "datasetContactName" : {
+                  "typeClass": "primitive",
+                  "multiple": false,
+                  "typeName": "datasetContactName",
+                  "value": "Finch, Fiona"
+                }
+              }],
+            "typeClass": "compound",
+            "multiple": true,
+            "typeName": "datasetContact"
+          },
+          {
+            "value": [ {
+              "dsDescriptionValue":{
+                "value": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
+                "multiple":false,
+                "typeClass": "primitive",
+                "typeName": "dsDescriptionValue"
+              }}],
+            "typeClass": "compound",
+            "multiple": true,
+            "typeName": "dsDescription"
+          },
+          {
+            "value": [
+              "Medicine, Health and Life Sciences"
+            ],
+            "typeClass": "controlledVocabulary",
+            "multiple": true,
+            "typeName": "subject"
+          }
+        ],
+        "displayName": "Citation Metadata"
+      }
+    }
+  }
+}

scripts/search/tests/data/dataset-finch1.json

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 {
   "datasetVersion": {
+    "license": {
+      "name": "CC0 1.0",
+      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
+    },
     "metadataBlocks": {
       "citation": {
         "fields": [

scripts/search/tests/data/dataset-finch2.json

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 {
   "datasetVersion": {
+    "license": {
+      "name": "CC0 1.0",
+      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
+    },
     "metadataBlocks": {
       "citation": {
         "fields": [

src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java

Lines changed: 4 additions & 0 deletions
@@ -548,6 +548,10 @@ public static License getLicense(DatasetVersion dsv) {
 
     public static String getLicenseName(DatasetVersion dsv) {
         License license = DatasetUtil.getLicense(dsv);
+        return getLocalizedLicenseName(license);
+    }
+
+    public static String getLocalizedLicenseName(License license) {
         return license != null ? getLocalizedLicenseDetails(license,"NAME")
                 : BundleUtil.getStringFromBundle("license.custom");
     }

src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java

Lines changed: 7 additions & 0 deletions
@@ -23,6 +23,7 @@
 import static java.util.stream.Collectors.joining;
 import static edu.harvard.iq.dataverse.engine.command.impl.PublishDatasetResult.Status;
 import static edu.harvard.iq.dataverse.dataset.DatasetUtil.validateDatasetMetadataExternally;
+import edu.harvard.iq.dataverse.util.StringUtil;
 
 
 /**
@@ -204,6 +205,12 @@ private void verifyCommandArguments(CommandContext ctxt) throws IllegalCommandEx
             throw new IllegalCommandException("Only authenticated users can release a Dataset. Please authenticate and try again.", this);
         }
 
+        if (getDataset().getLatestVersion().getTermsOfUseAndAccess() == null
+                || (getDataset().getLatestVersion().getTermsOfUseAndAccess().getLicense() == null
+                && StringUtil.isEmpty(getDataset().getLatestVersion().getTermsOfUseAndAccess().getTermsOfUse()))) {
+            throw new IllegalCommandException("Dataset must have a valid license or Custom Terms Of Use configured before it can be published.", this);
+        }
+
         if ( (getDataset().isLockedFor(DatasetLock.Reason.Workflow)&&!ctxt.permissions().isMatchingWorkflowLock(getDataset(),request.getUser().getIdentifier(),request.getWFInvocationId()))
                 || getDataset().isLockedFor(DatasetLock.Reason.Ingest)
                 || getDataset().isLockedFor(DatasetLock.Reason.finalizePublication)
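
The new guard in `verifyCommandArguments` above rejects publication when the terms object is missing, or when it has neither a license nor non-empty Custom Terms of Use. A script could mirror that condition before calling the publish endpoint. The Python sketch below restates the logic on a plain dictionary. The function name `can_publish` and the keys `license` and `termsOfUse` are hypothetical stand-ins for `getLicense()` and `getTermsOfUse()`, and treating whitespace-only text as empty is an assumption about how `StringUtil.isEmpty` behaves.

```python
from typing import Optional


def can_publish(terms: Optional[dict]) -> bool:
    """Return True when the publish guard above would NOT throw."""
    if terms is None:
        # Mirrors getTermsOfUseAndAccess() == null
        return False
    has_license = terms.get("license") is not None
    custom_terms = terms.get("termsOfUse")
    # Assumption: None and whitespace-only strings both count as empty.
    has_custom_terms = custom_terms is not None and custom_terms.strip() != ""
    return has_license or has_custom_terms


print(can_publish({"license": {"name": "CC0 1.0"}}))
print(can_publish({"license": None, "termsOfUse": ""}))
```

A preflight check like this only saves a round trip; the server-side command remains the authority on whether a dataset can be published.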
