Commit bfe59f0

Merge pull request #11439 from IQSS/11392-edit-file-metadata-empty-values
File Metadata Update - Empty values clear fields
2 parents aa450e6 + c465180 commit bfe59f0

14 files changed

Lines changed: 221 additions & 86 deletions

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+### Edit Dataset Metadata API extension
+
+- This endpoint now allows removing fields (by sending empty values), as long as they are not required by the dataset.
+- New ``sourceLastUpdateTime`` optional query parameter, which prevents inconsistencies by managing updates that
+  may occur from other users while a dataset is being edited.
+
+NOTE: This release note was updated to conform to the refactoring of the validation as part of issue #11392
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+### Edit File Metadata empty values should clear data
+
+Previously, the API POST /files/{id}/metadata would ignore fields with empty values. Now the API updates those fields with the empty values, essentially clearing the data. Missing fields will still be ignored.
+
+An optional query parameter (sourceLastUpdateTime) was added to ensure that the metadata update is not based on stale data.
+
+See also [the guides](https://dataverse-guide--11359.org.readthedocs.build/en/11359/api/native-api.html#updating-file-metadata), #11392, and #11359.

doc/sphinx-guides/source/api/changelog.rst

Lines changed: 5 additions & 0 deletions
@@ -7,6 +7,11 @@ This API changelog is experimental and we would love feedback on its usefulness.
     :local:
     :depth: 1

+v6.8
+----
+- For POST /api/files/{id}/metadata, passing an empty string ("description":"") or array ("categories":[]) will no longer be ignored. Empty fields will now clear out the values in the file's metadata. To leave a field unchanged, simply do not include it in the JSON string.
+- For PUT /api/datasets/{id}/editMetadata, the query parameter "sourceInternalVersionNumber" has been removed and replaced with "sourceLastUpdateTime" to verify that the data being edited hasn't been modified and isn't stale.
+
 v6.7
 ----


doc/sphinx-guides/source/api/native-api.rst

Lines changed: 12 additions & 9 deletions
@@ -2156,26 +2156,26 @@ For these edits your JSON file need only include those dataset fields which you

 This endpoint also allows removing fields, as long as they are not required by the dataset. To remove a field, send an empty value (``""``) for individual fields. For multiple fields, send an empty array (``[]``). A sample JSON file for removing fields may be downloaded here: :download:`dataset-edit-metadata-delete-fields-sample.json <../_static/api/dataset-edit-metadata-delete-fields-sample.json>`

-If another user updates the dataset version metadata before you send the update request, data inconsistencies may occur. To prevent this, you can use the optional ``sourceInternalVersionNumber`` query parameter. This parameter must include the internal version number corresponding to the dataset version being updated. Note that internal version numbers increase sequentially with each version update.
+If another user updates the dataset version metadata before you send the update request, metadata inconsistencies may occur. To prevent this, you can use the optional ``sourceLastUpdateTime`` query parameter. This parameter must include the ``lastUpdateTime`` corresponding to the dataset version being updated. The date must be in the format ``yyyy-MM-dd'T'HH:mm:ss'Z'``.

-If this parameter is provided, the update will proceed only if the internal version number remains unchanged. Otherwise, the request will fail with an error.
+If this parameter is provided, the update will proceed only if the ``lastUpdateTime`` remains unchanged (meaning no one has updated the dataset metadata since you retrieved it). Otherwise, the request will fail with an error.

-Example using ``sourceInternalVersionNumber``:
+Example using ``sourceLastUpdateTime``:

 .. code-block:: bash

   export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
   export SERVER_URL=https://demo.dataverse.org
   export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/BCCP9Z
-  export SOURCE_INTERNAL_VERSION_NUMBER=5
+  export SOURCE_LAST_UPDATE_TIME=2025-04-25T13:58:28Z

-  curl -H "X-Dataverse-key: $API_TOKEN" -X PUT "$SERVER_URL/api/datasets/:persistentId/editMetadata?persistentId=$PERSISTENT_IDENTIFIER&replace=true&sourceInternalVersionNumber=$SOURCE_INTERNAL_VERSION_NUMBER" --upload-file dataset-update-metadata.json
+  curl -H "X-Dataverse-key: $API_TOKEN" -X PUT "$SERVER_URL/api/datasets/:persistentId/editMetadata?persistentId=$PERSISTENT_IDENTIFIER&replace=true&sourceLastUpdateTime=$SOURCE_LAST_UPDATE_TIME" --upload-file dataset-update-metadata.json

 The fully expanded example above (without environment variables) looks like this:

 .. code-block:: bash

-  curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT "https://demo.dataverse.org/api/datasets/:persistentId/editMetadata/?persistentId=doi:10.5072/FK2/BCCP9Z&replace=true&sourceInternalVersionNumber=5" --upload-file dataset-update-metadata.json
+  curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT "https://demo.dataverse.org/api/datasets/:persistentId/editMetadata/?persistentId=doi:10.5072/FK2/BCCP9Z&replace=true&sourceLastUpdateTime=2025-04-25T13:58:28Z" --upload-file dataset-update-metadata.json


 Delete Dataset Metadata
@@ -4730,6 +4730,8 @@ Updating File Metadata

 Updates the file metadata for an existing file where ``ID`` is the database id of the file to update or ``PERSISTENT_ID`` is the persistent id (DOI or Handle) of the file. Requires a ``jsonString`` expressing the new metadata. No metadata from the previous version of this file will be persisted, so if you want to update a specific field first get the json with the above command and alter the fields you want.

+An optional parameter, ``sourceLastUpdateTime=datetime`` (in the format ``yyyy-MM-dd'T'HH:mm:ss'Z'``), can be used to verify that the file metadata being edited has not been changed since you last retrieved it, thereby avoiding potential lost metadata updates. The value for ``sourceLastUpdateTime`` can be taken from ``lastUpdateTime`` in the response to a ``GET $SERVER_URL/api/files/$ID`` API call.
+
 A curl example using an ``ID``

 .. code-block:: bash
@@ -4750,25 +4752,26 @@ The fully expanded example above (without environment variables) looks like this
     -F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"dataFileTags":["Survey"],"restrict":false}' \
     "https://demo.dataverse.org/api/files/24/metadata"

-A curl example using a ``PERSISTENT_ID``
+A curl example using a ``PERSISTENT_ID`` and the sourceLastUpdateTime parameter:

 .. code-block:: bash

   export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
   export SERVER_URL=https://demo.dataverse.org
   export PERSISTENT_ID=doi:10.5072/FK2/AAA000
+  export UPDATE_TIME=2025-04-25T13:58:28Z

   curl -H "X-Dataverse-key:$API_TOKEN" -X POST \
     -F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"dataFileTags":["Survey"],"restrict":false}' \
-    "$SERVER_URL/api/files/:persistentId/metadata?persistentId=$PERSISTENT_ID"
+    "$SERVER_URL/api/files/:persistentId/metadata?persistentId=$PERSISTENT_ID&sourceLastUpdateTime=$UPDATE_TIME"

 The fully expanded example above (without environment variables) looks like this:

 .. code-block:: bash

   curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST \
     -F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"dataFileTags":["Survey"],"restrict":false}' \
-    "https://demo.dataverse.org/api/files/:persistentId/metadata?persistentId=doi:10.5072/FK2/AAA000"
+    "https://demo.dataverse.org/api/files/:persistentId/metadata?persistentId=doi:10.5072/FK2/AAA000&sourceLastUpdateTime=2025-04-25T13:58:28Z"

 Note: To update the 'tabularTags' property of file metadata, use the 'dataFileTags' key when making API requests. This property is used to update the 'tabularTags' of the file metadata.

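
Illustration (not part of the commit): for readers calling the endpoint from Java rather than curl, the documented PUT request could be assembled roughly as follows. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the request is only built and never sent, and the example values are concatenated into the query string as-is, so a real client should URL-encode anything that needs it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative client-side sketch; class and method names are hypothetical.
class EditMetadataRequestSketch {

    /** Builds (but does not send) a PUT request mirroring the curl example in the guide. */
    static HttpRequest build(String serverUrl, String apiToken, String persistentId,
                             String sourceLastUpdateTime, String jsonBody) {
        // sourceLastUpdateTime is the lastUpdateTime value read when the metadata was fetched.
        String url = serverUrl + "/api/datasets/:persistentId/editMetadata"
                + "?persistentId=" + persistentId
                + "&replace=true"
                + "&sourceLastUpdateTime=" + sourceLastUpdateTime;
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("X-Dataverse-key", apiToken)
                .PUT(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

If the server-side timestamp no longer matches the supplied value, the guide text above says the request fails with an error, so a caller would typically re-fetch the metadata and retry.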

src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java

Lines changed: 17 additions & 3 deletions
@@ -28,6 +28,7 @@
 import edu.harvard.iq.dataverse.search.savedsearch.SavedSearchServiceBean;
 import edu.harvard.iq.dataverse.settings.SettingsServiceBean;
 import edu.harvard.iq.dataverse.util.BundleUtil;
+import edu.harvard.iq.dataverse.util.DateUtil;
 import edu.harvard.iq.dataverse.util.FileUtil;
 import edu.harvard.iq.dataverse.util.SystemConfig;
 import edu.harvard.iq.dataverse.util.json.JsonParser;
@@ -52,6 +53,7 @@

 import java.io.InputStream;
 import java.net.URI;
+import java.time.Instant;
 import java.util.*;
 import java.util.concurrent.Callable;
 import java.util.logging.Level;
@@ -447,10 +449,22 @@ public Command<DatasetVersion> handleLatestPublished() {
         return dsv;
     }

-    protected void validateInternalVersionNumberIsNotOutdated(Dataset dataset, int internalVersion) throws WrappedResponse {
-        if (dataset.getLatestVersion().getVersion() > internalVersion) {
+    protected void validateInternalTimestampIsNotOutdated(DvObject dvObject, String sourceLastUpdateTime) throws WrappedResponse {
+        Date date = sourceLastUpdateTime != null ? DateUtil.parseDate(sourceLastUpdateTime, "yyyy-MM-dd'T'HH:mm:ss'Z'") : null;
+        if (date == null) {
             throw new WrappedResponse(
-                    badRequest(BundleUtil.getStringFromBundle("abstractApiBean.error.datasetInternalVersionNumberIsOutdated", Collections.singletonList(Integer.toString(internalVersion))))
+                    badRequest(BundleUtil.getStringFromBundle("jsonparser.error.parsing.date", Collections.singletonList(sourceLastUpdateTime)))
+            );
+        }
+        Instant instant = date.toInstant();
+        Instant updateTimestamp =
+                (dvObject instanceof DataFile) ? ((DataFile) dvObject).getFileMetadata().getDatasetVersion().getLastUpdateTime().toInstant() :
+                (dvObject instanceof Dataset) ? ((Dataset) dvObject).getLatestVersion().getLastUpdateTime().toInstant() :
+                instant;
+        // granularity is to the second since the json output only returns dates in this format to the second
+        if (updateTimestamp.getEpochSecond() != instant.getEpochSecond()) {
+            throw new WrappedResponse(
+                    badRequest(BundleUtil.getStringFromBundle("abstractApiBean.error.internalVersionTimestampIsOutdated", Collections.singletonList(sourceLastUpdateTime)))
             );
         }
     }
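
Illustration (not part of the commit): the check above compares the stored and supplied timestamps at second granularity, which can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the Dataverse implementation: the class and method names are hypothetical, `java.time` parsing stands in for `DateUtil.parseDate`, and the supplied value is interpreted as UTC, which the real helper may or may not do.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in for the staleness check; not the Dataverse code.
class StaleCheckSketch {

    // Same pattern the guides document for sourceLastUpdateTime.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'");

    /** Parses the client-supplied value as UTC (an assumption); returns null if missing or malformed. */
    static Instant parse(String sourceLastUpdateTime) {
        if (sourceLastUpdateTime == null) {
            return null;
        }
        try {
            return LocalDateTime.parse(sourceLastUpdateTime, FORMAT).toInstant(ZoneOffset.UTC);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    /** True when the stored timestamp differs from the supplied one, ignoring sub-second precision. */
    static boolean isOutdated(Instant stored, String sourceLastUpdateTime) {
        Instant supplied = parse(sourceLastUpdateTime);
        if (supplied == null) {
            throw new IllegalArgumentException("Unparseable date: " + sourceLastUpdateTime);
        }
        return stored.getEpochSecond() != supplied.getEpochSecond();
    }
}
```

Comparing epoch seconds rather than full instants matters because the JSON output only carries whole seconds, so a stored value with a fractional part would otherwise never match what the client sends back.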

src/main/java/edu/harvard/iq/dataverse/api/Datasets.java

Lines changed: 5 additions & 3 deletions
@@ -1118,12 +1118,14 @@ private String getCompoundDisplayValue (DatasetFieldCompoundValue dscv){
     @PUT
     @AuthRequired
     @Path("{id}/editMetadata")
-    public Response editVersionMetadata(@Context ContainerRequestContext crc, String jsonBody, @PathParam("id") String id, @QueryParam("replace") boolean replaceData, @QueryParam("sourceInternalVersionNumber") Integer sourceInternalVersionNumber) {
+    public Response editVersionMetadata(@Context ContainerRequestContext crc, String jsonBody, @PathParam("id") String id,
+                                        @QueryParam("replace") boolean replaceData,
+                                        @QueryParam("sourceLastUpdateTime") String sourceLastUpdateTime) {
         try {
             Dataset dataset = findDatasetOrDie(id);

-            if (sourceInternalVersionNumber != null) {
-                validateInternalVersionNumberIsNotOutdated(dataset, sourceInternalVersionNumber);
+            if (sourceLastUpdateTime != null) {
+                validateInternalTimestampIsNotOutdated(dataset, sourceLastUpdateTime);
             }

             JsonObject json = JsonUtil.getJsonObject(jsonBody);
src/main/java/edu/harvard/iq/dataverse/api/Files.java

Lines changed: 9 additions & 3 deletions
@@ -410,8 +410,7 @@ public Response deleteFileInDataset(@Context ContainerRequestContext crc, @PathP
     @AuthRequired
     @Path("{id}/metadata")
     public Response updateFileMetadata(@Context ContainerRequestContext crc, @FormDataParam("jsonData") String jsonData,
-                    @PathParam("id") String fileIdOrPersistentId
-        ) throws DataFileTagException, CommandException {
+                    @PathParam("id") String fileIdOrPersistentId, @QueryParam("sourceLastUpdateTime") String sourceLastUpdateTime) {

         FileMetadata upFmd = null;

@@ -429,6 +428,13 @@ public Response updateFileMetadata(@Context ContainerRequestContext crc, @FormDa
                 return error(BAD_REQUEST, "Error attempting get the requested data file.");
             }

+            if (sourceLastUpdateTime != null) {
+                try {
+                    validateInternalTimestampIsNotOutdated(df, sourceLastUpdateTime);
+                } catch (WrappedResponse wr) {
+                    return wr.getResponse();
+                }
+            }

             //You shouldn't be trying to edit a datafile that has been replaced
             List<Long> result = em.createNamedQuery("DataFile.findDataFileThatReplacedId", Long.class)
@@ -519,7 +525,7 @@ public Response updateFileMetadata(@Context ContainerRequestContext crc, @FormDa
                 return error(Response.Status.INTERNAL_SERVER_ERROR, "Error adding metadata to DataFile: " + e);
             }

-        } catch (WrappedResponse wr) {
+        } catch (CommandException | WrappedResponse ex) {
             return error(BAD_REQUEST, "An error has occurred attempting to update the requested DataFile, likely related to permissions.");
         }


src/main/java/edu/harvard/iq/dataverse/datasetutility/OptionalFileParams.java

Lines changed: 19 additions & 49 deletions
@@ -194,46 +194,28 @@ public boolean getTabIngest() {
         return this.tabIngest;
     }

-    public boolean hasCategories(){
-        if ((categories == null)||(this.categories.isEmpty())){
-            return false;
-        }
-        return true;
+    public boolean hasCategories() {
+        return categories != null;
     }

-    public boolean hasFileDataTags(){
-        if ((dataFileTags == null)||(this.dataFileTags.isEmpty())){
-            return false;
-        }
-        return true;
+    public boolean hasFileDataTags() {
+        return dataFileTags != null;
     }

     public boolean hasDescription(){
-        if ((description == null)||(this.description.isEmpty())){
-            return false;
-        }
-        return true;
+        return description != null;
     }

-    public boolean hasDirectoryLabel(){
-        if ((directoryLabel == null)||(this.directoryLabel.isEmpty())){
-            return false;
-        }
-        return true;
+    public boolean hasDirectoryLabel() {
+        return directoryLabel != null;
     }

-    public boolean hasLabel(){
-        if ((label == null)||(this.label.isEmpty())){
-            return false;
-        }
-        return true;
+    public boolean hasLabel() {
+        return label != null;
     }

-    public boolean hasProvFreeform(){
-        if ((provFreeForm == null)||(this.provFreeForm.isEmpty())){
-            return false;
-        }
-        return true;
+    public boolean hasProvFreeform() {
+        return provFreeForm != null;
     }

     public boolean hasStorageIdentifier() {
@@ -245,15 +227,15 @@ public String getStorageIdentifier() {
     }

     public boolean hasFileName() {
-        return ((fileName!=null)&&(!fileName.isEmpty()));
+        return fileName != null;
     }

     public String getFileName() {
         return fileName;
     }

     public boolean hasMimetype() {
-        return ((mimeType!=null)&&(!mimeType.isEmpty()));
+        return mimeType != null;
     }

     public String getMimeType() {
@@ -266,7 +248,7 @@ public void setCheckSum(String checkSum, ChecksumType type) {
     }

     public boolean hasCheckSum() {
-        return ((checkSumValue!=null)&&(!checkSumValue.isEmpty()));
+        return checkSumValue != null;
     }

     public String getCheckSum() {
@@ -294,15 +276,10 @@ public void setFileSize(long fileSize) {
      * @param tags
      */
     public void setCategories(List<String> newCategories) {
-
         if (newCategories != null) {
             newCategories = Util.removeDuplicatesNullsEmptyStrings(newCategories);
-            if (newCategories.isEmpty()) {
-                newCategories = null;
-            }
+            this.categories = newCategories;
         }
-
-        this.categories = newCategories;
     }

     /**
@@ -495,27 +472,20 @@ private void addFileDataTags(List<String> potentialTags) throws DataFileTagExcep
         }

         potentialTags = Util.removeDuplicatesNullsEmptyStrings(potentialTags);
-
-        if (potentialTags.isEmpty()){
-            return;
-        }
-
+
         // Make a new list
-        this.dataFileTags = new ArrayList<>();
+        List<String> newList = new ArrayList<>();

         // Add valid potential tags to the list
         for (String tagToCheck : potentialTags){
             if (DataFileTag.isDataFileTag(tagToCheck)){
-                this.dataFileTags.add(tagToCheck);
+                newList.add(tagToCheck);
             }else{
                 String errMsg = BundleUtil.getStringFromBundle("file.addreplace.error.invalid_datafile_tag");
                 throw new DataFileTagException(errMsg + " [" + tagToCheck + "]. Please use one of the following: " + DataFileTag.getListofLabelsAsString());
             }
         }
-        // Shouldn't happen....
-        if (dataFileTags.isEmpty()){
-            dataFileTags = null;
-        }
+        this.dataFileTags = newList;
     }

     private void msg(String s){
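
Illustration (not part of the commit): the `has...()` changes above switch from "non-null and non-empty" to "non-null", so an empty value now counts as present and overwrites the stored one, while an omitted field stays `null` and is skipped. A minimal sketch of that rule, assuming a hypothetical miniature class (`FileParamsSketch` and its `apply...` helpers are not Dataverse code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the "present vs. absent" rule; all names are illustrative.
class FileParamsSketch {
    private String description;       // null means the field was not sent
    private List<String> categories;  // null means the field was not sent

    void setDescription(String description) {
        this.description = description;
    }

    void setCategories(List<String> categories) {
        if (categories != null) {
            this.categories = new ArrayList<>(categories);
        }
    }

    // Presence is decided by null-ness only, so "" and [] count as present.
    boolean hasDescription() { return description != null; }
    boolean hasCategories() { return categories != null; }

    /** Returns the update's description when one was sent (even if empty), else the current value. */
    static String applyDescription(String current, FileParamsSketch update) {
        return update.hasDescription() ? update.description : current;
    }

    /** Returns the update's categories when sent (even if empty), else the current list. */
    static List<String> applyCategories(List<String> current, FileParamsSketch update) {
        return update.hasCategories() ? update.categories : current;
    }
}
```

With this rule, sending `"description":""` clears the description, whereas leaving the key out of the JSON keeps whatever was stored, which matches the behavior described in the release note and changelog entries.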

src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java

Lines changed: 2 additions & 1 deletion
@@ -906,7 +906,8 @@ public static JsonObjectBuilder json(DataFile df, FileMetadata fileMetadata, boo
                 .add("tabularData", df.isTabularData())
                 .add("tabularTags", getTabularFileTags(df))
                 .add("creationDate", df.getCreateDateFormattedYYYYMMDD())
-                .add("publicationDate", df.getPublicationDateFormattedYYYYMMDD());
+                .add("publicationDate", df.getPublicationDateFormattedYYYYMMDD())
+                .add("lastUpdateTime", format(fileMetadata.getDatasetVersion().getLastUpdateTime()));
         Dataset dfOwner = df.getOwner();
         if (dfOwner != null) {
             builder.add("fileAccessRequest", dfOwner.isFileAccessRequest());

src/main/java/propertyFiles/Bundle.properties

Lines changed: 1 addition & 1 deletion
@@ -3220,7 +3220,7 @@ datasetFieldValidator.error.emptyRequiredSingleValueForField=Empty required valu
 updateDatasetFieldsCommand.api.processDatasetUpdate.parseError=Error parsing dataset update: {0}

 #AbstractApiBean.java
-abstractApiBean.error.datasetInternalVersionNumberIsOutdated=Dataset internal version number {0} is outdated
+abstractApiBean.error.internalVersionTimestampIsOutdated=Internal version timestamp {0} is outdated

 #RoleAssigneeServiceBean.java
 roleAssigneeServiceBean.error.dataverseRequestCannotBeNull=DataverseRequest cannot be null.
