
Commit 65f6df9

NO JIRA. Misc fixes and refactoring (#23)
# Description of changes

This PR contains a lot of fixes, refactorings and a couple of new features:

## Fixes

* The documentation has been brought up to date with the actual functionality and is now more similar in format to that of the other microservices, in that it lists the exposed and consumed interfaces.
    * In particular, the record is now set straight on the version info JSON files, which are required.
* Validation of object import directories is now stricter:
    * version directories must all have a corresponding version info JSON file, which must be well-formed;
    * versions must be consecutive.

## Refactorings

* To avoid confusion between the object-version-properties extension and the `vN.json` version info JSON files, names and docs have been updated to make their use more consistent.
* The object import directory validation has been broken up into smaller methods for readability.

## New features

* It is now possible to archive an old (closed) layer. For now, this feature is only useful in recovery scenarios where some manual steps outside the service are likely to be necessary. The advantage of having at least the archiving done through the service is that the consistency check is triggered automatically and the dmftar command is executed with exactly the same parameters and environment as when a new top layer is created.

## Dropped feature

* Previously, the service could be configured to allow timestamps instead of version strings such as `v1` as version directory names. This feature was not used and has been dropped to reduce code complexity.
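As a rough illustration of the stricter validation rules listed under Fixes, the Python sketch below checks a single object import directory. It is not the service's Java implementation: the function name `validate_object_import_dir` and the wording of the problem messages are hypothetical, and only the two rules stated in this description (a well-formed `vN.json` per version directory, consecutive versions) are covered.

```python
import json
import re
from pathlib import Path

# Version directories are named v1, v2, v3, ... (timestamps are no longer accepted).
VERSION_DIR_PATTERN = re.compile(r"^v([1-9][0-9]*)$")


def validate_object_import_dir(object_dir):
    """Return a list of problems; an empty list means the layout passed both checks."""
    object_dir = Path(object_dir)
    problems = []

    # Collect the version numbers of all vN directories.
    versions = []
    for child in object_dir.iterdir():
        match = VERSION_DIR_PATTERN.match(child.name)
        if child.is_dir() and match:
            versions.append(int(match.group(1)))
    versions.sort()
    if not versions:
        return [f"{object_dir.name}: no version directories found"]

    # Rule 1: versions must be consecutive. The first one may be greater than 1,
    # because an update starts at the next version of an existing object.
    expected = list(range(versions[0], versions[0] + len(versions)))
    if versions != expected:
        problems.append(f"{object_dir.name}: versions {versions} are not consecutive")

    # Rule 2: every version directory needs a well-formed vN.json next to it.
    for n in versions:
        info_file = object_dir / f"v{n}.json"
        if not info_file.is_file():
            problems.append(f"{object_dir.name}: missing v{n}.json")
            continue
        try:
            json.loads(info_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError as e:
            problems.append(f"{object_dir.name}: v{n}.json is not well-formed JSON ({e.msg})")

    return problems
```

Returning a list of problems rather than raising on the first one makes it possible to report everything that is wrong with a batch in one pass.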
1 parent 32775c4 commit 65f6df9

26 files changed

Lines changed: 885 additions & 207 deletions


docs/dev.md

Lines changed: 6 additions & 1 deletion
@@ -7,6 +7,11 @@ Local testing
 -------------
 Local testing uses the same [set-up]{:target=_blank} as other DANS microservices.

-[set-up]: https://dans-knaw.github.io/dans-module-archetype/common-practices/#debugging
+### Creating Object Import Directories

+If you want to create object import directories for testing, you can use the helper script
+`create-object-import-dir.py` from [dans-dev-scripts]{:target=_blank}.
+
+[set-up]: {{ local_testing_setup }}
+[dans-dev-scripts]: {{ dans_dev_scripts_url }}
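For readers who only need a throwaway test fixture, the following Python sketch builds a minimal object import directory by hand. It is not the `create-object-import-dir.py` script from dans-dev-scripts, whose interface is not shown in this commit; the function name `create_object_import_dir`, its parameters and the placeholder file name are all hypothetical. The layout follows the description in `docs/index.md`: one `vN` directory plus one `vN.json` version info file per version.

```python
import json
from pathlib import Path


def create_object_import_dir(batch_dir, object_id, first_version=1, count=1):
    """Create <batch_dir>/<object_id> with `count` consecutive versions (hypothetical helper)."""
    object_dir = Path(batch_dir) / object_id
    for n in range(first_version, first_version + count):
        # The version directory holds the content files of that version.
        version_dir = object_dir / f"v{n}"
        version_dir.mkdir(parents=True)
        (version_dir / "placeholder.txt").write_text(f"content of v{n}\n", encoding="utf-8")

        # The accompanying version info JSON file is required for every version.
        version_info = {
            "version-info": {
                "user": {"name": "John Doe", "email": "john.doe@mail.com"},
                "message": f"Test version v{n}",
            }
        }
        (object_dir / f"v{n}.json").write_text(
            json.dumps(version_info, indent=2), encoding="utf-8"
        )
    return object_dir
```

Calling `create_object_import_dir("batch-dir", "urn:nbn:nl:ui:13-26febff0-4fd4-4ee7-8a96-b0703b96f812", count=3)` would produce a tree shaped like the first object in the documented example.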

docs/img/overview.graphml

Lines changed: 314 additions & 0 deletions
Large diffs are not rendered by default.

docs/img/overview.png

43.7 KB

docs/index.md

Lines changed: 76 additions & 38 deletions
@@ -5,65 +5,103 @@ Manages a DANS Data Vault Storage Root

 Purpose
 -------
-A DANS Data Vault Storage Root is an OCFL storage root that is used to store a collection of long term preservation objects.
+A DANS Data Vault Storage Root is an OCFL storage root used to store a collection of long-term preservation objects.

 Interfaces
 ----------
+This service has the following interfaces:

-### Batches and Object Import Directories
+![](img/overview.png){width="70%"}

-Objects versions to be stored must be placed under the inbox in a batch directory. The layout of the batch directory is as follows:
+### Provided interfaces
+
+#### Inbox
+
+* _Protocol type_: Shared filesystem
+* _Internal or external_: **internal**
+* _Purpose_: to receive [Object Import Directories](#object-import-directories)
+
+#### Command API
+
+* _Protocol type_: HTTP
+* _Internal or external_: **internal**
+* _Purpose_: to manage the service including starting imports
+
+#### Admin console
+
+* _Protocol type_: HTTP
+* _Internal or external_: **internal**
+* _Purpose_: application monitoring and management
+
+### Consumed interfaces
+
+#### DMFTAR (optional)
+
+* _Protocol type_: Local command invocation
+* _Internal or external_: **external**
+* _Purpose_: to create DMFTAR archives in the [SURF Data Archive]{:target=_blank}. This interface is optional because it is only used if the DMFTAR
+archive provider has been configured. (The other archive providers use Java code to create the archives and do not require an external interface.)
+
+### Object Import Directories
+
+Objects versions to be imported must be placed under the inbox in a batch directory. The layout of the batch directory is as follows:

 ```plaintext
 batch-dir
 ├── urn:nbn:nl:ui:13-26febff0-4fd4-4ee7-8a96-b0703b96f812
 │ ├── v1
 │ │ └── <content files>
+│ ├── v1.json
 │ ├── v2
 │ │ └── <content files>
 │ ├── v2.json
-│ └── v3
-│ └── <content files>
+│ ├── v3
+│ │ └── <content files>
+│ └── v3.json
 ├── urn:nbn:nl:ui:13-2ced2354-3a9d-44b1-a594-107b3af99789
-│ └── v3
-│ └── <content files>
+│ ├── v3
+│ │ └── <content files>
+│ └── v3.json
 └── urn:nbn:nl:ui:13-b7c0742f-a9b2-4c11-bffe-615dbe24c8a0
-└── v1
-└── <content files>
+├── v1
+│ └── <content files>
+└── v1.json
 ```

 * `batch-dir` - The batch directory is the directory where the batch of objects to be imported is placed.
-* `urn:nbn:nl:ui:13-26febff0-4fd4-4ee7-8a96-b0703b96f812` - The directory name is the identifier of the object. The pattern that an
-identifier must match can be configured in the configuration file.
-* `v1`, `v2`, `v3` - The version directories contain the content of the object version. The version directories must be named `v1`, `v2`, `v3`, etc.
-When updating an existing object, the first version directory must be named after the next version to be created in the OCFL object.
-* The service can also be configured to accept timestamps as version directories. In that case, the version directories are expected to be numbers,
-representing the timestamp of the version in milliseconds since the epoch. This timestamp is only used for ordering the versions in the OCFL object, so
-any number can be used as long as it is unique for the object. This option is mainly used for testing purposes.
-* A version directory must be accompanied by a JSON file named `vN.json`, where `N` is the version number (e.g. `v2.json` for
-version 2). This file is required for every version. It must have the following structure:
-
-```json
-{
-"version-info": {
-"user": {
-"name": "John Doe",
-"email": "john.doe@mail.com"
-},
-"message": "Commit message"
+* `urn:nbn:nl:ui:13-26febff0-4fd4-4ee7-8a96-b0703b96f812` - The directory name is the identifier of the object in the OCFL Storage Root. The pattern that an
+identifier must match can be [configured]{:target=_blank}.
+* `v1`, `v2`, `v3` - The version directories contain the content of the object versions. The version directories must be named `v1`, `v2`, `v3`, etc.
+The first version directory must be named after the next version to be created in the OCFL object.
+* A version directory must be accompanied by a version info JSON file named `vN.json`, where `N` is the version number (e.g., `v2.json` for
+version 2). This version info JSON file is required for every version. It must have a structure as in the example below.
+
+[configured]: {{ config_file_url }}
+
+##### Example version info JSON file
+
+```json
+{
+"version-info": {
+"user": {
+"name": "John Doe",
+"email": "john.doe@mail.com"
 },
-"object-version-properties": {
-"dataset-version": "1.2",
-"packaging-format": "DANS RDA BagPack/1.0.0"
-}
+"message": "Commit message"
+},
+"object-version-properties": {
+"dataset-version": "1.2",
+"packaging-format": "DANS RDA BagPack/1.0.0"
 }
-```
-
-Requirements and notes:
-- The `version-info` object is mandatory and must include `user.name`, `user.email`, and `message`.
-- `version-info.user.email` may be specified with or without the `mailto:` prefix; the service will normalize it to `mailto:`.
-- The `object-version-properties` object is optional and may contain any custom properties to be stored for the object version. These are written to the
-Object Version Properties extension.
+}
+```
+
+Requirements and notes:
+
+- The `version-info` object is mandatory and must include `user.name`, `user.email`, and `message`.
+- `version-info.user.email` may be specified with or without the `mailto:` prefix; the service will normalize it to `mailto:`.
+- The `object-version-properties` object is optional and may contain any custom properties to be stored for the object version. These are written to the
+[Object Version Properties]{:target=_blank} extension.

 [Object Version Properties]: {{ object_version_properties_ext }}
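The "Requirements and notes" in the documentation change above can be read as a small validation and normalization step. The Python sketch below is one possible reading, not the service's Java code: the function name `normalize_version_info` and the error messages are hypothetical, and it assumes that "must include" means a non-empty string.

```python
import json


def normalize_version_info(raw):
    """Parse a vN.json document, check the mandatory fields and normalize the email."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("version info must be a JSON object")

    # `version-info` is mandatory and must include user.name, user.email and message.
    info = doc.get("version-info")
    if not isinstance(info, dict):
        raise ValueError("'version-info' is mandatory")
    user = info.get("user")
    if not isinstance(user, dict):
        user = {}
    required = {
        "user.name": user.get("name"),
        "user.email": user.get("email"),
        "message": info.get("message"),
    }
    for field, value in required.items():
        # Assumption: an empty or non-string value counts as missing.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'version-info.{field}' is mandatory")

    # The email may come with or without the mailto: prefix; normalize to mailto:.
    email = user["email"]
    if not email.startswith("mailto:"):
        email = "mailto:" + email

    return {
        "version-info": {
            "user": {"name": user["name"], "email": email},
            "message": info["message"],
        },
        # `object-version-properties` is optional; default to no custom properties.
        "object-version-properties": doc.get("object-version-properties", {}),
    }
```

Applied to the example file above, the result carries `mailto:john.doe@mail.com` as the email and passes `dataset-version` and `packaging-format` through unchanged.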

mkdocs.yml

Lines changed: 3 additions & 0 deletions
@@ -35,6 +35,9 @@ nav:

 extra:
 object_version_properties_ext: https://dans-knaw.github.io/dans-ocfl-extensions/object-version-properties/object-version-properties/
+config_file_url: https://github.com/DANS-KNAW/dd-data-vault/blob/master/src/main/assembly/dist/cfg/config.yml
+dans_dev_scripts_url: https://github.com/DANS-KNAW/dans-dev-scripts
+local_testing_setup: https://dans-knaw.github.io/dans-datastation-architecture/dev-common-practices/#debugging

 plugins:
 - markdownextradata

pom.xml

Lines changed: 4 additions & 1 deletion
@@ -26,7 +26,7 @@
 </parent>

 <artifactId>dd-data-vault</artifactId>
-<version>3.4.1-SNAPSHOT</version>
+<version>4.0.0-SNAPSHOT</version>

 <name>DD Data Vault</name>
 <url>https://github.com/DANS-KNAW/dd-data-vault</url>
@@ -37,6 +37,9 @@
 <!-- TODO: move to dd-parent -->
 <commons-validator.version>1.7</commons-validator.version>
 <dans-ocfl-extensions.version>1.0.0</dans-ocfl-extensions.version>
+<dans-ocfl-java-extensions-lib.version>1.1.0</dans-ocfl-java-extensions-lib.version>
+<dans-layer-store-lib.version>1.1.0</dans-layer-store-lib.version>
+<dd-data-vault-api.version>1.0.0</dd-data-vault-api.version>
 <main-class>nl.knaw.dans.datavault.DdDataVaultApplication</main-class>
 </properties>

src/main/assembly/dist/cfg/config.yml

Lines changed: 1 addition & 1 deletion
@@ -159,4 +159,4 @@ logging:
 # Used in combination with journald, which already adds the timestamp
 logFormat: "%-5p [%t] %c{0}: %m%n%dwREx"
 loggers:
-'org.hibernate.engine.internal.StatisticalLoggingSessionEventListener': 'OFF'
+'org.hibernate.engine.internal.StatisticalLoggingSessionEventListener': 'OFF'

src/main/java/nl/knaw/dans/datavault/core/ImportJob.java

Lines changed: 0 additions & 3 deletions
@@ -54,9 +54,6 @@ public enum Status {
 @Column
 private boolean singleObject;

-@Column
-private boolean acceptTimestampVersionDirectories;
-
 @Column(nullable = false)
 private OffsetDateTime created;
