This document describes the declarative properties/config files used by the DCAT-3 exporter. It covers root config, resource config (elements), scoping, value sources, and formatting.
The root config (dcat-root.properties) defines output format, prefixes, elements, and relations.
trace.enabled = false
# Prefixes for CURIEs used in configs
prefix.dcat = http://www.w3.org/ns/dcat#
prefix.dct = http://purl.org/dc/terms/
prefix.foaf = http://xmlns.com/foaf/0.1/
prefix.vcard = http://www.w3.org/2006/vcard/ns#
prefix.skos = http://www.w3.org/2004/02/skos/core#
prefix.rdfs = http://www.w3.org/2000/01/rdf-schema#
prefix.xsd = http://www.w3.org/2001/XMLSchema#
prefix.spdx = http://spdx.org/rdf/terms#
# Elements (each loads its own resource config file)
element.catalog.id = catalog
element.catalog.type = dcat:Catalog
element.catalog.file = dcat-catalog.properties
element.dataset.id = dataset
element.dataset.type = dcat:Dataset
element.dataset.file = dcat-dataset.properties
element.distribution.id = distribution
element.distribution.type = dcat:Distribution
element.distribution.file = dcat-distribution.properties
# Relations between element subjects
relation.catalog_has_dataset.subject = catalog
relation.catalog_has_dataset.predicate = dcat:dataset
relation.catalog_has_dataset.object = dataset
relation.catalog_has_dataset.cardinality = 0..n
relation.dataset_has_distribution.subject = dataset
relation.dataset_has_distribution.predicate = dcat:distribution
relation.dataset_has_distribution.object = distribution
relation.dataset_has_distribution.cardinality = 0..n
The trace option (trace.enabled) can be used to trace the internal data received from Dataverse so that proper JSON queries can be defined.
TIP: When exploring the structure of the traced JSON, you can use helpful external tools:
- https://jsonpathfinder.com/ — discover and navigate the nested path to a specific property.
- https://jsonpath.com/ — test and validate your JSONPath expressions against real trace output.
- This exporter provides DCAT serializations in RDF/XML, Turtle, and JSON‑LD.
- Dataverse harvesters only support XML formats, so only the RDF/XML variant is harvestable.
- For Turtle and JSON‑LD, the harvestable property is ignored and effectively overridden to false, regardless of its value in dcat-root.properties.
- The availableToUsers flag only controls visibility in the Dataverse UI: when set to true, the format will appear in the Metadata → Export menu for manual export by users.
Example (effective behavior):
dcat.format.rdfXml.availableToUsers = true
dcat.format.rdfXml.harvestable = true
dcat.format.turtle.availableToUsers = true
# harvestable is ignored/overridden for Turtle
dcat.format.turtle.harvestable = false
dcat.format.jsonLd.availableToUsers = true
# harvestable is ignored/overridden for JSON-LD
dcat.format.jsonLd.harvestable = false
The relations describe which entities are relevant in the application profile. Each of the entities can have a file describing that entity.
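The prefix.* entries are what CURIEs such as dcat:Dataset or dcat:distribution are resolved against. As a rough illustration only, the sketch below expands a CURIE into an absolute IRI. The function name expand_curie and the error handling are assumptions made for this example and are not taken from the exporter's code.

```python
# Hypothetical helper (not the exporter's API): expand a CURIE using prefix.* entries.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def expand_curie(value, prefixes):
    """Return an absolute IRI for a CURIE; pass absolute IRIs through unchanged."""
    if "://" in value:
        return value  # already an absolute IRI such as http://...
    prefix, sep, local = value.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"not a CURIE with a known prefix: {value!r}")
    return prefixes[prefix] + local


print(expand_curie("dcat:Dataset", PREFIXES))
print(expand_curie("http://purl.org/dc/terms/title", PREFIXES))
```

A CURIE whose prefix is not declared cannot be expanded, which is why undeclared prefixes matter when checking element types and predicates.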
A resource config (one file per element, e.g. dcat-distribution.properties) controls how to build a resource model (subjects, properties, nodes).
Use scope.json to iterate over parts of the input JSON:
# Iterate over each file
scope.json = $.datasetFileDetails[*]
If you accidentally use $.datasetFileDetails (no [*]), the mapper will auto-iterate the array.
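The auto-iteration described above can be pictured as a normalisation step: whatever the scope expression returns is turned into a list of scope objects. This is a minimal sketch under that assumption; iterate_scope and the sample document are hypothetical and do not reflect the mapper's actual internals.

```python
def iterate_scope(scope_result):
    """Normalise the result of a scope expression into a list of scope objects."""
    if scope_result is None:
        return []                 # nothing matched: no resources are built
    if isinstance(scope_result, list):
        return scope_result       # an array is iterated element by element
    return [scope_result]         # a single object becomes one scope


# Hypothetical input; only the datasetFileDetails key comes from the document.
doc = {"datasetFileDetails": [{"id": 1}, {"id": 2}]}

# "$.datasetFileDetails" (without [*]) yields the array itself,
# which is still iterated one file at a time.
file_ids = [scope["id"] for scope in iterate_scope(doc["datasetFileDetails"])]
print(file_ids)
```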
Define the resource subject IRI:
# Mint an IRI per file id
subject.iri.json = $.id
subject.iri.format = https://dataverse.nl/distribution/${value}
Each props.<id>.* block describes one property. Supported keys:
- predicate – CURIE/IRI of the predicate (resolved against prefixes)
- as – literal | iri | node-ref
- lang – language tag for literals (optional)
- datatype – datatype IRI (CURIE allowed), for typed literals (optional)
- json – JSONPath to read a value (supports $$ for root lookup)
- json.N – indexed JSONPaths; allows composition in format using ${1}, ${2}, …
- const – constant value
- map.* – mapping table (e.g., map.python = text/x-python)
- format – template to compose values. Supports:
  - ${value} – the base value (from the stream or json)
  - ${1}, ${2}, … – from json.1, json.2, …
  - inline JSONPath placeholders: ${$.path} or ${$$.path}
- multi – true to emit multiple values from a multi-match JSONPath
- node – node id for as=node-ref (see nodes below)
- when – future conditional emission (reserved)
- onUnMappedValue – fallback value when input exists but doesn't match any map key (for literal properties)
- onNoInputValue – fallback value when no input is present from JSON path (for literal properties)
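To make the format placeholders concrete, here is a small sketch of how ${value} and the indexed ${1}, ${2}, … placeholders could be substituted. It is an illustration, not the exporter's implementation: expand_format is a hypothetical name, missing values are assumed to expand to an empty string, and inline JSONPath placeholders are deliberately left out.

```python
import re

# Matches ${...} and captures the placeholder name.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand_format(template, value=None, indexed=None):
    """Fill ${value} and ${N} placeholders in a format template."""
    indexed = indexed or {}

    def substitute(match):
        name = match.group(1)
        if name == "value":
            return "" if value is None else str(value)
        if name.isdigit():
            # ${1}, ${2}, ... come from json.1, json.2, ...
            return str(indexed.get(int(name), ""))
        # Inline JSONPath placeholders (${$.path}) are not handled in this sketch.
        raise ValueError(f"unsupported placeholder in this sketch: {name}")

    return PLACEHOLDER.sub(substitute, template)


print(expand_format("mailto:${value}", value="contact@example.org"))
print(expand_format("V${1}.${2}", indexed={1: 3, 2: 1}))
```

The sample inputs (the email address and the version numbers) are made up for the example.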
Examples
# Literal title taken from file name
props.title.predicate = dct:title
props.title.as = literal
props.title.lang = en
props.title.json = $.filename
# Typed literal (byte size)
props.byteSize.predicate = dcat:byteSize
props.byteSize.as = literal
props.byteSize.datatype = xsd:nonNegativeInteger
props.byteSize.json = $.filesize
# Media type literal
props.mediaType.predicate = dcat:mediaType
props.mediaType.as = literal
props.mediaType.json = $.contentType
# Dataset access URL read from the global root (not per file)
props.accessURL.predicate = dcat:accessURL
props.accessURL.as = iri
props.accessURL.json = $$.datasetJson.persistentUrl
# Email IRI using format
nodes.contact.props.email.predicate = vcard:hasEmail
nodes.contact.props.email.as = iri
nodes.contact.props.email.json = $..metadataBlocks.citation.fields[?(@.typeName=='datasetContact')].value[0].datasetContactEmail.value
nodes.contact.props.email.format = mailto:${value}
# Version composed from two JSON paths
props.hasVersion.predicate = dct:hasVersion
props.hasVersion.as = literal
props.hasVersion.json.1 = $$.datasetJson.datasetVersion.versionNumber
props.hasVersion.json.2 = $$.datasetJson.datasetVersion.versionMinorNumber
props.hasVersion.format = V${1}.${2}
# Alternate one-liner using inline JSONPaths
# props.hasVersion.format = V${$$.datasetJson.datasetVersion.versionNumber}.${$$.datasetJson.datasetVersion.versionMinorNumber}
When working with literal properties that use mapping tables (map.*), you can provide fallback values for cases where:
- No input is present: Use onNoInputValue when the JSON path returns no data
- Input doesn't match mapping: Use onUnMappedValue when input exists but doesn't match any key in the mapping table
# Status property with mapping and fallbacks
props.status.predicate = dct:status
props.status.as = literal
props.status.json = $.publicationState
props.status.map.published = published
props.status.map.draft = draft
props.status.onUnMappedValue = unknown
props.status.onNoInputValue = not specified
Behavior:
- If $.publicationState contains "published" → emits "published"
- If $.publicationState contains "draft" → emits "draft"
- If $.publicationState contains "archived" → emits "unknown" (unmapped fallback)
- If $.publicationState is missing/null → emits "not specified" (no input fallback)
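The decision order behind this behavior can be sketched in a few lines. This is a minimal model of the rules listed above, not the exporter's code: resolve_mapped_value is a hypothetical name, and returning None when no fallback is configured (meaning nothing is emitted) is an assumption.

```python
def resolve_mapped_value(raw, mapping, on_unmapped=None, on_no_input=None):
    """Apply a map.* table with the two fallbacks, in the order described above."""
    if raw is None:
        return on_no_input          # no input from the JSON path
    if raw in mapping:
        return mapping[raw]         # regular mapped value
    return on_unmapped              # input present but not in the table


status_map = {"published": "published", "draft": "draft"}

for state in ("published", "draft", "archived", None):
    print(state, "->", resolve_mapped_value(state, status_map, "unknown", "not specified"))
```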
For IRI nodes referenced via as=node-ref, similar fallback logic applies when the node uses mapping tables:
# Access rights node with fallbacks
nodes.accessRights.kind = iri
nodes.accessRights.type = dct:RightsStatement
nodes.accessRights.iri.json = $.accessLevel
nodes.accessRights.map.public = http://publications.europa.eu/resource/authority/access-right/PUBLIC
nodes.accessRights.map.restricted = http://publications.europa.eu/resource/authority/access-right/RESTRICTED
nodes.accessRights.onUnMappedValue = http://publications.europa.eu/resource/authority/access-right/NON_PUBLIC
nodes.accessRights.onNoInputValue = http://publications.europa.eu/resource/authority/access-right/PUBLIC
# Reference the node
props.accessRights.predicate = dct:accessRights
props.accessRights.as = node-ref
props.accessRights.node = accessRights
Behavior:
- If $.accessLevel contains "public" → creates IRI node http://publications.europa.eu/resource/authority/access-right/PUBLIC
- If $.accessLevel contains "internal" → creates IRI node http://publications.europa.eu/resource/authority/access-right/NON_PUBLIC (unmapped fallback)
- If $.accessLevel is missing/null → creates IRI node http://publications.europa.eu/resource/authority/access-right/PUBLIC (no input fallback)
Use nodes.<id>.* to describe embedded nodes for as=node-ref:
# checksum node
props.checksum.predicate = spdx:checksum
props.checksum.as = node-ref
props.checksum.node = checksum
# kind is bnode here; use "iri" together with nodes.checksum.iri.const for an IRI node
nodes.checksum.kind = bnode
nodes.checksum.type = spdx:Checksum
nodes.checksum.props.checksumValue.predicate = spdx:checksumValue
nodes.checksum.props.checksumValue.as = literal
nodes.checksum.props.checksumValue.json = $.checksum.value
- $... – evaluated against the current scope (e.g., the file object in datasetFileDetails[*]).
- $$... – evaluated against the original document root.
- RDF/XML requires absolute IRIs. Use format (e.g., mailto:${value}) to make email addresses valid IRIs.
- Turtle will show typed literals with quotes (e.g., "4026"^^xsd:nonNegativeInteger). This is correct.
- If a JSONPath fails, enable tracing and check the scope you are in; ensure you use $ vs $$ appropriately.
- When linking elements (dataset → distribution), ensure the subjects are minted (absolute IRIs) and relations are applied after model merging.
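The difference between $ and $$ is easiest to see with a toy resolver. The sketch below handles only plain dotted paths (no filters or wildcards) and is not a JSONPath implementation; lookup and the sample data are hypothetical and serve only to show which object each prefix starts from.

```python
def lookup(path, scope, root):
    """Resolve a simple dotted path: "$$." starts at the root, "$." at the current scope."""
    if path.startswith("$$."):
        node, keys = root, path[3:].split(".")
    elif path.startswith("$."):
        node, keys = scope, path[2:].split(".")
    else:
        raise ValueError(f"path must start with $. or $$.: {path}")
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None   # no match; this is when tracing helps
        node = node[key]
    return node


# Hypothetical trace output, reduced to two branches.
root = {
    "datasetJson": {"persistentUrl": "https://doi.org/10.5072/EXAMPLE"},
    "datasetFileDetails": [{"id": 7, "filename": "data.csv"}],
}
file_scope = root["datasetFileDetails"][0]

print(lookup("$.filename", file_scope, root))                   # per-file value
print(lookup("$$.datasetJson.persistentUrl", file_scope, root)) # dataset-level value
print(lookup("$.datasetJson.persistentUrl", file_scope, root))  # wrong prefix: no match
```

The last call shows the typical mistake: a dataset-level path written with $ inside a file scope finds nothing, because the file object has no datasetJson key.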
The following validations are carried out:
- Empty prefix keys / invalid IRIs → ERROR
- Missing prefixes → WARNING
- Missing id, typeCurieOrIri, file → ERROR
- typeCurieOrIri not CURIE/IRI or unknown CURIE prefix → ERROR
- Missing subject/object/predicate → ERROR
- Predicate not CURIE/IRI or unknown prefix → ERROR
- No minting strategy at all (const/template/json) → WARNING
- iriFormat provided without template or json → ERROR
- Missing predicate → ERROR
- Bad as value → ERROR
- node-ref without nodeRef → ERROR
- No source (json|const|json.*|node) → WARNING
- Empty id → ERROR
- kind must be bnode or iri → ERROR
- type must be CURIE/IRI; check prefixes → ERROR
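As an illustration of how two of the listed checks could work (missing subject/object/predicate, and a predicate that is not a CURIE/IRI or uses an unknown prefix), here is a small sketch. The function name validate_relation, the dictionary shape, and the message texts are assumptions for this example; only the severities follow the list above.

```python
def validate_relation(relation, prefixes):
    """Return a list of (severity, message) tuples for one relation definition."""
    issues = []
    for key in ("subject", "predicate", "object"):
        if not relation.get(key):
            issues.append(("ERROR", f"missing {key}"))

    predicate = relation.get("predicate")
    if predicate and "://" not in predicate:   # not an absolute IRI, so expect a CURIE
        prefix, sep, _local = predicate.partition(":")
        if not sep:
            issues.append(("ERROR", "predicate is not a CURIE or IRI"))
        elif prefix not in prefixes:
            issues.append(("ERROR", f"unknown prefix: {prefix}"))
    return issues


prefixes = {"dcat": "http://www.w3.org/ns/dcat#"}
ok = {"subject": "catalog", "predicate": "dcat:dataset", "object": "dataset"}
broken = {"subject": "catalog", "predicate": "foo:dataset"}

print(validate_relation(ok, prefixes))
print(validate_relation(broken, prefixes))
```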
A dataset contains multiple files with different access restrictions:
- Some files are public (restricted=false)
- Some files are restricted (restricted=true)
Requirement: Dataset-level accessRights should reflect the most restrictive file access level.
This pattern demonstrates handling aggregation of file-level properties to dataset level without adding new DSL features. The key principle is that data governance decisions belong with administrators, not automation logic.
Each distribution's rights are automatically derived from its file's restricted flag:
# Each file's restricted boolean is directly mapped
nodes.rights.kind = iri
nodes.rights.type = dct:RightsStatement
nodes.rights.iri.json = $.restricted
nodes.rights.map.true = http://publications.europa.eu/resource/authority/access-right/RESTRICTED
nodes.rights.map.false = http://publications.europa.eu/resource/authority/access-right/PUBLIC
props.rights.predicate = dct:rights
props.rights.as = node-ref
props.rights.node = rights
The dataset's accessRights are set through administrator configuration via metadata:
# Dataset rights are configured by administrator via metadata field
# Admin must set DCATaccessRights to match the most restrictive file level
nodes.ar.kind = iri
nodes.ar.type = dct:RightsStatement
nodes.ar.iri.json = $..DCATaccessRights
nodes.ar.map.public = http://publications.europa.eu/resource/authority/access-right/PUBLIC
nodes.ar.map.restricted = http://publications.europa.eu/resource/authority/access-right/RESTRICTED
nodes.ar.map.non-public = http://publications.europa.eu/resource/authority/access-right/NON_PUBLIC
props.accessRights.predicate = dct:accessRights
props.accessRights.as = node-ref
props.accessRights.node = ar
- Distribution-level rights are automatic: Each file's restricted boolean determines its distribution's rights
- Dataset-level rights are manual: Administrator/curator explicitly configures metadata to reflect governance policy
- Each can differ appropriately: DCAT-AP-NL 3.0 treats Dataset and Distribution as separate concerns
- Mapping system stays focused: Configuration-driven mapping, not business logic
- Publish dataset with files (some may be access-restricted)
- Set DCATaccessRights metadata field:
  - If any file is restricted → select "restricted"
  - If all files are public → select "public"
- Export to DCAT
- Mapping reads configured metadata and outputs it
- Each distribution has its own rights (per file)
- Dataset has aggregate rights (per admin configuration)
✅ DCAT-AP-NL 3.0 Compliant: Treats Dataset and Distribution as separate concerns
✅ Governance Best Practice: Access policy decisions belong with data stewards, not automation
✅ Mapping System Simple: Remains focused on configuration, not business logic
✅ Extensible: Organizations can later add pre-processing at Dataverse level if needed
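For organizations that do want a pre-processing aid, the selection rule from the workflow above (any restricted file means "restricted", otherwise "public") is simple to express. The sketch below is a hypothetical helper and is not part of the exporter or the mapping DSL; it could, for example, help a curator check the DCATaccessRights value before publishing. Treating a dataset without files as "public" is an assumption.

```python
def suggested_access_rights(files):
    """Suggest a DCATaccessRights value from the files' restricted flags."""
    # any() is False for an empty list, so a dataset without files yields "public"
    # (an assumption; the document does not cover that case).
    return "restricted" if any(f.get("restricted") for f in files) else "public"


mixed = [{"restricted": False}, {"restricted": True}]
all_public = [{"restricted": False}, {"restricted": False}]

print(suggested_access_rights(mixed))
print(suggested_access_rights(all_public))
```

The final decision still belongs with the data steward; the helper only proposes the most restrictive level.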
See test case: src/test/java/io/gdcc/spi/export/dcat3/Issue49DatasetAccessRightsTest.java
See detailed explanation: ISSUE_49_SOLUTION.md
This mechanism is designed to be declarative, composable, and profile-friendly for DCAT/DCAT‑AP exports.
When adding a new Application Profile (AP), such as DCAT‑AP‑DE, DCAT‑AP‑NO or an organisation‑specific profile, place your mapping files and test fixtures in:
application_profiles/
  <profile_name>/
    mapping/
      dcat-root.properties
      dcat-dataset.properties
      dcat-distribution.properties
      ...
    README.md # purpose, scope, external spec links
src/test/resources/application_profiles/
  <profile_name>/input/ # export_data_source_*.json fixtures
  <profile_name>/expected/ # optional expected RDF outputs (not required); order in RDF is non-deterministic
You can add a test case in the same way as was done for the NL profile.
- Unit tests use local JSON fixtures combined with the mapping files in the profile.
- Integration tests load the Application Profile via the JVM system property:
-Ddataverse.dcat3.config=/path/to/profile/mapping/dcat-root.properties
- Keeps all Application Profiles self‑contained.
- Allows multiple national/organisational profiles to coexist without clashes.
- Ensures test‑only data does not pollute production mappings.