Mandatory vs optional metadata & validation behaviour (engine + profiles) #8

@sjaakd

Description

This issue tracks how the exporter handles mandatory vs optional metadata across:

  • mapping configuration (.properties),
  • runtime input data from Dataverse,
  • and validation profiles (e.g. DCAT-AP-NL vs local/lightweight SHACL).

✅ Progress / implemented (engine hygiene)

We implemented engine-level guards to avoid producing empty RDF resources when source values are empty/null/placeholder-like:

  • For node-ref where the target node is kind=iri, the node is only materialised when a valid non-empty IRI can be resolved (no blank-node fallback).
  • For node-ref where the target node is kind=bnode, the node is only emitted when at least one nested property produces a value; typed-only empty bnodes are suppressed.

This prevents outputs such as <eli:LegalResource/>, <skos:Concept/>, etc. in RDF/XML when JSONPaths return empty values. (This also covers the problem described in #34.)
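The two guards above can be sketched roughly as follows. This is an illustration in Python, not the exporter's actual code: the `Node` class, the `materialise` function, the placeholder list, and the rough IRI check are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlparse

# Assumed set of placeholder-like values; the real engine may use a different list.
PLACEHOLDERS = {"", "n/a", "none", "null", "-"}


def clean(value):
    """Return a stripped string, or None for empty/null/placeholder-like input."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in PLACEHOLDERS else text


def is_valid_iri(text):
    """Rough check only: needs a scheme, something after it, and no whitespace."""
    if any(ch.isspace() for ch in text):
        return False
    parts = urlparse(text)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


@dataclass
class Node:
    kind: str                      # "iri" or "bnode"
    rdf_type: str
    iri: Optional[str] = None
    properties: dict = field(default_factory=dict)


def materialise(kind, rdf_type, iri_value=None, nested=None):
    """Return a Node, or None when the node-ref target should be suppressed."""
    props = {}
    for name, value in (nested or {}).items():
        cleaned = clean(value)
        if cleaned is not None:
            props[name] = cleaned

    if kind == "iri":
        iri = clean(iri_value)
        if iri is None or not is_valid_iri(iri):
            return None            # no blank-node fallback
        return Node("iri", rdf_type, iri, props)

    if kind == "bnode":
        if not props:
            return None            # typed-only empty bnode is suppressed
        return Node("bnode", rdf_type, None, props)

    raise ValueError(f"unknown node kind: {kind}")
```

With this sketch, `materialise("bnode", "eli:LegalResource", nested={"dct:title": None})` returns `None`, so no empty `<eli:LegalResource/>` would be serialised.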

✅ Validation status

  • Official DCAT-AP-NL 3.0 compliance validation still succeeds.

🧩 Remaining work / design decisions

  1. Define where "mandatory" is enforced
    "Mandatory" can come from three places that need to be kept apart: the standard/profile (DCAT-AP-NL, etc.), local policy, and what Dataverse provisioning actually delivers.

  2. Runtime validation should not block exports
    Missing values can occur due to Dataverse configuration/provisioning or workflow. Export should continue and emit a warning report instead of aborting.

  3. Optional enhancement: declare cardinality in mapping
    Consider adding optional mapping metadata (e.g. min/max cardinality per property/node) to drive warnings and documentation, without forcing hard errors by default.
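As one possible shape for item 3, the sketch below reads cardinality hints from a .properties-style fragment and turns violations into warnings. The `.min` / `.max` key suffixes and the property names are hypothetical; they are not existing exporter syntax. It is written in Python purely for illustration.

```python
# Hypothetical mapping fragment: the ".min" / ".max" suffixes are an assumption.
MAPPING = """
# cardinality hints (optional)
dataset.title.min = 1
dataset.title.max = 1
dataset.keyword.min = 0
dataset.contactPoint.min = 1
"""


def parse_cardinalities(text):
    """Collect {property: (min, max)}; max is None when unbounded."""
    bounds = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        name, _, suffix = key.strip().rpartition(".")
        if suffix not in ("min", "max"):
            continue               # not a cardinality entry
        low, high = bounds.get(name, (0, None))
        if suffix == "min":
            low = int(value.strip())
        else:
            high = int(value.strip())
        bounds[name] = (low, high)
    return bounds


def check(bounds, values):
    """Return warning strings; never raises for missing or surplus values."""
    warnings = []
    for name, (low, high) in sorted(bounds.items()):
        count = len(values.get(name, []))
        if count < low:
            warnings.append(f"{name}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            warnings.append(f"{name}: expected at most {high}, found {count}")
    return warnings
```

Because `check` only returns strings, the caller decides whether they end up in a warning report or, in a stricter setup, cause a failure.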

Proposed approach

  • Keep configuration validation as hard errors (invalid mapping files should abort export).
  • Treat runtime data validation as warnings (non-blocking), with an optional "strict mode" to fail in CI if desired.
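A minimal sketch of this split, in Python for illustration only. The function names, the exception classes, and the `strict` flag are assumptions; the issue does not define an API. Records are simplified to plain dictionaries keyed by property name.

```python
class MappingConfigError(Exception):
    """Raised for an invalid mapping file; always aborts the export."""


class StrictModeError(Exception):
    """Raised only in strict mode when runtime warnings were collected."""


def validate_config(mapping):
    """Configuration validation: a hard error regardless of mode."""
    for key, value in mapping.items():
        if not value.strip():
            raise MappingConfigError(f"mapping entry '{key}' has no value")


def run_export(mapping, record, required, strict=False):
    """Export what is available and return (output, warnings)."""
    validate_config(mapping)

    # Runtime data validation: collect warnings instead of aborting.
    warnings = [
        f"missing value for '{name}'"
        for name in required
        if not record.get(name)
    ]
    output = {name: record[name] for name in mapping if record.get(name)}

    # Optional strict mode, e.g. for CI: promote warnings to a failure.
    if strict and warnings:
        raise StrictModeError("; ".join(warnings))
    return output, warnings
```

In the default mode, a record without a licence still exports its title and the caller receives one warning; with `strict=True` the same input raises `StrictModeError`.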

Important: keep exporter generic

The exporter must remain reusable for other organisations. Therefore:

  • engine-level behaviour focuses on RDF hygiene (e.g. no empty typed nodes, no node-ref targets without resolvable IRIs);
  • mandatory/optional expectations are treated as profile/policy, delivered via configurable shapes and/or mapping metadata (e.g. profiles/orgx/...), and validated as warnings by default (with optional strict mode).

Labels: enhancement (New feature or request)