The cdi-viewer now supports full SPARQL-based SHACL validation via shacl-engine, including sh:SPARQLTarget and sh:SPARQLConstraint features. This means CDIF Discovery shapes can use SPARQL-based targeting for hierarchical node selection without conversion.
Historical Note: The conversion patterns documented below are preserved for reference, showing how to adapt SPARQL-based shapes for Core SHACL-only validators. With shacl-engine's SPARQL support, these conversions are no longer necessary for this viewer. They remain useful for:
- Understanding Core SHACL alternatives
- Supporting environments without SPARQL engines
- Educational purposes
To validate CDIF Discovery metadata in the viewer:
- Open the viewer at https://libis.github.io/cdi-viewer/
- Load your CDIF Discovery metadata (JSON-LD format)
- Select or load your CDIF Discovery SHACL shapes
- Click "Validate" to see validation results
Supported features:
- SPARQL targeting: Full support for sh:SPARQLTarget
- SPARQL constraints: Full support for sh:SPARQLConstraint
- Hierarchical validation: Validate nested structures with SPARQL queries
- Standard SHACL: All Core SHACL features also supported
Validation Engine: shacl-engine
- Includes SPARQL query engine
- Supports complex SPARQL-based constraints
- Browser-compatible (1.2MB bundle size)
Bundle Size Impact:
- Total bundle: 1.2MB (includes SPARQL support)
- Trade-off: Larger bundle for complete SHACL feature support
- Worth it: Enables validation of complex metadata structures
Test with the CDIF Discovery metadata files in the examples/cdi/ directory:
- se_na2so4-XDI-CDI-CDIF.jsonld - X-ray spectroscopy dataset
- FeXAS_Fe_c3d.001-NEXUS-HDF5-cdi-CDIF.jsonld - NEXUS HDF5 dataset
Both use schema:Dataset as root type and work with SPARQL-based CDIF Discovery shapes.
Previous versions required converting SPARQL-based shapes to Core SHACL. This is no longer necessary - use your original SPARQL-based shapes directly.
- N3.js (RDF parsing): ~150KB
- jsonld.js (JSON-LD processing): ~130KB
- Previous setup with Comunica: ~2.3MB total
  - Comunica QueryEngine: 1.9MB (for sh:SPARQLTarget support only)
  - Plus all the Core SHACL libraries above
  - Still no sh:SPARQLConstraint validation support
What we tried:
- ✅ Comunica can execute SPARQL queries to find nodes matching sh:SPARQLTarget
- ❌ Comunica cannot validate SHACL constraints
- ❌ rdf-validate-shacl (the JavaScript SHACL validator) does not support sh:SPARQLConstraint
- ❌ No other JavaScript library found that validates SPARQL-based SHACL constraints
Adding Comunica for sh:SPARQLTarget support would increase the download size by 5-6x, significantly slowing down the page load for all users, just to support hierarchical node targeting that Core SHACL can approximate.
Technical reality:
- SPARQL engines are complex (query parsing, optimization, execution)
- Comunica (the leading JavaScript SPARQL engine) is 1.9MB minified
- SHACL validation with SPARQL constraints requires a different tool
- Most SHACL shape files (including DDI-CDI Official) use Core SHACL only
- Core SHACL provides sufficient expressiveness for validation in most cases
Our decision at the time: We removed SPARQL support to keep the previewer fast and lightweight. The 1.9MB cost for hierarchical node selection was not justified when Core SHACL alternatives worked well for real-world use cases.
When converting SPARQL-based shapes to Core SHACL, use these patterns:
Instead of:
sh:target [
a sh:SPARQLTarget ;
sh:select """
PREFIX schema: <http://schema.org/>
SELECT DISTINCT ?this WHERE {
?this a schema:Dataset .
}
""" ;
]
Use:
sh:targetClass schema:Dataset ;
This is simpler, more efficient, and functionally equivalent for most cases.
Instead of:
sh:sparql [
a sh:SPARQLConstraint ;
sh:select """
SELECT $this WHERE {
$this schema:creator ?list .
?list rdf:rest*/rdf:first ?item .
FILTER NOT EXISTS { ?item a ?type }
}
""" ;
]
Use:
# Validate RDF list structure recursively
ex:RDFListOfAgentsShape
a sh:NodeShape ;
sh:targetClass rdf:List ;
sh:property [
sh:path rdf:first ;
sh:or (
[ sh:class schema:Person ]
[ sh:class schema:Organization ]
) ;
] ;
sh:property [
sh:path rdf:rest ;
sh:or (
[ sh:hasValue rdf:nil ] # End of list
[ sh:node ex:RDFListOfAgentsShape ] # Continue recursively
)
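] .
This Core SHACL pattern validates lists of any length and works in both browser and server environments, provided the validator supports recursive shapes. The SHACL specification leaves the handling of recursive shapes to each implementation, so check this for the validator you use.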
We created cdif-core.ttl as a browser-compatible alternative to the SPARQL-based CDIF Discovery shapes. This file is available in the previewer as the "CDIF Discovery Core" option.
Original CDIF Discovery shapes (rules.shacl):
- Used sh:SPARQLTarget to select nodes hierarchically
- 2 shapes: CDIFDatasetMandatoryShape and CDIFMetaMetadataShape
- 4 mandatory properties: identifier, name, license or conditionsOfAccess, dateModified
Our Core SHACL version (previewers/betatest/shapes/cdif-core.ttl):
- Converted sh:SPARQLTarget to sh:targetClass schema:Dataset and sh:targetSubjectsOf schema:about
- Added CDIFDatasetRecommendedShape with 16 additional properties
- Total: 20 properties validated:
  - 4 mandatory (severity: Violation): identifier, name, license/conditionsOfAccess, dateModified
  - 16 recommended (severity: Warning): url, description, contributor, creator, keywords, distribution, measurementTechnique, variableMeasured, subjectOf, startDate, location, mainEntity, additionalProperty, relatedLink, additionalType, email
- Namespace correction: Used http://schema.org/ (not https://)
  - schema.org's canonical namespace uses http:// protocol
  - This fixed property recognition in the UI (properties now show as "SHACL-defined" instead of "EXTRA")
  - All example files updated to use consistent http:// namespace
- Property classification bug fix: Fixed array context handling in cdi-shacl-helpers.js
  - Problem: Code only checked context[prefix] directly, which failed when @context is an array
  - Solution: Iterate through array contexts to find prefix mappings
  - Result: Properties now correctly classified with blue badges (SHACL-defined) vs yellow badges (EXTRA)
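The array-context fix can be illustrated with a minimal sketch. The function names `naiveLookup` and `lookupPrefix` and the remote context URL are hypothetical and chosen for illustration; the actual code in cdi-shacl-helpers.js may be structured differently. The sketch also assumes prefix mappings are plain strings, not expanded term definitions.

```javascript
// Hypothetical helpers for illustration; not the actual cdi-shacl-helpers.js code.

// Naive lookup: only works when @context is a single plain object.
function naiveLookup(context, prefix) {
  return context ? context[prefix] : undefined;
}

// Array-aware lookup: scan every entry of the context.
// String entries are remote context URLs and carry no inline mappings.
function lookupPrefix(context, prefix) {
  const entries = Array.isArray(context) ? context : [context];
  let found;
  for (const entry of entries) {
    if (entry && typeof entry === 'object' && typeof entry[prefix] === 'string') {
      found = entry[prefix]; // later entries override earlier ones, as in JSON-LD
    }
  }
  return found;
}

const context = [
  'https://example.org/remote-context.jsonld', // placeholder URL
  { schema: 'http://schema.org/' },
];

console.log(naiveLookup(context, 'schema'));  // undefined
console.log(lookupPrefix(context, 'schema')); // http://schema.org/
```

With the naive version, every schema.org property in a document with an array @context fails to resolve, which is consistent with properties being shown as "EXTRA".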
Benefits of Core SHACL approach:
- ✅ Fast loading: ~400KB vs 2.3MB (5-6x smaller)
- ✅ Enhanced coverage: Expanded from 4 to 20 properties
- ✅ Browser compatibility: Works everywhere without heavyweight dependencies
- ✅ Maintainability: Simple, readable SHACL patterns
- ✅ Validation quality: Same mandatory property checking
Limitations compared to SPARQL approach:
- Direct class targeting (sh:targetClass schema:Dataset) instead of hierarchical selection
- Dataset subclasses (e.g., schema:MedicalDataset) would need explicit shapes, unless the data graph itself contains the rdfs:subClassOf triples that sh:targetClass follows
- In practice: This rarely matters since most files use schema:Dataset directly
Bottom line: The Core SHACL version provides equivalent validation for real-world use cases while being dramatically faster to load.
The CDI previewer provides four shape selection options:
- DDI-CDI Official (Default) - Full DDI-CDI 1.0 shapes from ddi-cdi.github.io
  - 300+ types covered
  - Core SHACL only (no SPARQL)
  - Comprehensive validation
- CDIF Discovery Core - Browser-compatible CDIF Discovery shapes
  - 20 schema.org properties (4 mandatory + 16 recommended)
  - Converted from SPARQL-based shapes
  - Lightweight and fast
- Local Fallback - Embedded backup shapes
  - Used if online shapes fail to load
  - Core SHACL only
- Custom URL - Load shapes from any URL
  - Must use Core SHACL only
  - SPARQL features will not work
Problem: When using CDIF Discovery Core shapes, all properties were marked as "EXTRA" instead of being recognized.
Root Cause: The CDIF shapes use named property shape references (e.g., sh:property cdifd:nameProperty). When resolving these references, the code was passing propertyShapeRef.value (a string URI) to N3.Store.getQuads() instead of the term object itself. N3.js requires term objects, not strings, so the lookup failed.
Fix: Changed line 281 in cdi-shacl-helpers.js:
// Before (broken):
pathQuads = shaclShapesStore.getQuads(propertyShapeRef.value, ...)
// After (fixed):
pathQuads = shaclShapesStore.getQuads(propertyShapeRef, ...)
Result: CDIF properties are now correctly recognized with blue "SHACL-defined" badges.
Problem: Context handling code was duplicated across multiple files, fragile, and produced confusing warnings like "No context for prov, using DDI-CDI".
Root Cause:
- Context resolution logic was copy-pasted in 4 different files
- Each implementation handled arrays/objects differently
- Failed to gracefully handle external ontology prefixes (like prov:)
- No fallback when external contexts failed to load
Fix: Created centralized context resolution utilities in cdi-json-ld-helpers.js:
- resolvePrefix(context, prefix) - Safely resolves a prefix to namespace URI
  - Handles string, object, and array contexts uniformly
  - Falls back to cached local DDI-CDI context
  - Returns null for unknown prefixes (no false warnings)
- expandCompactIri(context, compactIri) - Expands compact IRIs like "schema:Dataset"
  - Uses resolvePrefix internally
  - Checks if already a full URI first
  - Returns null if can't expand (caller decides how to handle)
- loadLocalContext() - Loads and caches local DDI-CDI context
  - Provides fallback when external contexts fail
  - Called at initialization (non-blocking)
- Updated document loader in cdi-shacl-loader.js:
  - Try working URL first
  - Fall back to local shapes/ddi-cdi.jsonld
  - Add 10-second timeout for external contexts
  - Return empty context instead of failing completely
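As a rough illustration of how the two resolver functions fit together, here is a minimal sketch. It reuses the function names from the list above, but the bodies are assumptions, not the code in cdi-json-ld-helpers.js. In particular, `localFallbackContext` and its `cdi` mapping are placeholders standing in for the cached local DDI-CDI context, and the full-URI check only recognizes http, https, and urn schemes.

```javascript
// Placeholder for the cached local DDI-CDI context (assumed shape and values).
const localFallbackContext = { cdi: 'http://example.org/ddi-cdi/' };

// Resolve a prefix to a namespace URI, or null if it is unknown.
function resolvePrefix(context, prefix) {
  const entries = Array.isArray(context) ? context : [context];
  let namespace = null;
  for (const entry of entries) {
    // String entries are remote context URLs; they hold no inline mappings.
    if (entry && typeof entry === 'object' && typeof entry[prefix] === 'string') {
      namespace = entry[prefix];
    }
  }
  if (namespace !== null) return namespace;
  // Fall back to the local context before giving up.
  const fallback = localFallbackContext[prefix];
  return typeof fallback === 'string' ? fallback : null;
}

// Expand "prefix:local" to a full IRI, or return null if the prefix is unknown.
function expandCompactIri(context, compactIri) {
  // Already a full URI: return it unchanged.
  if (/^(https?|urn):/.test(compactIri)) return compactIri;
  const colon = compactIri.indexOf(':');
  if (colon < 1) return null; // no prefix part to resolve
  const namespace = resolvePrefix(context, compactIri.slice(0, colon));
  return namespace === null ? null : namespace + compactIri.slice(colon + 1);
}

const ctx = ['https://example.org/remote-context.jsonld', { schema: 'http://schema.org/' }];
console.log(expandCompactIri(ctx, 'schema:Dataset')); // http://schema.org/Dataset
console.log(expandCompactIri(ctx, 'prov:Entity'));    // null
```

Returning null instead of guessing a namespace leaves the decision to the caller, which is what avoids misleading warnings for external ontologies such as prov.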
Updated files:
- cdi-json-ld-helpers.js - Added centralized resolver functions
- cdi-shacl-helpers.js - Replaced 2 instances of context resolution
- cdi-graph-helpers.js - Replaced 1 instance
- property-suggestions.js - Replaced 1 instance
- cdi-shacl-loader.js - Enhanced document loader with fallbacks
- core.js - Added call to pre-load local context
Benefits:
- ✅ Single source of truth for context resolution
- ✅ Graceful handling of external ontologies (prov, dcterms, etc.)
- ✅ Robust fallback to local contexts
- ✅ No more confusing "No context for X" warnings
- ✅ Simpler, more maintainable code
- ✅ Better handling of network failures
Result: Context resolution is now stable and won't break when:
- External contexts are unavailable
- Array contexts are used
- External ontologies (prov, dcterms) are referenced
- Network is slow or fails
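The timeout-and-fallback behaviour of the document loader could look roughly like the sketch below. The names `withTimeout` and `loadContextWithFallback` are hypothetical, and the loaders are passed in as functions so the idea does not depend on how cdi-shacl-loader.js actually fetches documents. The shape of the empty context returned at the end is also an assumption.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep running after the race.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Try each loader in order (for example: remote URL first, then a local copy).
// Returns the first context that loads, or an empty context if all fail.
async function loadContextWithFallback(loaders, timeoutMs = 10000) {
  for (const load of loaders) {
    try {
      return await withTimeout(load(), timeoutMs);
    } catch (err) {
      // Ignore the failure and fall through to the next loader.
    }
  }
  return { '@context': {} }; // assumed "empty context" shape
}
```

A caller would pass something like `[() => fetchRemote(url), () => fetchLocal('shapes/ddi-cdi.jsonld')]`, where both fetch functions are placeholders for whatever the loader uses. Because every failure path ends in a usable value, validation can continue with reduced prefix information when the network is down.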