# CDIF Discovery Core Shapes for Browser-Based Validation
-
## Overview
+
## Summary
-
The CDI previewer uses **Core SHACL only** for validation. SPARQL-based SHACL features like `sh:SPARQLTarget` and `sh:SPARQLConstraint` are not supported in the browser-based application.
+
We've created **cdif-core.ttl**, a browser-compatible implementation of the CDIF Discovery SHACL shapes for validating schema.org Dataset metadata. The shapes validate 20 properties (4 mandatory + 16 recommended) and work with the lightweight Core SHACL validator, avoiding the need for a 1.9MB SPARQL engine.
-
## Technical Details
+
**Quick start:** Select "CDIF Discovery Core" from the shape dropdown in the previewer to validate your CDIF metadata. Properties conforming to the shapes will display with blue "SHACL-defined" badges.
-
Browser-based applications have strict constraints on bundle size and performance. Supporting SPARQL features like `sh:SPARQLTarget` and `sh:SPARQLConstraint` would require including a full SPARQL query engine in the browser.
+
## Background: CDIF Discovery Validation
+
+
CDIF Discovery shapes validate schema.org Dataset descriptions to ensure they contain essential metadata for data discovery. The original CDIF Discovery shapes used SPARQL-based SHACL features (`sh:SPARQLTarget`) for hierarchical node selection.
+
+
## How to Use
+
+
1. Open the CDI previewer with your JSON-LD file
+
2. Select **"CDIF Discovery Core"** from the shape dropdown
+
3. Click "Validate"
+
4. Review results:
+
- Red violations: Missing mandatory properties
+
- Orange warnings: Missing recommended properties
+
- Blue badges: SHACL-defined properties present
+
- Yellow badges: Extra properties not in shapes
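The red/orange distinction in the list above corresponds to SHACL severities. The following is a minimal sketch of how mandatory and recommended properties could be expressed in Core SHACL; it is not copied from cdif-core.ttl. The `ex:` namespace, the shape name, the `http://schema.org/` namespace form, and the choice of `schema:description` as a recommended property are all assumptions made for illustration.

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix ex:     <http://example.org/shapes#> .   # hypothetical namespace

# Hypothetical shape; not taken from cdif-core.ttl
ex:SeveritySketchShape
    a sh:NodeShape ;
    sh:targetClass schema:Dataset ;

    # Mandatory: absence is reported as a violation
    # (sh:Violation is also the default when sh:severity is omitted)
    sh:property [
        sh:path schema:name ;
        sh:minCount 1 ;
        sh:severity sh:Violation ;
        sh:message "Mandatory property schema:name is missing." ;
    ] ;

    # Recommended (assumed example property): absence is reported as a warning
    sh:property [
        sh:path schema:description ;
        sh:minCount 1 ;
        sh:severity sh:Warning ;
        sh:message "Recommended property schema:description is missing." ;
    ] .
```

Both constraints use only Core SHACL terms, so a validator without a SPARQL engine can evaluate them.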
+
+
## Testing Notes
+
+
**Status:** Ready for testing with CDIF Discovery metadata files
+
+
**Expected results:**
+
- Properties like `name`, `identifier`, `license`, `dateModified` should show blue "SHACL-defined" badges
+
- Missing mandatory properties trigger red violation messages
**Known good test:** `examples/cdi/se_na2so4-XDI-CDI-CDIF.jsonld` validates correctly with recognized properties
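For a quick manual test, a dataset description carrying the properties named above might look roughly like the sketch below. It is written in Turtle for consistency with the shape snippets in this document; the previewer itself loads JSON-LD, which would carry the same triples. The dataset IRI, all values, and the `xsd:date` datatype are hypothetical and are not taken from the known good test file.

```turtle
@prefix schema: <http://schema.org/> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical dataset; every value here is a placeholder
<https://example.org/dataset/123>
    a schema:Dataset ;
    schema:identifier   "https://example.org/dataset/123" ;
    schema:name         "Example dataset" ;
    schema:license      <https://creativecommons.org/licenses/by/4.0/> ;
    schema:dateModified "2024-01-15"^^xsd:date .
```

Removing one of the four properties from a copy of such a file is a simple way to check that the corresponding violation message appears.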
+
+
---
+
+
## Technical Reference: Why Core SHACL Only?
+
+
The CDI previewer uses **Core SHACL only** for validation. SPARQL-based SHACL features like `sh:SPARQLTarget` and `sh:SPARQLConstraint` are not supported.
+
+
**Important distinction:** The previewer previously used Comunica to execute `sh:SPARQLTarget` queries that identify which nodes to validate, but Comunica does **not perform SHACL validation**; it only executes SPARQL queries. We have not found a JavaScript SHACL validation library that supports SPARQL constraints (`sh:SPARQLConstraint`).
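To make the unsupported feature concrete, here is a sketch of what a `sh:SPARQLConstraint` looks like. The rule itself (a dataset's `dateModified` must not precede its `datePublished`) is a hypothetical example and is not part of the CDIF Discovery shapes; the `ex:` namespace and shape name are likewise assumptions. Full IRIs are used inside the query so that no `sh:prefixes` declaration is needed.

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix ex:     <http://example.org/shapes#> .   # hypothetical namespace

# Hypothetical shape illustrating the SPARQL-based constraint syntax
ex:ModifiedAfterPublishedSketch
    a sh:NodeShape ;
    sh:targetClass schema:Dataset ;
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "dateModified must not be earlier than datePublished." ;
        # Each result row of this query is reported as one violation for $this
        sh:select """
            SELECT $this ?value
            WHERE {
                $this <http://schema.org/datePublished> ?published ;
                      <http://schema.org/dateModified>  ?value .
                FILTER (?value < ?published)
            }
            """ ;
    ] .
```

The `sh:targetClass` line is Core SHACL, but the `sh:sparql` block needs a validator that can run the embedded query, which is the capability the previewer does not have.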
+
+
### Bundle Size Comparison
+
+
Browser-based applications have strict constraints on bundle size and performance. Even limited SPARQL support for node targeting would require including a full SPARQL query engine in the browser.
**The numbers:**
- **Current setup (Core SHACL only)**: ~400KB total
- Comunica QueryEngine: **1.9MB** (just for SPARQL!)
+
- **Previous setup with Comunica**: ~2.3MB total
+
- Comunica QueryEngine: **1.9MB** (for `sh:SPARQLTarget` support only)
- Plus all the Core SHACL libraries above
+
- Still no `sh:SPARQLConstraint` validation support
+
+
**What we tried:**
+
- ✅ Comunica can execute SPARQL queries to find nodes matching `sh:SPARQLTarget`
+
- ❌ Comunica cannot validate SHACL constraints
+
- ❌ rdf-validate-shacl (the JavaScript SHACL validator) does not support `sh:SPARQLConstraint`
+
- ❌ No other JavaScript library found that validates SPARQL-based SHACL constraints
-
Adding SPARQL would **increase the download size by 5-6x**, significantly slowing down the page load for all users, just to support a niche feature that Core SHACL can handle equally well.
+
Adding Comunica for `sh:SPARQLTarget` support would **increase the download size by 5-6x**, significantly slowing down the page load for all users, just to support hierarchical node targeting that Core SHACL can approximate.
**Technical reality:**
- SPARQL engines are complex (query parsing, optimization, execution)
- Comunica (the leading JavaScript SPARQL engine) is 1.9MB minified
+
- SHACL validation with SPARQL constraints requires a different tool
- Most SHACL shape files (including DDI-CDI Official) use Core SHACL only
-
- Core SHACL provides sufficient expressiveness for validation
+
- Core SHACL provides sufficient expressiveness for validation in most cases
-
**Our decision:** We've removed SPARQL support from the CDI previewer to keep it fast and lightweight for all users.
+
**Our decision:** We removed SPARQL support to keep the previewer fast and lightweight. The 1.9MB cost for hierarchical node selection isn't justified when Core SHACL alternatives work well for real-world use cases.
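As one illustration of what Core SHACL can express without SPARQL, an either/or requirement such as "`license` or `conditionsOfAccess`" can be written with `sh:or`. This is a sketch under assumptions, not the contents of cdif-core.ttl: the `ex:` namespace, the shape name, and the `http://schema.org/` namespace form are hypothetical.

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix ex:     <http://example.org/shapes#> .   # hypothetical namespace

# Hypothetical shape: a Dataset conforms if at least one alternative holds
ex:LicenseOrConditionsSketch
    a sh:NodeShape ;
    sh:targetClass schema:Dataset ;
    sh:or (
        [ sh:path schema:license ;            sh:minCount 1 ]
        [ sh:path schema:conditionsOfAccess ; sh:minCount 1 ]
    ) ;
    sh:message "Provide schema:license or schema:conditionsOfAccess." .
```

Each member of the `sh:or` list is an anonymous property shape, so the whole rule stays within Core SHACL.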
-
## Core SHACL Alternatives to SPARQL Features
+
### Conversion Patterns
-
If you have SHACL shapes that use SPARQL features, here are the Core SHACL patterns that achieve the same goals:
-
### 1. Node Selection: Use `sh:targetClass` instead of `sh:SPARQLTarget`
+
When converting SPARQL-based shapes to Core SHACL, use these patterns:
+
+
#### Pattern 1: Node Selection with `sh:targetClass`
**Instead of:**
```turtle
@@ -52,9 +100,9 @@ sh:target [
sh:targetClass schema:Dataset ;
```
-
This is simpler, more efficient, and functionally equivalent.
+
This is simpler, more efficient, and functionally equivalent for most cases.
-
### 2. RDF List Validation: Use `sh:node` with recursion instead of `sh:SPARQLConstraint`
+
#### Pattern 2: RDF List Validation with `sh:node`
**Instead of:**
```turtle
@@ -94,19 +142,70 @@ ex:RDFListOfAgentsShape
This Core SHACL pattern validates lists of any length and works in both browser and server environments.
+
## CDIF Discovery Core Shapes
+
+
We created **cdif-core.ttl** as a browser-compatible alternative to the SPARQL-based CDIF Discovery shapes. This file is available in the previewer as the "CDIF Discovery Core" option.
+
+
### What We Converted
+
+
**Original CDIF Discovery shapes** (rules.shacl):
+
- Used `sh:SPARQLTarget` to select nodes hierarchically
+
- 2 shapes: `CDIFDatasetMandatoryShape` and `CDIFMetaMetadataShape`
+
- 4 mandatory properties: `identifier`, `name`, `license` or `conditionsOfAccess`, `dateModified`