
Commit c4dcfb4

Merge branch 'feature/cdi-previewer'
2 parents e37f12a + 9bfe5bf commit c4dcfb4

6 files changed

Lines changed: 510 additions & 9 deletions

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# `cdif_example.jsonld` – Minimal CDIF Discovery Example

This file is a **small, readable CDIF Discovery Core example** that mirrors the way we expect CDIF to be used with `schema.org` datasets and the CDIF Discovery Core SHACL shapes.

It is intended as a reference example for Steve and others when aligning shapes and JSON-LD instance documents.

## Structure

The example has a single `schema:Dataset` node:

- `@id`: a stable HTTPS URI for the dataset.
- `@type`: `schema:Dataset`.
- Core CDIF Discovery properties on the dataset:
  - `schema:name` – title of the dataset.
  - `schema:identifier` – local identifier string.
  - `schema:description` – human-readable description (now explicitly shaped as `cdifd:descriptionProperty`).
  - `schema:creator` – a `schema:Person` with `schema:name`.
  - `schema:datePublished` – ISO 8601 date string (YYYY-MM-DD).
  - `schema:license` – IRI for the license.
  - `schema:keywords` – array of strings.
  - `schema:url` – landing page URL.
  - `schema:distribution` – a `schema:DataDownload` with `schema:name`, `schema:contentUrl`, `schema:encodingFormat`.
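
The structure listed above can be sketched as a JavaScript object literal, which is how the previewer would see the document after parsing. This is an illustration only: every identifier, URL, name, and date below is a made-up placeholder, not the content of the real `cdif_example.jsonld`.

```javascript
// Hypothetical instance document. All values are placeholders chosen for
// illustration; only the property names and the "schema" prefix come from
// the description above.
const exampleDoc = {
  "@context": { "schema": "https://schema.org/" },
  "@graph": [
    {
      "@id": "https://example.org/dataset/demo-001",
      "@type": "schema:Dataset",
      "schema:name": "Demo dataset",
      "schema:identifier": "demo-001",
      "schema:description": "A short human-readable description.",
      "schema:creator": { "@type": "schema:Person", "schema:name": "Jane Doe" },
      "schema:datePublished": "2024-01-15",
      // An object with only "@id" is how JSON-LD expresses an IRI value.
      "schema:license": { "@id": "https://creativecommons.org/licenses/by/4.0/" },
      "schema:keywords": ["demo", "CDIF"],
      "schema:url": "https://example.org/dataset/demo-001/landing",
      "schema:distribution": {
        "@type": "schema:DataDownload",
        "schema:name": "CSV download",
        "schema:contentUrl": "https://example.org/files/demo-001.csv",
        "schema:encodingFormat": "text/csv"
      }
    }
  ]
};

// List the schema.org properties carried by the single dataset node.
const dataset = exampleDoc["@graph"][0];
console.log(Object.keys(dataset).filter((key) => key.startsWith("schema:")));
```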

These correspond directly to CDIF Discovery Core shapes in `CDIF-Discovery-Core-Shapes.ttl`:

- `cdifd:resourceIdentifierProperty` → `schema:identifier`.
- `cdifd:nameProperty` → `schema:name`.
- `cdifd:descriptionProperty` → `schema:description`.
- `cdifd:responsiblePartyProperty` → `schema:creator`.
- `cdifd:datePublishedProperty` → `schema:datePublished`.
- `cdifd:rightsProperty` → `schema:license` (via the `license / conditionsOfAccess` alternative path).
- `cdifd:keywordsResourceProperty` → `schema:keywords`.
- `cdifd:getResourceProperty` → `schema:url` / `schema:distribution`.
- `cdifd:distributionProperty` → `schema:distribution`.

Because the example uses `"schema": "https://schema.org/"` in its `@context`, the expanded IRIs are exactly:

- `https://schema.org/name`
- `https://schema.org/identifier`
- `https://schema.org/description`
- etc.
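
A minimal sketch of that expansion, assuming a context that only contains string-valued prefix definitions. The helper name `expandCompactIri` is hypothetical; a real JSON-LD processor handles many more cases (term definitions, `@vocab`, nested contexts).

```javascript
// Expand a compact IRI such as "schema:name" using the prefix map from
// @context. Simplified sketch: anything without a known prefix is returned
// unchanged, so absolute IRIs pass through as-is.
function expandCompactIri(term, context) {
  const colon = term.indexOf(":");
  if (colon < 0) return term;
  const base = context[term.slice(0, colon)];
  return typeof base === "string" ? base + term.slice(colon + 1) : term;
}

const context = { schema: "https://schema.org/" };
const expanded = expandCompactIri("schema:name", context);

console.log(expanded); // https://schema.org/name
// Matching is plain string equality, so the scheme matters:
console.log(expanded === "http://schema.org/name"); // false
```

Because the comparison is exact, a predicate written with `http://schema.org/` does not match one written with `https://schema.org/`, even though both name the same vocabulary term.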

The CDIF shapes now also use **HTTPS schema.org** consistently, so SPARQL and SHACL can match these predicates exactly.

## How the previewer classifies fields

The CDI Previewer does the following:

1. **Normalize to `@graph`** if needed (here the file already has `@graph`).
2. **Expand JSON-LD** to get full IRIs for properties (`https://schema.org/name`, etc.).
3. **Run SPARQL targets** from CDIF Discovery shapes:
   - The `cdifd:CDIFDatasetRecommendedShape` has a `sh:SPARQLTarget` that selects all `schema:Dataset` instances.
4. **Classify properties** for each dataset node:
   - It finds the applicable NodeShape(s) (e.g. `cdifd:CDIFDatasetRecommendedShape`).
   - For each `sh:property` in that NodeShape, it compares the `sh:path` to the expanded property IRI.
   - A property that matches a `sh:path` is marked **REQUIRED** (if `sh:minCount > 0`) or **OPTIONAL**; a property that matches none is **EXTRA**.
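
Steps 1 and 4 above can be sketched as follows. This is not the previewer's actual code: `normalizeToGraph`, `classifyField`, and the plain-object `propertyShapes` view are hypothetical names, the `minCount` values are invented for the demo, and only simple predicate paths are handled (not alternative paths such as the one used for rights).

```javascript
// Step 1: wrap a single-node document in @graph if it is not already there.
function normalizeToGraph(doc) {
  if (Array.isArray(doc["@graph"])) return doc;
  const { "@context": context, ...node } = doc;
  return { "@context": context, "@graph": [node] };
}

// Step 4: compare an expanded property IRI against the sh:path of each
// property shape. `propertyShapes` is an assumed simplified view of the
// sh:property entries of one NodeShape: { path, minCount }.
function classifyField(propertyIri, propertyShapes) {
  const match = propertyShapes.find((shape) => shape.path === propertyIri);
  if (!match) return "EXTRA";
  return (match.minCount || 0) > 0 ? "REQUIRED" : "OPTIONAL";
}

// Invented cardinalities, for illustration only.
const shapes = [
  { path: "https://schema.org/name", minCount: 1 },
  { path: "https://schema.org/keywords" },
];

console.log(classifyField("https://schema.org/name", shapes));     // REQUIRED
console.log(classifyField("https://schema.org/keywords", shapes)); // OPTIONAL
console.log(classifyField("https://example.org/custom", shapes));  // EXTRA
```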

For `cdif_example.jsonld`, all of the core fields listed above show up as **blue** (SHACL-defined) in the previewer, with **REQUIRED** or **OPTIONAL** badges according to the CDIF Discovery shapes.

## How this relates to Steve's examples

Steve's richer CDI/XAS examples (`FeXAS_...jsonld`, `se_na2so4-...jsonld`) use the *same* schema.org properties on a `schema:Dataset` node:

- `schema:name`
- `schema:identifier`
- `schema:description`
- `schema:license`
- `schema:distribution`
- `schema:keywords`
- `schema:variableMeasured`
- `schema:subjectOf` / `dcterms:conformsTo`

The CDIF Discovery Core shapes now:

- Use HTTPS `https://schema.org/` everywhere.
- Select all `schema:Dataset` nodes via SPARQL (no root-only filter).
- Include `cdifd:descriptionProperty` for `schema:description`.
- Include `cdifd:variableMeasuredProperty` in the main dataset NodeShape, so `schema:variableMeasured` is SHACL-defined.

That means the **same properties** that are blue in this minimal example are the ones we would *like* to see as blue on Steve's datasets as well:

- Name, identifier, description, license, keywords, distribution, variableMeasured, etc.

## Feedback for Steve

When updating CDIF Discovery shapes and examples, this file demonstrates a few key points:

1. **Use HTTPS schema.org consistently**
   - In JSON-LD contexts: `"schema": "https://schema.org/"`.
   - In SHACL/Turtle: `@prefix schema: <https://schema.org/> .`
   - In SPARQL and `sh:prefixes`: always `https://schema.org/`.

2. **Don't filter out referenced datasets in SPARQL targets**
   - The original `NOT EXISTS { ?s ?p ?this . }` filter excluded any dataset that is referenced from elsewhere in the graph, which is common in realistic CDI examples.
   - Removing this filter lets CDIF Discovery target any `schema:Dataset` node, including those linked via `schema:subjectOf`, `schema:about`, etc.

3. **Model core metadata on the dataset using schema.org keys**
   - `schema:name`, `schema:identifier`, `schema:description`, `schema:license`, `schema:keywords`, `schema:distribution`, `schema:variableMeasured`.
   - These align directly with CDIF Discovery property shapes.

4. **Keep examples readable**
   - This file is intentionally small so people can see, at a glance, which properties CDIF Discovery expects and how they map to the SHACL shapes.
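
The effect of the root-only filter in point 2 can be illustrated with a tiny in-memory triple list. This is a sketch under assumptions: the node names (`ex:A`, `ex:B`, `ex:record`) and both helper functions are hypothetical, and the SPARQL in the comments paraphrases the target rather than quoting the shapes file.

```javascript
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const DATASET = "https://schema.org/Dataset";

// Hypothetical graph: two datasets, one of which (ex:B) is referenced by
// another node via schema:about.
const triples = [
  { s: "ex:A", p: RDF_TYPE, o: DATASET },
  { s: "ex:B", p: RDF_TYPE, o: DATASET },
  { s: "ex:record", p: "https://schema.org/about", o: "ex:B" },
];

// Roughly: SELECT ?this WHERE { ?this a schema:Dataset . }
function selectDatasets(graph) {
  return graph
    .filter((t) => t.p === RDF_TYPE && t.o === DATASET)
    .map((t) => t.s);
}

// The same selection plus FILTER NOT EXISTS { ?s ?p ?this . }:
// drop every dataset that appears as the object of some triple.
function selectRootDatasets(graph) {
  return selectDatasets(graph).filter(
    (node) => !graph.some((t) => t.o === node)
  );
}

console.log(selectRootDatasets(triples)); // only ex:A, since ex:B is referenced
console.log(selectDatasets(triples));     // both ex:A and ex:B
```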

If your shapes and examples follow the same patterns as in `cdif_example.jsonld`, the CDI Previewer (and other SHACL engines) will be able to classify fields reliably as CDIF-defined instead of EXTRA.

### Note on small example fixes

While reviewing Steve's FeXAS example (`FeXAS_Fe_c3d.001-NEXUS-HDF5-cdi-CDIF.jsonld`), we also fixed a minor typo where one nested variable had `schame:alternateName` instead of `schema:alternateName`. All `schema:alternateName` occurrences now use the proper `schema` prefix, matching the `"schema": "https://schema.org/"` context described above.
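
Typos of this kind are easy to catch mechanically. The sketch below is one possible check, not part of the previewer: `findUndeclaredPrefixes` is a hypothetical helper that assumes an object-valued `@context` and reports every prefix used in a compact key that the context does not declare. The sample document is invented.

```javascript
// Collect prefixes used in compact keys (e.g. "schame" in
// "schame:alternateName") that are not declared in @context.
// JSON-LD keywords ("@id", "@type", ...) and absolute IRIs are skipped.
function findUndeclaredPrefixes(doc) {
  const declared = new Set(Object.keys(doc["@context"] || {}));
  const unknown = new Set();
  const visit = (value) => {
    if (Array.isArray(value)) return value.forEach(visit);
    if (value === null || typeof value !== "object") return;
    for (const [key, child] of Object.entries(value)) {
      const colon = key.indexOf(":");
      const isCompact =
        colon > 0 && !key.startsWith("@") && !key.startsWith("//", colon + 1);
      if (isCompact && !declared.has(key.slice(0, colon))) {
        unknown.add(key.slice(0, colon));
      }
      visit(child);
    }
  };
  visit(doc["@graph"] || doc);
  return [...unknown];
}

// Invented sample reproducing the typo described above.
const sampleDoc = {
  "@context": { schema: "https://schema.org/" },
  "@graph": [
    {
      "@type": "schema:Dataset",
      "schema:variableMeasured": [
        { "schema:name": "energy", "schame:alternateName": "E" },
      ],
    },
  ],
};

console.log(findUndeclaredPrefixes(sampleDoc)); // reports "schame"
```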

We have adjusted the CDIF Discovery shapes and previewer so that Steve's dataset *types* are recognized correctly via SPARQL targets and HTTPS schema.org IRIs. However, some of Steve's dataset properties still show up as EXTRA rather than SHACL-defined. The intention of this example and the shapes is clear, but follow-up work is still needed to align the CDIF shapes, the previewer classification logic, and Steve's richer CDI/XAS patterns.

previewers/betatest/CdiPreview.html

Lines changed: 3 additions & 1 deletion
@@ -154,10 +154,12 @@ <h4 class="modal-title">
 <script src="js/cdi-preview/cdi-shacl-sparql.js"></script>
 <script src="js/cdi-preview/core.js"></script>
 <script src="js/cdi-preview/cdi-json-ld-helpers.js"></script>
+<!-- SHACL UI helpers (classifyProperty, enums) used by render & suggestions -->
+<script src="js/cdi-preview/cdi-shacl-helpers.js"></script>
+<script src="js/cdi-preview/validation.js"></script>
 <script src="js/cdi-preview/render.js"></script>
 <script src="js/cdi-preview/property-suggestions.js"></script>
 <script src="js/cdi-preview/data-extraction.js"></script>
-<script src="js/cdi-preview/validation.js"></script>
 <script src="js/cdi-preview/event-handlers.js"></script>
 <script src="js/cdi-preview/cdi-graph-helpers.js"></script>
 </body>