Support nested em rdf export #2558

Open
whomingbird wants to merge 6 commits into seek-1.18 from support-nested-em-rdf-export

@whomingbird
Contributor

Copilot AI left a comment

Pull request overview

Adds proper RDF serialization for nested ExtendedMetadata (LinkedExtendedMetadata / LinkedExtendedMetadataMulti) by emitting blank nodes and recursively exporting nested attributes, addressing seek/issues/2557 and improving the queryability of exported RDF.

Changes:

  • Refactors extended-metadata RDF export to recursively emit nested attributes as blank nodes (instead of stringifying nested hashes).
  • Adds factories and unit tests covering nested EMT export (single + multi), PID-skipping behavior, and scalar datatype literal emission.
  • Updates an existing Study RDF test to assert that nil EMT values do not produce RDF triples.
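The recursive blank-node approach described above can be sketched in plain Ruby. This is a minimal illustration, not the code from the PR: `Attribute`, `TripleSink`, the `children` field and the example PIDs are hypothetical stand-ins, and triples are modeled as plain arrays rather than `RDF::Graph` statements.

```ruby
# Hypothetical stand-in for an extended metadata attribute.
# `children` holds the nested attributes of a linked (nested) type, or nil for scalars.
Attribute = Struct.new(:accessor_name, :pid, :children, keyword_init: true)

# Hypothetical collector: triples are [subject, predicate, object] arrays.
class TripleSink
  attr_reader :triples

  def initialize
    @triples = []
    @counter = 0
  end

  # Mint a fresh blank node label.
  def new_blank_node
    @counter += 1
    "_:b#{@counter}"
  end

  # Emit one triple per attribute that has a PID and a non-nil value.
  # Nested values become blank nodes and are exported recursively.
  def emit(attributes, data, subject)
    attributes.each do |attribute|
      next if attribute.pid.nil? || attribute.pid.empty? # no PID, no triple

      value = data[attribute.accessor_name]
      next if value.nil?

      if attribute.children
        # A "multi" attribute stores an array of hashes; a single one stores one hash.
        items = value.is_a?(Array) ? value : [value]
        items.each do |item|
          node = new_blank_node
          @triples << [subject, attribute.pid, node]
          emit(attribute.children, item, node)
        end
      else
        @triples << [subject, attribute.pid, value]
      end
    end
  end
end

ADDRESS_ATTRIBUTES = [
  Attribute.new(accessor_name: 'city', pid: 'http://schema.org/addressLocality'),
  Attribute.new(accessor_name: 'notes', pid: nil) # skipped: no PID
].freeze

STUDY_ATTRIBUTES = [
  Attribute.new(accessor_name: 'title', pid: 'http://purl.org/dc/terms/title'),
  Attribute.new(accessor_name: 'address', pid: 'http://schema.org/address',
                children: ADDRESS_ATTRIBUTES)
].freeze

sink = TripleSink.new
sink.emit(STUDY_ATTRIBUTES,
          { 'title' => 'Demo', 'address' => { 'city' => 'Manchester', 'notes' => 'x' } },
          '<study/1>')
sink.triples.each { |triple| puts triple.inspect }
```

Because the nested hash is attached through a blank node instead of being stringified, each nested value stays addressable by its own predicate in a query.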

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| lib/seek/rdf/rdf_generation.rb | Implements recursive EMT RDF emission with blank nodes and typed literals for scalar base types. |
| test/unit/rdf_generation_test.rb | Adds unit tests for nested EMT RDF export and scalar XSD datatype literal behavior. |
| test/factories/extended_metadata_types.rb | Adds EMT/EMA factories to construct nested EMT shapes for RDF tests. |
| test/factories/sample_attribute_types.rb | Adds a date SampleAttributeType factory to support date typing tests. |
| test/unit/study_test.rb | Updates expectations so nil EMT values do not emit RDF triples. |
| db/seeds/extended_metadata_drafts/study_nested_emt_rdf_example.seeds.rb | Adds a seed example showcasing nested EMT RDF export structures. |

next unless attribute.pid.present?

value = data[attribute.accessor_name]
next if value.nil?
Copilot AI Apr 30, 2026

emit_emt_attributes only skips nil values. For scalar types this can lead to incorrect RDF when the stored value is blank (e.g. optional integer/float/boolean fields often persist as ""), because typed_rdf_literal coerces "" to 0/0.0 or produces an invalid xsd:boolean literal. Consider skipping values that the attribute considers blank (e.g. attribute.test_blank?(value) / value.blank?) before calling typed_rdf_literal.

Suggested change
- next if value.nil?
+ next if value.nil?
+ next if attribute.respond_to?(:test_blank?) ? attribute.test_blank?(value) : value.blank?
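The coercion problem behind this suggestion can be reproduced with plain Ruby. The sketch below is an illustration only: `blank_value?` and `integer_lexical` are hypothetical helpers, not SEEK or ActiveSupport methods, and the guard stands in for whatever blank test the attribute provides.

```ruby
# Ruby coerces an empty string to zero, so an optional field persisted as ""
# would be exported as 0 / 0.0 if it reached a numeric conversion.
puts ''.to_i # prints 0
puts ''.to_f # prints 0.0

# Hypothetical guard: nil, empty and whitespace-only strings, and empty
# collections count as blank. `false` is deliberately NOT blank here,
# so a boolean false would still be exported.
def blank_value?(value)
  return true if value.nil?
  return value.strip.empty? if value.is_a?(String)

  value.respond_to?(:empty?) && value.empty?
end

# Hypothetical helper: returns the integer lexical form, or nil when the
# value is blank and no triple should be emitted.
def integer_lexical(value)
  return nil if blank_value?(value)

  value.to_i.to_s
end

puts integer_lexical('42').inspect
puts integer_lexical('').inspect
```

One caveat on the fallback branch of the suggestion: ActiveSupport treats `false` as blank, so a bare `value.blank?` would also skip boolean attributes whose value is `false`.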

puts 'Created study_rdf_example EMT'
end
end
# rubocop:enable Metrics/BlockLength
Copilot AI Apr 30, 2026

This seeds file ends with # rubocop:enable Metrics/BlockLength but there is no corresponding # rubocop:disable Metrics/BlockLength at the top. As written, the block-length cop will still run for this long file and the lone enable directive is likely unintended. Either add the matching rubocop:disable header (and keep the enable), or remove the enable line entirely.

Suggested change
- # rubocop:enable Metrics/BlockLength
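For the other option mentioned in the comment, keeping the enable line and adding the matching disable, the pairing would look roughly like this. The constant and block contents are hypothetical placeholders, not the contents of the seeds file.

```ruby
# rubocop:disable Metrics/BlockLength
# Hypothetical stand-in for a long seed block; the real file builds an EMT here.
SEED_ATTRIBUTE_TITLES = %w[title description address].map do |name|
  name.capitalize
end
# rubocop:enable Metrics/BlockLength

puts SEED_ATTRIBUTE_TITLES.inspect
```

The disable directive takes effect from its line onward and the enable directive restores the cop, so the two comments need to bracket the long block.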

Comment on lines +206 to +212
def typed_rdf_literal(attribute, value)
case attribute.sample_attribute_type&.base_type
when Seek::Samples::BaseType::DATE
RDF::Literal(value.to_s, datatype: RDF::XSD.date)
when Seek::Samples::BaseType::DATE_TIME
RDF::Literal(value.to_s, datatype: RDF::XSD.dateTime)
when Seek::Samples::BaseType::INTEGER
Copilot AI Apr 30, 2026

typed_rdf_literal emits xsd:date/xsd:dateTime literals using value.to_s. However the date/date-time validators accept many non-XSD lexical formats (e.g. "2 Feb 2015" or "Thu, 11 Feb 2016 15:39:55 +0000"), which would become invalid RDF typed literals when exported. To ensure valid RDF, consider normalizing these values before emitting (e.g. parse then iso8601 for dateTime and Date.parse(...).iso8601 for date).
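A minimal sketch of the normalization suggested here, using only Ruby's standard `date` library. The helper names `normalize_xsd_date` and `normalize_xsd_date_time` are hypothetical and not part of the PR; how unparseable values should be handled (skipped, or emitted as plain literals) is left open.

```ruby
require 'date'

# Hypothetical helper: parse a loosely formatted date and return its
# ISO 8601 form (YYYY-MM-DD), or nil if it cannot be parsed.
def normalize_xsd_date(value)
  Date.parse(value.to_s).iso8601
rescue ArgumentError # Date::Error is a subclass of ArgumentError
  nil
end

# Hypothetical helper: same idea for date-times, returning
# YYYY-MM-DDThh:mm:ss+hh:mm.
def normalize_xsd_date_time(value)
  DateTime.parse(value.to_s).iso8601
rescue ArgumentError
  nil
end

puts normalize_xsd_date('2 Feb 2015')
# prints 2015-02-02
puts normalize_xsd_date_time('Thu, 11 Feb 2016 15:39:55 +0000')
# prints 2016-02-11T15:39:55+00:00
```

The normalized strings would then be passed to the literal constructor in place of `value.to_s`, so the exported lexical form matches the declared datatype.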

stuzart added this to the 1.18.0 milestone May 11, 2026
stuzart moved this to In review in SEEK 1.18.x May 11, 2026
stuzart changed the base branch from main to seek-1.18 May 13, 2026 13:59
3 participants