validation errors on drafts when running mlcroissant validate --jsonld #30

@pdurbin

When I run mlcroissant validate --jsonld src/test/resources/draft/out/croissant.json I get the following error:

E1120 12:11:09.643320 8594579776 validate.py:55] Found the following 1 error(s) during the validation:
  -  [Metadata(Draft Dataset)] ValueError({'Dates or DateTimes should follow the [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601). Got '})
Found the following 2 warning(s) during the validation:
  -  [Metadata(Draft Dataset)] Property "https://schema.org/datePublished" is recommended, but does not exist.
  -  [Metadata(Draft Dataset)] WarningException("Version doesn't follow MAJOR.MINOR.PATCH: DRAFT. For more information refer to: https://semver.org/spec/v2.0.0.html")
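The output above points at two things in the draft JSON-LD: a date field that is an empty string, and a version of DRAFT. Here is a minimal sketch of a pre-check that flags both before the file reaches the validator. It uses only the Python standard library and is not how mlcroissant validates internally. The helper name `precheck`, the list of date keys, and the sample `draft` dictionary are all assumptions for illustration, since the log does not say which date field is empty.

```python
import re
from datetime import datetime

# Hypothetical helper, NOT part of mlcroissant. The date keys below are an
# assumption: the log does not say which date field held the empty string.
DATE_KEYS = ("dateCreated", "dateModified", "datePublished")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # plain MAJOR.MINOR.PATCH only


def precheck(metadata: dict) -> list[str]:
    """Return human-readable problems found in top-level metadata."""
    problems = []
    for key in DATE_KEYS:
        if key not in metadata:
            continue  # a missing key is a separate concern from an invalid one
        value = metadata[key]
        try:
            # fromisoformat accepts common ISO 8601 forms (a subset of the
            # standard on Python < 3.11); an empty string raises ValueError.
            datetime.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"{key}: not an ISO 8601 date: {value!r}")
    version = metadata.get("version")
    if version is not None and not SEMVER.match(str(version)):
        problems.append(f"version: not MAJOR.MINOR.PATCH: {version!r}")
    return problems


# Made-up draft metadata that mimics the two symptoms in the log.
draft = {"name": "Draft Dataset", "dateModified": "", "version": "DRAFT"}
problems = precheck(draft)
for problem in problems:
    print(problem)
```

If a check like this reports an empty date for drafts, one option is to omit the key entirely until the dataset is published, though whether that satisfies the validator would need to be confirmed by rerunning it.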

This happens with both mlcroissant==1.0.17 and 1.0.22, which I'm playing with in this PR:

I'm sure this was introduced when I added support for drafts in this PR:

I guess I didn't validate the JSON? Not sure.

Here's how the error looks at https://huggingface.co/spaces/JoaquinVanschoren/croissant-checker

[Image: screenshot of the error in croissant-checker]
