Add modeling of Protocol test details to manifest by kasei · Pull Request #312 · w3c/rdf-tests

kasei · 2026-03-27T01:12:30Z

This adds modeling for all the existing Protocol tests based heavily on Gregg's work in #79 (which I think has diverged enough from the current manifest as a result of #306 that it's not worth trying to reconcile).

I am reasonably confident that this new data is a faithful encoding as they were produced automatically by parsing the HTTP requests in the previous rdfs:comment strings and modeling the resulting data. I used some regex matching to figure out the input for the expected results modeling.

There are a few differences from Gregg's original model:

I do not provide an extra layer of modeling for headers (for example, "application/sparql-query; charset=UTF-16" is not also broken down into the header name and parameter elements)
I do not use ht:statusCodeValue with literal values like "4XX" (more on this below), instead using values such as ht:StatusCode2xx with a new mf:expectedStatus property
For expected results, I have moved away from the tests enumerating the acceptable media types (which seems fraught and likely to lag new, valid formats) and instead simply declare the type of results expected (boolean, tabular, or RDF; more on this below)

I introduce four new manifest terms to model the expected results:

mf:expectedStatus - Expected HTTP status code (pointing to values like ht:StatusCode2xx); this differs from Gregg's model which used _:response ht:statusCodeValue "2XX" which did not conform to the semantics of ht:statusCodeValue, and was my biggest issue with Fix protocol manifest #79
mf:expectedBoolean - Expected results for ASK queries.
mf:expectedFormat - Expected serialization format of the results; the range here is one of the literals: "boolean", "tabular", or "RDF" (I'm open to changing this to a controlled set of IRIs if desired and with suggestions on where such IRIs might live)
mf:expectation - A textual description of the expected results for the single test update_base_uri where modeling the expectation would have been prohibitive. (Changing the test to check the expectation as a FILTER in the query and then using ASK might be a better approach here.)
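
As a rough illustration of how a test runner might evaluate two of these terms, the sketch below checks an HTTP status code against a class-style value such as ht:StatusCode2xx and classifies a response against the three mf:expectedFormat literals. The function names, the regular expression, and the media-type sets are assumptions made for illustration; they are not taken from the manifest or from the linked test runner, and the media-type sets are deliberately incomplete.

```python
import json
import re

# Assumption: expected-status values have local names shaped like "StatusCode2xx".
_STATUS_CLASS = re.compile(r"StatusCode([1-5])xx$")


def status_matches(expected: str, actual: int) -> bool:
    """Return True if `actual` falls in the status class named by `expected`."""
    match = _STATUS_CLASS.search(expected)
    if match is None:
        raise ValueError(f"unrecognized status class: {expected!r}")
    return actual // 100 == int(match.group(1))


# Assumption: a small, illustrative set of RDF media types.
RDF_TYPES = {
    "text/turtle",
    "application/n-triples",
    "application/rdf+xml",
    "application/ld+json",
}


def result_kind(content_type: str, body: str) -> str:
    """Classify a response as "boolean", "tabular", or "RDF"."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in RDF_TYPES:
        return "RDF"
    if media_type == "application/sparql-results+json":
        return "boolean" if "boolean" in json.loads(body) else "tabular"
    if media_type == "application/sparql-results+xml":
        # Crude textual check, good enough for a sketch only.
        return "boolean" if "<boolean>" in body else "tabular"
    if media_type in ("text/csv", "text/tab-separated-values"):
        return "tabular"
    raise ValueError(f"unrecognized media type: {media_type!r}")
```

Classifying by result kind rather than by an enumerated list of media types means a runner only needs to extend its own lookup tables when a new format appears; the manifest itself stays unchanged.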

I also (ab)use the SPARQL Update ut:graphData modeling to indicate the named graphs that should be loaded into the Dataset before the test is run:

ut:graphData [ ut:graph <data1.nt> ; rdfs:label "http://kasei.us/2009/09/sparql/data/data1.rdf" ] ;
ut:graphData [ ut:graph <data2.nt> ; rdfs:label "http://kasei.us/2009/09/sparql/data/data2.rdf" ] ;

For now those ut:graphData properties are hanging off of the test itself, which feels a bit strange, but it didn't feel any better to hang them off of the mf:action which in this manifest is a ht:Request. This is another area where I'm open to suggestions.

Finally, I made a few substantive (but I expect uncontroversial) changes to a few tests. For the following tests, I removed the requirement to return results in SPARQL XML format (removing the Accept header in the request, and the expected Content-Type of the result)

update_dataset_default_graphs
update_base_uri
update_dataset_default_graph
update_dataset_named_graphs
update_dataset_full

This avoids the use of form-url encoding and makes the query/update strings easier to read.

kasei · 2026-03-27T01:18:27Z

I wrote a ~200 line perl test runner based on this new manifest data. The dependency chain is very large, so may not be super useful for people without a normal perl setup. YMMV.

kasei · 2026-03-27T01:20:58Z

I've also left the old text-based rdfs:comment values that describe the request-response expectations to make it easier to review the new modeling. In the future we might want to remove those.

@afs – I tried using this code against Fuseki and got 3 failures. One is expected (update_base_uri for which the modeling doesn't provide details on how to evaluate whether it passed), but I couldn't immediately figure out whether the other two (bad_update_dataset_conflict and query_multiple_dataset) were an issue with my setup/use or something deeper.

Tpt

Thank you for this work! I have not yet reviewed the content carefully. Just found a detail.

afs · 2026-03-29T11:22:51Z

@afs – I tried using this code against Fuseki and got 3 failures. One is expected (update_base_uri for which the modeling doesn't provide details on how to evaluate whether it passed), but I couldn't immediately figure out whether the other two (bad_update_dataset_conflict and query_multiple_dataset) were an issue with my setup/use or something deeper.

What errors do you get?
Do you have the Fuseki log? (If run with "-v", all HTTP headers are printed as well.) And what does the Fuseki config look like?
Is /sparql/ the whole path (the dataset name)? In that case all operations are on that URL.

Did you start the server with fuseki --mem /sparql?

Trying to work out what is actually sent, so I may have misunderstood ...

Looking at query_multiple_dataset --

ht:absolutePath - does that mean it is sent to a URL with ?named-graph-uri= and also content header of application/sparql-query (+body?)

It's two operations superimposed. I think that it will dispatch to query (that happens to come before GSP - a legal choice would also be GSP and then conneg error on the content type).

Query checks the query string and ?named-graph-uri= is illegal. Do you see Malformed request: unrecognized parameters: default-graph-uri= error response?

ASK FROM <http://kasei.us/2009/09/sparql/data/data3.rdf> { GRAPH ?g1 { <http://kasei.us/2009/09/sparql/data/data1.rdf> a ?type } GRAPH ?g2 { <http://kasei.us/2009/09/sparql/data/data2.rdf> a ?type } }

FWIW the ASK is false - a dataset description is complete - the FROM sets the default graph there are no named graphs unless FROM NAMED is used as well.

bad_update_dataset_conflict

The request is ?using-named-graph-uri= but then there is a custom dataset and it does not have a named graph http://example/addresses so WITH may be the problem.

https://www.w3.org/TR/sparql12-protocol/#update-dataset says it's an error. 400.

kasei · 2026-03-29T22:47:21Z

@afs –

I get these errors:

#query_multiple_dataset
#update_base_uri
#bad_update_dataset_conflict

I used fuseki quickstart, creating a "test" dataset, and using http://localhost:3030/test as the endpoint. (I think this is an acceptable way to use one endpoint for both query and update? If not, I'd have to adjust my testing code to differentiate query and update endpoints).

As mentioned, I think #update_base_uri is expected failure.

For #query_multiple_dataset:

ht:absolutePath - does that mean it is sent to a URL with ?named-graph-uri= and also content header of application/sparql-query (+body?)

Yes. Dataset specified in the HTTP query parameters, content-type specifying SPARQL Query, and the request body with the ASK query.
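
To make the shape of such a request concrete, here is a hedged sketch that builds (but does not send) a direct-POST query request using Python's standard library. The helper name build_query_request is hypothetical, the endpoint is the local Fuseki URL mentioned in this thread, and the ASK query is a short placeholder rather than the query used by the test.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(endpoint: str, query: str, named_graphs: list[str]) -> Request:
    """Build a POST request: dataset in the URL, query in the body."""
    # Each named graph becomes its own named-graph-uri parameter;
    # urlencode percent-encodes the values but keeps "&" and "=" as separators.
    params = urlencode([("named-graph-uri", g) for g in named_graphs])
    return Request(
        f"{endpoint}?{params}",
        data=query.encode("utf-8"),
        headers={"Content-Type": "application/sparql-query"},
        method="POST",
    )


req = build_query_request(
    "http://localhost:3030/test",
    "ASK { GRAPH ?g { ?s ?p ?o } }",  # placeholder query
    [
        "http://kasei.us/2009/09/sparql/data/data1.rdf",
        "http://kasei.us/2009/09/sparql/data/data2.rdf",
    ],
)
```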

It's two operations superimposed. I think that it will dispatch to query (that happens to come before GSP - a legal choice would also be GSP and then conneg error on the content type).

I don't understand "two operations superimposed". I don't think there's anything GSP-related going on here.

Query checks the query string and ?named-graph-uri= is illegal. Do you see Malformed request: unrecognized parameters: default-graph-uri= error response?

No error. It returns a 200 with SRX encoding of false.

FWIW the ASK is false - a dataset description is complete - the FROM sets the default graph there are no named graphs unless FROM NAMED is used as well.

I don't think that's true. The FROM should be overridden by the dataset specified by the protocol (via HTTP query parameters). That's the entire point of this test.

For #bad_update_dataset_conflict:

This one should be an error because of the mixed use of USING in the update and using-named-graph-uri in the protocol.

The test as written (and approved by the previous WG) expects a 4xx error. I think that's the correct thing here, but the spec text doesn't actually specify this (saying only that this is "an error"). Fuseki returns a 500, so disagrees with the test, but not technically with the spec text.

afs · 2026-03-30T11:58:26Z

I used fuseki quickstart, creating a "test" dataset, and using http://localhost:3030/test as the endpoint. (I think this is an acceptable way to use one endpoint for both query and update? If not, I'd have to adjust my testing code to differentiate query and update endpoints).

Yes, there would be both query and update if started with --update or an empty memory model.

fuseki-server --mem /test
fuseki-server --file=DATA --update /test and variants

How are you setting up the data?

If started with --file, it is read-only by default.

The test says: ht:absolutePath "/sparql/..." - where does that fit in? /test/sparql will also exist but is query only (there is also /test/update, but /test has both).

(FWIW fuseki-server --mem / should work to give a no-path dataset, then /sparql is query only by default.)

If you could send me the log file (printed to stdout) I can see the setup details and data loading, ideally, with --verbose which prints detailed setup and detailed requests.

This is Fuseki 6.0.0?

I didn't look at update_base_uri; I read "One is expected (update_base_uri..." as saying it was test-correct.

kasei · 2026-03-30T17:00:37Z

How are you setting up the data?

A combination of DROP ALL and INSERT DATA for the ut:graphData entries (the actual code is in the linked test runner above).
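
A minimal sketch of that setup step, for readers without a Perl environment: it assembles a single SPARQL Update string from a mapping of graph names to N-Triples text. The function name and the mapping are hypothetical, the graph name is assumed to come from the rdfs:label of each ut:graphData entry, and the data is assumed to be valid inside INSERT DATA as written. This is not the code from the linked test runner.

```python
def build_setup_update(graphs: dict[str, str]) -> str:
    """Return an update that empties the store and loads each named graph.

    `graphs` maps a graph IRI to the N-Triples text to load into it.
    """
    operations = ["DROP ALL"]
    for iri, ntriples in graphs.items():
        operations.append(
            f"INSERT DATA {{ GRAPH <{iri}> {{\n{ntriples.strip()}\n}} }}"
        )
    # SPARQL Update operations in one request are separated by ";".
    return " ;\n".join(operations)


update = build_setup_update(
    {"http://example/g": '<http://example/s> <http://example/p> "o" .'}
)
```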

The test says: ht:absolutePath "/sparql/..." - where does that fit in? /test/sparql will also exist but is query only (there is also /test/update, but /test has both).

My test runner is replacing /sparql/ with the path from the endpoint URL supplied to the test runner. It feels a bit strange, but I think that's probably required if we continue to use ht:absolutePath. I don't think there's an alternative property for non-absolute path specification (I could be wrong about this).
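
The substitution described above could look roughly like the following. This is a sketch under assumptions: the function name is hypothetical, the manifest paths are assumed to start with the literal prefix "/sparql/", and whatever follows that prefix (typically a query string) is assumed to be appended unchanged to the endpoint's own path.

```python
from urllib.parse import urlsplit


def rewrite_path(absolute_path: str, endpoint_url: str, prefix: str = "/sparql/") -> str:
    """Swap the manifest's placeholder prefix for the endpoint's real path."""
    if not absolute_path.startswith(prefix):
        raise ValueError(f"path does not start with {prefix!r}: {absolute_path!r}")
    remainder = absolute_path[len(prefix):]  # e.g. "?named-graph-uri=..." or ""
    endpoint_path = urlsplit(endpoint_url).path  # e.g. "/test"
    return endpoint_path + remainder
```

With the endpoint http://localhost:3030/test, a manifest path of "/sparql/?named-graph-uri=x" would be sent as "/test?named-graph-uri=x".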

This is Fuseki 6.0.0?

Yes.

I didn't look at update_base_uri; I read "One is expected (update_base_uri..." as saying it was test-correct.

I manually validated the test. Fuseki works as expected. The test runner just doesn't know that because the validation logic isn't encoded in the manifest. I'd like to follow up this PR with a change to this test to make it simpler for test runners.

I am pushing a small fix for #update_base_uri so that the setup code uses CLEAR SILENT GRAPH instead of CLEAR GRAPH. This was causing problems with Fuseki as the graph didn't already exist.

afs · 2026-04-04T15:37:41Z

(I have a log file from @kasei)

I extracted the requests from the log and then used curl to send the requests.

== bad_update_dataset_conflict

Fuseki bug fixed - I now get 400, and not 500.

== query_multiple_dataset

The query request is

http://localhost:3030/test?named-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata1.rdf%26named-graph-uri%3Dhttp%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata2.rdf

The ? and the first = are unencoded, while the & (%26) and the second = (%3D) are URL-encoded.

As a result, there is one very long graph name.

If I do not encode the & and the second =, the ASK query returns true.
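
The effect of the over-encoding can be reproduced with Python's standard library. In this sketch, the URL is the one quoted above, split across string literals for readability; the variable names are illustrative only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

G1 = "http://kasei.us/2009/09/sparql/data/data1.rdf"
G2 = "http://kasei.us/2009/09/sparql/data/data2.rdf"

# The request as logged: the separator "&" and the second "=" are percent-encoded.
bad_url = (
    "http://localhost:3030/test?named-graph-uri="
    "http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata1.rdf"
    "%26named-graph-uri%3D"
    "http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata2.rdf"
)

# A server-side parser sees a single parameter whose value contains both IRIs.
bad = parse_qs(urlsplit(bad_url).query)["named-graph-uri"]

# Encoding each value separately keeps "&" and "=" as real separators.
good_query = urlencode([("named-graph-uri", G1), ("named-graph-uri", G2)])
good = parse_qs(good_query)["named-graph-uri"]

print(len(bad), len(good))
```

The general rule is to percent-encode each parameter value on its own and only then join the name=value pairs with "&", rather than encoding the assembled query string as a whole.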

== update_base_uri

This test has a security problem.

The execution may be on a machine behind a firewall.

Depending on proxy/gateway protocol, the URL at point of execution has the local machine name/IP address, not that of the public side of the gateway. (A second problem is that update and query might be separate URLs.)

Fuseki uses a fixed, dummy base name http://server/unset-base to parse queries and updates so that the local host machine details are not visible.

I get a result set with:

?o="http://server/unset-base/test"

If I hack the code to use the servlet URL, it is http://localhost:3030/test.

kasei · 2026-04-04T16:31:39Z

The ? and the first = are unencoded, while the & (%26) and the second = (%3D) are URL-encoded.

Good catch. Fixed (along with the addition of an rdfs:comment on the manifest itself, describing some of the expectations and issues of running the tests).

== update_base_uri

This test has a security problem.

The execution may be on a machine behind a firewall.

Depending on proxy/gateway protocol, the URL at point of execution has the local machine name/IP address, not that of the public side of the gateway. (A second problem is that update and query might be separate URLs.)

I'm not sure I see the security issue here. The test is not trying to validate the specific IRI that is resolved, but only that the relative IRI is resolved to some absolute IRI. So I think Fuseki is already passing the test as written (even if the expectations of the test are only provided in prose).
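
Read that way, the prose expectation reduces to a check that the returned value has a scheme, whatever the base turned out to be. A minimal sketch, with a hypothetical function name and a deliberately loose notion of "absolute" (any non-empty scheme):

```python
from urllib.parse import urlsplit


def looks_absolute(iri: str) -> bool:
    """Return True if `iri` carries a scheme, i.e. is not a relative reference."""
    return bool(urlsplit(iri).scheme)
```

Under this check, a value resolved against a dummy base such as http://server/unset-base would pass just as a value resolved against the real endpoint URL would.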

That being said, the more I think about this test, the more I'm of the opinion that it is not really a protocol test. This requirement surely applies to the Protocol and any other means of submitting an update to the service (e.g. API). I'd be happy to just remove this test entirely. Thoughts?

afs · 2026-04-04T20:10:34Z

only that the relative IRI is resolved to some absolute IRI

True - I see it as encouraging/highlighting bad behavior.

(A system that rejected relative URIs wouldn't be such a bad thing.)

Thoughts?

Personally - remove the test.

afs · 2026-04-07T21:00:31Z

I suggest we merge this - it may get wider review that way.

kasei · 2026-04-07T21:18:06Z

@Tpt – any other comments before I merge?

Tpt

+1 to @afs: let's merge this and iterate if needed.

kasei added 6 commits March 26, 2026 09:51

Whitespace normalization.

b6132af

Update protocol tests to use direct POST encoding where possible.

ba4ab54

This avoids the use of form-url encoding and makes the query/update strings easier to read.

More whitespace fixes.

2942604

Semantic changes to a few tests.

9d19675

Add mf:action modeling for protocol tests.

c6737aa

Add modeling of the expected available dataset.

f9d7a9e

kasei assigned afs Mar 27, 2026

Tpt reviewed Mar 27, 2026

Comment thread sparql/sparql11/protocol/manifest.ttl Outdated

Fix typo of ht:Connection types.

735c3c1

Fix Protocol test #update_base_uri to properly clear graph.

8b93655

Use SILENT form of CLEAR GRAPH to ensure graph is empty before the test runs, regardless of whether the graph already exists.

kasei added 2 commits April 4, 2026 09:17

Fix URL encoding of path for #query_multiple_dataset.

75a44e1

Add rdfs:comment with some notes on the manifest and running the tests.

1fe85fe

afs removed their assignment Apr 7, 2026

afs approved these changes Apr 7, 2026

Tpt approved these changes Apr 9, 2026

kasei merged commit c7817c2 into main Apr 9, 2026
2 checks passed

This was referenced Apr 13, 2026

Fix incorrect Content-Length in test case text description #322

Open

Use modelled HTTP test specifications ad-freiburg/sparql-conformance#60

Open

kasei mentioned this pull request Apr 30, 2026

Fix protocol manifest #79

Closed
