Conversation
Summary -- This is a first step toward adding mdtests for Ruff. I actually wrote the code in the opposite order, first copy-pasting `ty_test` to a `ruff_test` crate, and then factoring out the shared code, but I figured it would be easier to review in this order. I'll open a stacked PR with the `ruff_test` changes shortly after this one to show that the API works well for that too. The main change here is moving several of the modules from `ty_test` to a new `mdtest` crate: - assertion.rs - diagnostic.rs - matcher.rs - parser.rs These files required few changes, with a couple of exceptions noted below. Unfortunately, this also required moving the `config` and `db` modules to support the `Matcher` tests. Ideally these would live in `ty_test` instead since `ruff_test` uses a slightly different `Db` and configuration schema, but again they are used by the current `Matcher` tests, and that seemed like a bigger refactor that we could defer to later. Beyond moving these files to the new crate, I made `Matcher` functions take a `&dyn Db` to support passing a different concrete type from `ruff_test`, and I also made the parser generic over an `MdtestConfig` trait to allow Ruff to use a separate config struct. The lib.rs file from `ty_test` was essentially split in half, with the shared code moved to the `mdtest` crate and the ty-specific parts kept in `ty_test`. Test Plan -- All existing mdtests and the unit tests from `ty_test` should still pass, and the stacked branch with the `ruff_test` crate tests the split API
Typing conformance resultsNo changes detected ✅Current numbersThe percentage of diagnostics emitted that were expected errors held steady at 87.94%. The percentage of expected errors that received a diagnostic held steady at 83.36%. The number of fully passing files held steady at 79/133. |
Memory usage reportMemory usage unchanged ✅ |
|
|
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
MichaReiser
left a comment
There was a problem hiding this comment.
Nice. I think we should try to reduce the mdtest's dependencies and make it less dependent on ty_python_semantic.
It's also unclear to me whether the Db concept and abstractions make sense, given that ruff doesn't use a db at all.
| #[derive(Deserialize, Debug, Default, Clone)] | ||
| #[serde(rename_all = "kebab-case", deny_unknown_fields)] | ||
| pub(crate) struct MarkdownTestConfig { | ||
| pub struct MarkdownTestConfig { |
There was a problem hiding this comment.
I don't think it makes sense to have a shared markdown configuration. I consider the supported configuration options to be very specific to a specific test runner. E.g. the analysis options and most environment options are specific to ty and not relevant for ruff. That's why I think that the concrete option type should live in ty_test, but there might be shared options, like the external dependencies that lives in the shared mdtest crate.
There was a problem hiding this comment.
We can probably chat more about this in our 1:1, but including this config type and the Db and thus the ty_* dependencies were mostly motivated by the tests in the matcher and parser. The matcher tests need a concrete Db type, and the parser tests need a concrete configuration type, so the shortest path to that seemed like just continuing to share them with ty_test, at least for now. The Ruff version has its own configuration and Db types, which would be preferable for ty_test too, as you say.
I'll think more about how to separate these items from the tests.
| ty_module_resolver = { workspace = true } | ||
| ty_python_semantic = { workspace = true, features = ["serde", "testing"] } | ||
| ty_vendored = { workspace = true } | ||
| ty_python_core = { workspace = true } |
There was a problem hiding this comment.
I think we want to make the generic mdtest crate independent of most ty crates. Not only does this help with faster compile times (compiling ty_python_semantic takes forever), but it also makes it more reusable without introducing cyclic dependencies.
| #[salsa::db] | ||
| #[derive(Clone)] | ||
| pub(crate) struct Db { | ||
| pub struct Db { |
There was a problem hiding this comment.
Instead of defining the db struct here, I think it's better to define a Db trait with the accessors you need. Downstream crates can then define their own Db struct.
I haven't looked at Ruff's implementation but does it even use the Db struct?
| /// Run `path` as a markdown test suite with given `title`. | ||
| /// | ||
| /// Panic on test failure, and print failure details. | ||
| pub fn run( |
There was a problem hiding this comment.
It would be nice if we could find a way to share some of the run and run_test logic. There's a lot happening in those methods that isn't specific to ty_test. It's probably sufficient if we have a few trait methods like check_file that, given a file, return a list of Diagnostics that are sufficient abstractions. But we can also decide to leave this for a later PR.
|
It turned out to be a lot easier to break the ty dependency that I had feared/expected. I still need to do another cleanup pass, but I introduced a very minimal |
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
|
Thanks for the review and for the conversation in our 1:1 earlier too! I think I've mostly addressed your earlier comments by adding the
Codex and I are still poking at this part. It also seems more promising to switch to a |
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
|
Alright, I got something working without a I kind of don't mind the |
MichaReiser
left a comment
There was a problem hiding this comment.
This split looks reasonable for a first iteration. It does have the downside that new mdtest feature need to be added to each test runner. For example, the code for capturing and updating snapshot still lives in ty_test. This is probably fine for an initial version but it might be interesting to explore how much work it is to lift more of run and run_test out of ty_test and into the mdtest crate
There was a problem hiding this comment.
I don't mind this too much, but it feels a bit funny that this happens within the parser. Is this something we could validate within run_test (or run?)
There was a problem hiding this comment.
Yeah, this felt funny to me too and required the config trait. I think one of the parser tests asserts on this error, which is one of the reasons I left it here, but I can try to drop or move that test too. Otherwise this would feel a lot better in ty_test since Ruff doesn't have dependencies.
There was a problem hiding this comment.
This is actually kind of tricky to validate outside the parser because of the config inheritance from parent Sections. A loop like this over suite.tests(), which I tried first, has false positives because each child section inherits the parent config:
let mut file_has_dependencies = false;
for test in suite.tests() {
if test.configuration().dependencies().is_some() {
if file_has_dependencies {
bail!(
"Multiple sections with `[project]` dependencies in the same file are not allowed. \
External dependencies must be specified in a single top-level configuration block."
);
}
file_has_dependencies = true;
}
}That failed for this test, for example.
Instead of looping over tests, I think we could loop over Sections and add an inherited_config flag, but then we have to expose that type and an iterator method. I opted for passing a callback to the parser to validate each parsed Config instead, but that's also kind of awkward, so I'd be curious to get your thoughts, especially in case I'm missing something obvious.
The callback at least keeps the check directly in ty_test and removes the trait method.
|
|
||
| let mut db = db::Db::setup(); | ||
| let mut db = Db::setup(); | ||
| let mut markdown_edits = vec![]; |
There was a problem hiding this comment.
I'm fine if we leave this to a separate PR, but it would be nice if this run loop could be shared between crates. It would probably allow us to keep many types pub(crate) again
| let failure = match matcher::match_file(db, test_file.file, &diagnostics).and_then( | ||
| |inline_diagnostics| { | ||
| validate_inline_snapshot( | ||
| mdtest::validate_inline_snapshot( |
There was a problem hiding this comment.
Same here. I feel like the code iterating over test_files and what comes below is mostly agnostic to what's being tested. It would be nice if some of it coudl be shared.
| @@ -744,287 +652,6 @@ impl std::fmt::Display for ModuleInconsistency<'_> { | |||
| } | |||
| } | |||
|
|
|||
There was a problem hiding this comment.
You may want to bring attempt_test to mdtest. It gracefully handles the case where checking a file panics (or linting). It ensurs that other tests in the same file run to completion.
There was a problem hiding this comment.
Ah good idea. I think I'll try to group this with the run/run_test refactor that I'll follow up on. I agree with your points on that front too and have some stashed changes locally that I hope will work.
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
Summary -- This PR adds the `ruff_test` crate, a parallel crate to `ty_test` for Ruff, to enable the new mdtests in `ruff_linter`. I opted to follow the `Db`-based structure of `ty_test` to simplify the integration, but we end up basically just unpacking the files from the `Db` to call the `test_contents` function from the linter. Currently stacked on #24616 Test Plan -- I copied over UP046_0.py into a new mdtest. This was selected basically at random, and I should probably either pick a shorter first test to port, or also port over the other UP046 tests and take better advantage of the mdtest format. Currently the whole file is just in a single code block, but it demonstrates the basic functionality, including config loading and diagnostic snapshotting.
Summary
This is a first step toward adding mdtests for Ruff. I actually wrote the code
in the opposite order, first copy-pasting
ty_testto aruff_testcrate, and thenfactoring out the shared code, but I figured it would be easier to review in
this order. I also opened a stacked PR with the
ruff_testchanges (#24617)to show that the API works well for that too.
The main change here is moving several of the modules from
ty_testto a newmdtestcrate:assertiondiagnosticmatcherparserBeyond moving these files to the new crate, I made
Matcherfunctions take a&dyn Dbto support passing a different concrete type fromruff_test, and Ialso made the parser generic over an
MdtestConfigtrait to allow Ruff to use aseparate config struct. I also introduced new
TestConfigandTestDbtypes to allowtesting the
matcherandparserwithin themdtestcrate without dependingon either the real ty
Dborty_testconfig type.The lib.rs file from
ty_testwas essentially split in half, with the sharedcode moved to the
mdtestcrate and the ty-specific parts kept inty_test.Test Plan
All existing mdtests and the unit tests from
ty_testshould still pass, andthe stacked branch with the
ruff_testcrate tests the split API