Skip to content

Commit d554650

Browse files
authored
[ty] Improve resolution of absolute imports in tests (#21817)
By teaching desperate resolution to try every possible ancestor that doesn't have an `__init__.py(i)` when resolving absolute imports. * Fixes astral-sh/ty#1782
1 parent 3ac58b4 commit d554650

4 files changed

Lines changed: 174 additions & 61 deletions

File tree

crates/ty/tests/cli/python_environment.rs

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2390,14 +2390,14 @@ fn default_root_flat_layout() -> anyhow::Result<()> {
23902390
fn default_root_tests_folder() -> anyhow::Result<()> {
23912391
let case = CliTest::with_files([
23922392
("src/foo.py", "foo = 10"),
2393-
("tests/bar.py", "bar = 20"),
2393+
("tests/bar.py", "baz = 20"),
23942394
(
23952395
"tests/test_bar.py",
23962396
r#"
23972397
from foo import foo
2398-
from bar import bar
2398+
from bar import baz
23992399
2400-
print(f"{foo} {bar}")
2400+
print(f"{foo} {baz}")
24012401
"#,
24022402
),
24032403
])?;

crates/ty_project/src/metadata/options.rs

Lines changed: 0 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -285,22 +285,6 @@ impl Options {
285285
roots.push(python);
286286
}
287287

288-
// Considering pytest test discovery conventions,
289-
// we also include the `tests` directory if it exists and is not a package.
290-
let tests_dir = project_root.join("tests");
291-
if system.is_directory(&tests_dir)
292-
&& !system.is_file(&tests_dir.join("__init__.py"))
293-
&& !system.is_file(&tests_dir.join("__init__.pyi"))
294-
&& !roots.contains(&tests_dir)
295-
{
296-
// If the `tests` directory exists and is not a package, include it as a source root.
297-
tracing::debug!(
298-
"Including `./tests` in `environment.root` because a `./tests` directory exists"
299-
);
300-
301-
roots.push(tests_dir);
302-
}
303-
304288
// The project root should always be included, and should always come
305289
// after any subdirectories such as `./src`, `./tests` and/or `./python`.
306290
roots.push(project_root.to_path_buf());

crates/ty_python_semantic/resources/mdtest/import/workspaces.md

Lines changed: 82 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -8,12 +8,87 @@ two projects in a monorepo have conflicting definitions (but we want to analyze
88

99
In practice these tests cover what we call "desperate module resolution" which, when an import
1010
fails, results in us walking up the ancestor directories of the importing file and trying those as
11-
"desperate search-paths".
11+
"desperate search-paths" until one works.
1212

1313
Currently desperate search-paths are restricted to subdirectories of the first-party search-path
14-
(the directory you're running `ty` in). Currently we only consider one desperate search-path: the
15-
closest ancestor directory containing a `pyproject.toml`. In the future we may want to try every
16-
ancestor `pyproject.toml` or every ancestor directory.
14+
(typically, the directory you're running `ty` in).
15+
16+
There are two styles of desperate search-path we consider: "absolute" and "relative". Absolute
17+
desperate search-paths are used for resolving absolute imports (`import a.b.c`) while relative
18+
desperate search-paths are used for resolving relative imports (`from .c import x`).
19+
20+
Only the closest directory that contains either a `pyproject.toml` or `ty.toml` is a valid relative
21+
desperate search-path.
22+
23+
All ancestor directories that *do not* contain an `__init__.py(i)` are valid absolute desperate
24+
search-paths.
25+
26+
(Distracting detail: to ensure relative desperate search-paths are always valid absolute desperate
27+
search-paths, a directory that contains an `__init__.py(i)` *and* either a `pyproject.toml` or
28+
`ty.toml` is also a valid absolute search-path, but this shouldn't matter in practice, as you do not
29+
typically have those two kinds of file in the same directory.)
30+
31+
## Relative Desperate Search-Paths
32+
33+
We do not directly resolve relative imports. Instead we have a two-phase process:
34+
35+
1. Convert the relative module name `.c` to an absolute one `a.b.c`
36+
1. Resolve the absolute import `a.b.c`
37+
38+
(This allows us to transparently handle packaging semantics that mandate separate directories should
39+
be "logically combined" into a single directory, like namespace packages and stub packages.)
40+
41+
Relative desperate search-paths only appear in step 1, where we compute the module name of the
42+
importing file as the first step in resolving `.` to an absolute module name.
43+
44+
In practice, relative desperate search-paths are rarely needed because it usually doesn't matter if
45+
we think `.` is `a.b` or `b` when resolving `.c`: the fact that we computed `a.b` using our
46+
search-paths means `a.b.c` is what will resolve with those search-paths!
47+
48+
There are three caveats to this:
49+
50+
- If the module name we compute is *too short* then too many relative levels will fail to resolve
51+
(`..c` resolves in `a.b` but not `b`).
52+
- If the module name is *too long* then we may encounter directories that aren't valid module names,
53+
and reject the import (`my-proj.a.b.c` is not a valid module name).
54+
- Sloppiness will break relative imports in any kind of packaging situation where different
55+
directories are supposed to be "logically combined".
56+
57+
The fact that we restrict desperate resolution to the first-party search-path ("the project you're
58+
working on") allows us to largely dismiss the last concern for the purposes of this discussion. The
59+
remaining two concerns encourage us to find "the longest possible module name without stumbling into
60+
random nonsense directories". When we need relative desperate search-paths we are usually running
61+
into the "too long" problem and "snap to the parent `pyproject.toml` (or `ty.toml`)" tends to
62+
resolve it well!
63+
64+
As a more aesthetic concern, this approach also ensures that all the files under a given
65+
`pyproject.toml` will, when faced with desperation, agree on eachother's relative module names. This
66+
may or may not be important, but it's definitely *reassuring* and *satisfying*!
67+
68+
## Absolute Desperate Search-Paths
69+
70+
Absolute desperate search-paths are much more load-bearing, because if we're handed the absolute
71+
import `a.b.c` then there is only one possible search-path that will properly resolve this the way
72+
the user wants, and if that search-path isn't configured we will fail.
73+
74+
Basic heuristics like checking for `<working-dir>/src/` and resolving editables in the local `.venv`
75+
work well in most cases, but desperate resolution is needed in a couple key scenarios:
76+
77+
- Test or script directories have a tendency to assume extra search-paths that aren't structurally
78+
obvious ([notably pytest](https://docs.pytest.org/en/stable/explanation/pythonpath.html))
79+
- If you open the root of a monorepo in an IDE, you will often have many separate projects but no
80+
configuration explaining this. Absolute imports within each project should resolve things in
81+
that project.
82+
83+
The latter case is often handled reasonably well by the the `pyproject.toml` rule that relative
84+
desperate search-paths have. However the more complex testing/scripting scenarios tend to fall over
85+
here -- in the limit pytest will add literally every ancestor to the search-path, and so we simply
86+
need to try every single one and hope *one* works for every absolute import (and it might be a
87+
different one for different imports).
88+
89+
We exclude directories that contain an `__init__.py(i)` because there shouldn't be any reasonable
90+
scenario where we need to "truncate" a regular package like that (and pytest's Exciting behaviour
91+
here is explicitly disabled by `__init__.py`).
1792

1893
## Invalid Names
1994

@@ -134,13 +209,11 @@ from .mod1 import x
134209

135210
# error: [unresolved-import]
136211
from . import mod2
137-
138-
# error: [unresolved-import]
139212
import mod3
140213

141214
reveal_type(x) # revealed: Unknown
142215
reveal_type(mod2.y) # revealed: Unknown
143-
reveal_type(mod3.z) # revealed: Unknown
216+
reveal_type(mod3.z) # revealed: int
144217
```
145218

146219
`my-proj/tests/mod1.py`:
@@ -338,21 +411,6 @@ create, and we are now very sensitive to precise search-path ordering.**
338411

339412
Here the use of editables means that `a/` has higher priority than `a/src/a/`.
340413

341-
Somehow this results in `a/tests/test1.py` being able to resolve `.setup` but not `.`.
342-
343-
My best guess is that in this state we can resolve regular modules in `a/tests/` but not namespace
344-
packages because we have some extra validation for namespace packages conflicted by regular
345-
packages, but that validation isn't applied when we successfully resolve a submodule of the
346-
namespace package.
347-
348-
In this case, as we find that `a/tests/test1.py` matches on the first-party path as `a.tests.test1`
349-
and is syntactically valid. We then resolve `a.tests.test1` and because the namespace package
350-
(`/a/`) comes first we succeed. We then syntactically compute `.` to be `a.tests`.
351-
352-
When we go to lookup `a.tests.setup`, whatever grace that allowed `a.tests.test1` to resolve still
353-
works so it resolves too. However when we try to resolve `a.tests` on its own some additional
354-
validation rejects the namespace package conflicting with the regular package.
355-
356414
```toml
357415
[environment]
358416
# Setup a venv with editables for a/src/ and b/src/
@@ -385,17 +443,13 @@ b/src/
385443
`a/tests/test1.py`:
386444

387445
```py
388-
# TODO: there should be no errors in this file.
389-
390446
from .setup import x
391-
392-
# error: [unresolved-import]
393447
from . import setup
394448
from a import y
395449
import a
396450

397451
reveal_type(x) # revealed: int
398-
reveal_type(setup.x) # revealed: Unknown
452+
reveal_type(setup.x) # revealed: int
399453
reveal_type(y) # revealed: int
400454
reveal_type(a.y) # revealed: int
401455
```
@@ -422,17 +476,13 @@ y: int = 10
422476
`b/tests/test1.py`:
423477

424478
```py
425-
# TODO: there should be no errors in this file
426-
427479
from .setup import x
428-
429-
# error: [unresolved-import]
430480
from . import setup
431481
from b import y
432482
import b
433483

434484
reveal_type(x) # revealed: str
435-
reveal_type(setup.x) # revealed: Unknown
485+
reveal_type(setup.x) # revealed: str
436486
reveal_type(y) # revealed: str
437487
reveal_type(b.y) # revealed: str
438488
```

crates/ty_python_semantic/src/module_resolver/resolver.rs

Lines changed: 89 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -336,7 +336,14 @@ pub(crate) fn file_to_module(db: &dyn Db, file: File) -> Option<Module<'_>> {
336336
path,
337337
search_paths(db, ModuleResolveMode::StubsAllowed),
338338
)
339-
.or_else(|| file_to_module_impl(db, file, path, desperate_search_paths(db, file).iter()))
339+
.or_else(|| {
340+
file_to_module_impl(
341+
db,
342+
file,
343+
path,
344+
relative_desperate_search_paths(db, file).iter(),
345+
)
346+
})
340347
}
341348

342349
fn file_to_module_impl<'db, 'a>(
@@ -388,11 +395,81 @@ pub(crate) fn search_paths(db: &dyn Db, resolve_mode: ModuleResolveMode) -> Sear
388395
Program::get(db).search_paths(db).iter(db, resolve_mode)
389396
}
390397

391-
/// Get the search-paths that should be used for desperate resolution of imports in this file
398+
/// Get the search-paths for desperate resolution of absolute imports in this file.
399+
///
400+
/// Currently this is "all ancestor directories that don't contain an `__init__.py(i)`"
401+
/// (from closest-to-importing-file to farthest).
402+
///
403+
/// (For paranoia purposes, all relative desperate search-paths are also absolute
404+
/// valid desperate search-paths, but don't worry about that.)
405+
///
406+
/// We exclude `__init__.py(i)` dirs to avoid truncating packages.
407+
#[salsa::tracked(heap_size=ruff_memory_usage::heap_size)]
408+
fn absolute_desperate_search_paths(db: &dyn Db, importing_file: File) -> Option<Vec<SearchPath>> {
409+
let system = db.system();
410+
let importing_path = importing_file.path(db).as_system_path()?;
411+
412+
// Only allow this if the importing_file is under the first-party search path
413+
let (base_path, rel_path) =
414+
search_paths(db, ModuleResolveMode::StubsAllowed).find_map(|search_path| {
415+
if !search_path.is_first_party() {
416+
return None;
417+
}
418+
Some((
419+
search_path.as_system_path()?,
420+
search_path.relativize_system_path_only(importing_path)?,
421+
))
422+
})?;
423+
424+
// Read the revision on the corresponding file root to
425+
// register an explicit dependency on this directory. When
426+
// the revision gets bumped, the cache that Salsa creates
427+
// for this routine will be invalidated.
428+
//
429+
// (This is conditional because ruff uses this code too and doesn't set roots)
430+
if let Some(root) = db.files().root(db, base_path) {
431+
let _ = root.revision(db);
432+
}
433+
434+
// Only allow searching up to the first-party path's root
435+
let mut search_paths = Vec::new();
436+
for rel_dir in rel_path.ancestors() {
437+
let candidate_path = base_path.join(rel_dir);
438+
if !system.is_directory(&candidate_path) {
439+
continue;
440+
}
441+
// Any dir that isn't a proper package is plausibly some test/script dir that could be
442+
// added as a search-path at runtime. Notably this reflects pytest's default mode where
443+
// it adds every dir with a .py to the search-paths (making all test files root modules),
444+
// unless they see an `__init__.py`, in which case they assume you don't want that.
445+
let isnt_regular_package = !system.is_file(&candidate_path.join("__init__.py"))
446+
&& !system.is_file(&candidate_path.join("__init__.pyi"));
447+
// Any dir with a pyproject.toml or ty.toml is a valid relative desperate search-path and
448+
// we want all of those to also be valid absolute desperate search-paths. It doesn't
449+
// make any sense for a folder to have `pyproject.toml` and `__init__.py` but let's
450+
// not let something cursed and spooky happen, ok? d
451+
if isnt_regular_package
452+
|| system.is_file(&candidate_path.join("pyproject.toml"))
453+
|| system.is_file(&candidate_path.join("ty.toml"))
454+
{
455+
let search_path = SearchPath::first_party(system, candidate_path).ok()?;
456+
search_paths.push(search_path);
457+
}
458+
}
459+
460+
if search_paths.is_empty() {
461+
None
462+
} else {
463+
Some(search_paths)
464+
}
465+
}
466+
467+
/// Get the search-paths for desperate resolution of relative imports in this file.
392468
///
393-
/// Currently this is "the closest ancestor dir that contains a pyproject.toml", which is
394-
/// a completely arbitrary decision. We could potentially change this to return an iterator
395-
/// of every ancestor with a pyproject.toml or every ancestor.
469+
/// Currently this is "the closest ancestor dir that contains a pyproject.toml (or ty.toml)",
470+
/// which is a completely arbitrary decision. However it's farily important that relative
471+
/// desperate search-paths pick a single "best" answer because every one is *valid* but one
472+
/// that's too long or too short may cause problems.
396473
///
397474
/// For now this works well in common cases where we have some larger workspace that contains
398475
/// one or more python projects in sub-directories, and those python projects assume that
@@ -402,7 +479,7 @@ pub(crate) fn search_paths(db: &dyn Db, resolve_mode: ModuleResolveMode) -> Sear
402479
/// chaotic things. In particular, all files under a given pyproject.toml will currently
403480
/// agree on this being their desperate search-path, which is really nice.
404481
#[salsa::tracked(heap_size=ruff_memory_usage::heap_size)]
405-
fn desperate_search_paths(db: &dyn Db, importing_file: File) -> Option<SearchPath> {
482+
fn relative_desperate_search_paths(db: &dyn Db, importing_file: File) -> Option<SearchPath> {
406483
let system = db.system();
407484
let importing_path = importing_file.path(db).as_system_path()?;
408485

@@ -431,13 +508,15 @@ fn desperate_search_paths(db: &dyn Db, importing_file: File) -> Option<SearchPat
431508
// Only allow searching up to the first-party path's root
432509
for rel_dir in rel_path.ancestors() {
433510
let candidate_path = base_path.join(rel_dir);
434-
if system.path_exists(&candidate_path.join("pyproject.toml"))
435-
|| system.path_exists(&candidate_path.join("ty.toml"))
511+
// Any dir with a pyproject.toml or ty.toml might be a project root
512+
if system.is_file(&candidate_path.join("pyproject.toml"))
513+
|| system.is_file(&candidate_path.join("ty.toml"))
436514
{
437515
let search_path = SearchPath::first_party(system, candidate_path).ok()?;
438516
return Some(search_path);
439517
}
440518
}
519+
441520
None
442521
}
443522
#[derive(Clone, Debug, PartialEq, Eq, get_size2::GetSize)]
@@ -960,8 +1039,8 @@ fn desperately_resolve_name(
9601039
name: &ModuleName,
9611040
mode: ModuleResolveMode,
9621041
) -> Option<ResolvedName> {
963-
let search_paths = desperate_search_paths(db, importing_file);
964-
resolve_name_impl(db, name, mode, search_paths.iter())
1042+
let search_paths = absolute_desperate_search_paths(db, importing_file);
1043+
resolve_name_impl(db, name, mode, search_paths.iter().flatten())
9651044
}
9661045

9671046
fn resolve_name_impl<'a>(

0 commit comments

Comments
 (0)