Stubtest: Improve heuristics for determining whether global-namespace names are imported#14270
Conversation
… names are imported
|
I went for code that felt "safe" and well-commented in this PR, since the approach felt a little out-there. But the code inside the -imported_symbols = _get_imported_symbol_names(runtime)
+try:
+ source = inspect.getsource(runtime)
+ module_symtable = symtable.symtable(source, runtime.__name__, "exec")
+ imported_symbols: set[str] | None = {
+ sym.get_name()
+ for sym in module_symtable.get_symbols()
+ if sym.is_imported()
+ }
+except (OSError, TypeError, SyntaxError):
+ imported_symbols = None |
hauntsaninja
left a comment
There was a problem hiding this comment.
Lol, this is interesting, cool idea! A new way for stubtest to blur the lines between dynamic and static.
| except SyntaxError: | ||
| return None | ||
|
|
||
| return frozenset(sym.get_name() for sym in module_symtable.get_symbols() if sym.is_imported()) |
There was a problem hiding this comment.
It looks like if you have an import and an assignment, is_imported is still True. I think this is probably desirable, given that we usually type objects as coming from their C accelerator modules even if there is a pure Python equivalent. So no action item here, just typing out my thoughts :-)
|
Thanks! :D |
The problem
Stubtest currently has both false-positives and false-negatives when it comes to verifying constants in the global namespace of a module.
False positive
The import of
string.ascii_lettersis an implementation detail, but stubtest will complain about it missing from the stub:False negative
IMPORTANT_REGEXis defined in the module, but stubtest will erroneously conclude (on Python 3.10+) that the constant has been imported from another module, due to the fact thatIMPORTANT_REGEX.__module__is"re":Solution
This PR fixes the false positive by using
inspect.getsourcelines()to dynamically retrieve the module source code. It then usessymtableto analyse that source code to gather a list of names which are known to be imported.The PR fixes the false negative by only using the
__module__heuristic on objects which are callable. The vast majority of callable objects will be types or functions. For these objects, the__module__attribute will give a good indication of whether the object originates from another module or not; for other objects, it's less useful.Impact on typeshed
This PR has the following impact on typeshed (all diffs are relative to stubtest as it exists on mypy
master, when run on Python 3.10.7; the stdlib diff is based on running stubtest on a Windows machine):Details
Stdlib:cffi:chevron:colorama:cronitercrontabdateparser:decorator:dockerfile-parse:fpdf:parsimoniouspikaplaysound:pyinstaller:pyscreeze:requests:retry:toml:vobject:Xlib:zxcvbn: