TOML: structured DottedKey AST and path-lookup utilities #7538
Draft
knutwannheden wants to merge 1 commit into main
A dotted TOML key like `physical.color` was previously flattened into a single `Toml.Identifier` whose `name` was the joined string of all child tokens. That representation cannot distinguish `site."google.com"` (two segments, the second containing a literal dot) from `site.google.com` (three bare segments): both became the string `site.google.com`. Recipes wanting to find or modify a value by logical key path each ended up doing ad-hoc traversal that handled only some of the equivalent authoring forms.

Add `Toml.DottedKey implements TomlKey` with an ordered list of `Toml.Identifier` segments wrapped in `TomlRightPadded`. Each segment preserves its own prefix/source for round-tripping, and the right-padding holds the whitespace before the following dot. The dots themselves are emitted by the printer between segments rather than stored. `Toml.Table.name` widens from `TomlRightPadded<Toml.Identifier>` to `TomlRightPadded<TomlKey>` so headers can carry either shape.

`TomlKey` gains a `getPath()` default returning the canonical list of unquoted segment names: a singleton for a simple `Identifier`, N elements for a `DottedKey`. A `getName()` default returns those segments joined with `.`, matching the existing `Identifier.getName()` semantics so consumers that compare names as strings keep working unchanged.

`TomlPaths` is a new static utility offering `findKeyValue` and `findTable` over a `Document`. The finder walks the document and matches a target path regardless of whether the document expressed it as a flat dotted key (`a.b.c.x = 1`), nested headers (`[a] [a.b] [a.b.c] x = 1`), `[a.b.c] x = 1`, `[a.b] c.x = 1`, or nested inline tables. Quoted segments containing literal dots are treated as one segment.

Also: `TomlVisitor.visitTable` now visits the table name so subclasses that transform identifiers/dotted keys see headers as well as key-value keys; previously the name was silently skipped. `SemanticallyEqual.keyEquals` and `TomlPathMatcher` are simplified to use `getPath()` directly. `PythonDependencyParser.indexTables` is adjusted so dotted-header tables (e.g. `[tool.uv]`) keep being indexed.
Motivation
A dotted TOML key like `physical.color` was previously flattened into a single `Toml.Identifier` whose `name` was the joined string of all child tokens. That representation cannot distinguish `site."google.com"` (two segments, the second containing a literal dot) from `site.google.com` (three bare segments): both became the string `site.google.com`.

The same logical key in TOML can be written in many equivalent ways: as a flat dotted key (`a.b.c.x = 1`), under nested headers (`[a] [a.b] [a.b.c] x = 1`), as `[a.b.c] x = 1`, as `[a.b] c.x = 1`, or with nested inline tables.
Recipes wanting to find or modify a value at a logical key path each ended up doing ad-hoc traversal that handled only some of these forms.
Examples
A new `Toml.DottedKey` AST node carries an ordered list of segments, so `site.google.com` has three segments. Quoted segments containing literal dots stay one segment, so `site."google.com"` has two.
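To make the segment distinction concrete, here is a minimal, self-contained sketch of quote-aware key splitting. It is not the rewrite-toml parser and does not use `Toml.DottedKey`; the class `DottedKeySketch` and the method `segments` are hypothetical names introduced only for illustration, and escape sequences inside basic strings are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only: NOT the rewrite-toml parser or Toml.DottedKey.
public class DottedKeySketch {

    // Split a TOML key into unquoted segments. Dots inside basic ("...") or
    // literal ('...') quotes belong to the segment; bare dots separate segments.
    // Assumption: escape sequences inside quotes are not handled.
    static List<String> segments(String key) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "not inside a quoted segment"
        for (char c : key.toCharArray()) {
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;          // closing quote: drop it
                } else {
                    current.append(c);  // dots in here are literal
                }
            } else if (c == '"' || c == '\'') {
                quote = c;              // opening quote: drop it
            } else if (c == '.') {
                out.add(current.toString());
                current.setLength(0);   // a bare dot ends the segment
            } else if (!Character.isWhitespace(c)) {
                current.append(c);      // whitespace around segments is skipped
            }
        }
        out.add(current.toString());
        return out;
    }

    public static void main(String[] args) {
        List<String> quoted = segments("site.\"google.com\"");
        List<String> bare = segments("site.google.com");
        System.out.println(quoted.size()); // 2
        System.out.println(bare.size());   // 3
        // The dot-joined names collide, which is why a joined string is not enough.
        System.out.println(String.join(".", quoted).equals(String.join(".", bare))); // true
    }
}
```

The last line of `main` shows the original problem: once the segments are joined with `.`, the two keys are indistinguishable, so the segment list has to be kept as structure.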
A new `TomlPaths` static utility resolves a logical path to a `KeyValue` (or `Table`) regardless of authoring form.

Summary
- Add `Toml.DottedKey implements TomlKey` with `List<TomlRightPadded<Toml.Identifier>>` segments. Each segment preserves its own prefix/source for round-tripping; right-padding holds the whitespace before the next dot. Dots are emitted by the printer between segments and not stored.
- Widen `Toml.Table.name` from `TomlRightPadded<Toml.Identifier>` to `TomlRightPadded<TomlKey>` so headers can carry either shape.
- Add `TomlKey#getPath()` returning the canonical unquoted segment list (length 1 for `Identifier`, N for `DottedKey`), and `TomlKey#getName()` returning the dot-joined form. The latter matches existing `Identifier.getName()` semantics so consumers comparing names as strings keep working unchanged.
- Add the `org.openrewrite.toml.TomlPaths` static utility with `findKeyValue` and `findTable`.
- `TomlVisitor.visitTable` now visits the table name (previously the name was silently skipped, breaking visitor-based transformations targeting headers).
- `SemanticallyEqual.keyEquals` and `TomlPathMatcher.buildPath` simplified to use `getPath()`.
- `PythonDependencyParser.indexTables` adjusted so dotted-header tables (e.g. `[tool.uv]`) keep being indexed under their joined name.

Builds on the simple-key strip-quotes work landed in #7521.
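The idea behind form-independent lookup can be sketched with a toy model: concatenate the header segments and the key segments of each entry, descend into inline tables, and compare the result with the target path. The sketch below is an assumption-laden illustration, not the `TomlPaths` implementation. `PathLookupSketch`, `Entry`, `InlineTable`, and `find` are hypothetical names, the real `findKeyValue` signature is not shown in this PR description, and array tables are not modeled at all.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in model. None of these types belong to rewrite-toml.
public class PathLookupSketch {

    // One `key = value` line. `header` is the segment list of the enclosing
    // [table] header (empty at top level); `key` is the key's own segments.
    static final class Entry {
        final List<String> header;
        final List<String> key;
        final Object value; // a scalar or an InlineTable

        Entry(List<String> header, List<String> key, Object value) {
            this.header = header;
            this.key = key;
            this.value = value;
        }
    }

    static final class InlineTable {
        final List<Entry> entries;

        InlineTable(Entry... entries) {
            this.entries = List.of(entries);
        }
    }

    // Returns the value stored at `target`, or null when no entry matches.
    static Object find(List<Entry> entries, List<String> target) {
        return find(entries, List.of(), target);
    }

    private static Object find(List<Entry> entries, List<String> outer, List<String> target) {
        for (Entry e : entries) {
            // Full logical path = enclosing path + header segments + key segments.
            List<String> path = new ArrayList<>(outer);
            path.addAll(e.header);
            path.addAll(e.key);
            if (path.equals(target)) {
                return e.value;
            }
            boolean isPrefix = path.size() < target.size()
                    && target.subList(0, path.size()).equals(path);
            if (isPrefix && e.value instanceof InlineTable) {
                Object hit = find(((InlineTable) e.value).entries, path, target);
                if (hit != null) {
                    return hit;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> target = List.of("a", "b", "c", "x");
        List<String> none = List.of();

        // a.b.c.x = 1
        List<Entry> flat = List.of(new Entry(none, target, 1));
        // [a.b] followed by c.x = 1
        List<Entry> mixed = List.of(new Entry(List.of("a", "b"), List.of("c", "x"), 1));
        // a = { b = { c = { x = 1 } } }
        List<Entry> inline = List.of(new Entry(none, List.of("a"),
                new InlineTable(new Entry(none, List.of("b"),
                        new InlineTable(new Entry(none, List.of("c"),
                                new InlineTable(new Entry(none, List.of("x"), 1))))))));

        System.out.println(find(flat, target));   // 1
        System.out.println(find(mixed, target));  // 1
        System.out.println(find(inline, target)); // 1

        // site.google.com = 1 has three segments, so a two-segment path misses.
        List<Entry> bare = List.of(new Entry(none, List.of("site", "google", "com"), 1));
        System.out.println(find(bare, List.of("site", "google.com"))); // null
    }
}
```

Comparing segment lists rather than joined strings is what keeps the quoted-dot case correct in this model: `["site", "google.com"]` and `["site", "google", "com"]` are different lists even though their joined names are identical.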
Test plan
- `TomlParserTest.dottedKeys` extended with structural assertions distinguishing `site."google.com"` (2 segments) from `site.google.com` (3 segments).
- `TomlParserTest.extraWhitespaceTable` extended with a structural assertion that `[a.b.c]` and `[ j . "ʞ" . 'l' ]` each produce a 3-segment `DottedKey`.
- `TomlPathsTest` covering each equivalent authoring form resolving to the same path, quoted-segment-with-dot semantics, missing-path returns null, and array tables not searched.
- `:rewrite-toml:test` tests pass (round-trip preservation for every fixture).
- `:rewrite-python:test` passes for all TOML-touching tests.