Speed up gettext: linear template merge, single POT read, plural-info dedup, and runtime interpolation fast paths#2
Open
oliver-kriska wants to merge 5 commits into
Conversation
|
Ready to review this PR? Stage has broken it down into 5 individual chapters for you: Chapters generated by Stage for commit 574dcba on Jun 16, 2026 3:47pm UTC. |
d0d2e15 to
4e11de9
Compare
4e11de9 to
0eb3655
Compare
d27c40c to
abd4bf7
Compare
6ae2799 to
bc57759
Compare
abd4bf7 to
11ae3cd
Compare
bc57759 to
066379a
Compare
`Gettext.Extractor.merge_template/3` matched messages between the existing and newly extracted templates with `Expo.Messages.find/2`, a linear scan run once per existing message and once per new message. For a domain with N existing and M extracted messages this is O(N*M) comparisons, and it runs on every `mix gettext.extract` (and again in `prune_unmerged/2`), dominating extraction time on large catalogs. Index both sides by `Expo.Message.key/1` first, then look matches up in a map / `MapSet`. `Message.key/1` is exactly what `Message.same?/2` (and thus `Messages.find/2`) compares on, so the result is unchanged; `Map.put_new/3` preserves the first-match-wins behavior of `Enum.find/2`. Output is byte-identical, verified against a large production app's catalog.
`read_contents_and_parse/1` read the file with `File.read!/1` and then called `PO.parse_file!/2`, which reads the very same file again internally. Pass the contents we already have to `PO.parse_string!/2` instead. `PO.parse_file!(path, file: path)` is defined as `File.read` followed by `parse_string(contents, file: path)`, so the parsed result is identical; the existing `File.read!/1` still raises on a missing file. This removes one disk read per existing POT file on every `mix gettext.extract`.
`compile_po_file/5` derived the plural data twice: once in `nplurals/3` and once in `compile_plural_forms/4`, each calling `Plural.plural_info/3`. That function runs `Code.ensure_compiled!/1` and parses the `Plural-Forms` header, so it is needless work repeated for every PO file across all locales and domains. Compute `plural_info` once in `compile_po_file/5` and pass it to both helpers. The generated code is identical (the same value is escaped into the plural dispatcher and fed to `nplurals/1`); only the duplicate computation is removed.
`to_interpolatable/1` called `:binary.compile_pattern/1` for both `"%{"` and
`"}"` on every invocation, then threaded the results through the recursion.
This runs on every runtime interpolation, including the common case where the
current locale has no translation for a msgid and falls back to interpolating
the msgid itself.
Match on the literal `"%{"` / `"}"` patterns directly in `:binary.split/2`
instead. The split result is identical, and these two-byte patterns gain
nothing from being precompiled, so this just removes the per-call work (a
compiled pattern can't be hoisted to a module attribute - it's a reference).
In the runtime interpolation path, binding values were always run through `to_string/1`, dispatching the `String.Chars` protocol even when the value is already a binary - the common case (names, labels, preformatted strings). Add an `is_binary(value)` clause that uses the value directly. `to_string/1` on a binary returns it unchanged, so the result is identical; this only skips the protocol dispatch.
11ae3cd to
c357fca
Compare
066379a to
574dcba
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Five small, output-preserving optimizations across the extraction/compile path and the runtime interpolation path, stacked on top of the
--from-attributeswork (extract-from-attributes). Each change is output-equivalent: the extraction changes (1-3) were verified to produce byte-identical PO/POT output against a production app's catalog (~1,875 files, 11 domains × 16 locales, ~4,700 msgids), and the runtime changes (4-5) are covered by the existing interpolation test suite.This PR targets the fork's
extract-from-attributesbranch so the diff shows only the new commits. Opened to review the diff and dogfood it in CI/dev before considering an upstream PR.Commits
Make POT template merge linear instead of O(n*m) -
merge_template/3matched messages withExpo.Messages.find/2(a linear scan) once per existing message and once per new message, so it was O(N*M) per domain on every extract (and again inprune_unmerged/2). It now indexes both sides byExpo.Message.key/1and looks matches up in a map /MapSet.Message.key/1is exactly whatMessage.same?/2(henceMessages.find/2) compares on, andMap.put_new/3preservesEnum.find/2's first-match-wins, so output is unchanged.Avoid reading each existing POT file twice -
read_contents_and_parse/1didFile.read!/1and thenPO.parse_file!/2, which reads the same file again.PO.parse_file!(path, file: path)is defined asFile.read+parse_string(contents, file: path), so feeding the bytes we already hold toPO.parse_string!/2is identical;File.read!/1still guards a missing file.Compute plural info once per PO file -
compile_po_file/5calledPlural.plural_info/3twice (innplurals/3andcompile_plural_forms/4). That function runsCode.ensure_compiled!/1and parses thePlural-Formsheader. It is computed once now; the generated code is identical.Stop recompiling the interpolation patterns on every call -
to_interpolatable/1rebuilt:binary.compile_pattern/1for"%{"and"}"on every invocation. This runs on every runtime interpolation, including the common case where the current locale has no translation for a msgid and falls back to interpolating the msgid itself. It now matches on the literal"%{"/"}"patterns directly in:binary.split/2; the split result is identical, and these two-byte patterns gain nothing from precompilation (and a compiled pattern can't be hoisted to a module attribute - it's a reference).Skip String.Chars dispatch for binary bindings - the runtime interpolation path always ran binding values through
to_string/1, dispatching theString.Charsprotocol even when the value is already a binary (the common case). Anis_binary/1clause now uses the value directly;to_string/1on a binary returns it unchanged, so the result is identical.Benchmark (commit 1)
The merge runs in both the stock and
--from-attributespaths, so this win applies regardless of the flag. This script isolates the matching cost (the merge step is reduced to identity so both implementations do the same number of calls) and asserts identical result counts. Run it withmix run, optionally settingBENCH_POTto a real.pot/.popath to add a real-catalog case:Results on a real
default.pot(~4,200 msgids), 3 fresh-VM runs, warmup + 20-iteration average,equal_counttrue throughout:The old time quadruples per doubling (O(n²)); the new one only doubles (O(n)). End-to-end, the stock (no-flag) extract is dominated by the force-recompile, so the merge saving is small there in absolute terms but grows with catalog size; on the recompile-free
--from-attributespath it is proportionally visible.Validation
mix testis green (the only failures are the pre-existing order-dependentgettext.extract_test.exscases under Elixir 1.20+, present on the base branch as well).mix gettext.extract+mix gettext.mergeleave the committedpriv/gettexttree byte-identical and idempotent.