Speed up gettext: linear template merge, single POT read, plural-info dedup, and runtime interpolation fast paths by oliver-kriska · Pull Request #2 · oliver-kriska/gettext

oliver-kriska · 2026-06-16T07:15:49Z

Five small, output-preserving optimizations across the extraction/compile path and the runtime interpolation path, stacked on top of the --from-attributes work (extract-from-attributes). Each change is output-equivalent: the extraction changes (1-3) were verified to produce byte-identical PO/POT output against a production app's catalog (~1,875 files, 11 domains × 16 locales, ~4,700 msgids), and the runtime changes (4-5) are covered by the existing interpolation test suite.

This PR targets the fork's extract-from-attributes branch so the diff shows only the new commits. Opened to review the diff and dogfood it in CI/dev before considering an upstream PR.

Commits

1. Make POT template merge linear instead of O(N*M) - merge_template/3 matched messages with Expo.Messages.find/2 (a linear scan) once per existing message and once per new message, so it was O(N*M) per domain on every extract (and again in prune_unmerged/2). It now indexes both sides by Expo.Message.key/1 and looks matches up in a map / MapSet. Message.key/1 is exactly what Message.same?/2 (hence Messages.find/2) compares on, and Map.put_new/3 preserves Enum.find/2's first-match-wins, so output is unchanged.
2. Avoid reading each existing POT file twice - read_contents_and_parse/1 did File.read!/1 and then PO.parse_file!/2, which reads the same file again. PO.parse_file!(path, file: path) is defined as File.read + parse_string(contents, file: path), so feeding the bytes we already hold to PO.parse_string!/2 is identical; File.read!/1 still guards a missing file.
3. Compute plural info once per PO file - compile_po_file/5 called Plural.plural_info/3 twice (in nplurals/3 and compile_plural_forms/4). That function runs Code.ensure_compiled!/1 and parses the Plural-Forms header. It is computed once now; the generated code is identical.
4. Stop recompiling the interpolation patterns on every call - to_interpolatable/1 rebuilt :binary.compile_pattern/1 for "%{" and "}" on every invocation. This runs on every runtime interpolation, including the common case where the current locale has no translation for a msgid and falls back to interpolating the msgid itself. It now matches on the literal "%{" / "}" patterns directly in :binary.split/2; the split result is identical, and these two-byte patterns gain nothing from precompilation (and a compiled pattern can't be hoisted to a module attribute - it's a reference).
5. Skip String.Chars dispatch for binary bindings - the runtime interpolation path always ran binding values through to_string/1, dispatching the String.Chars protocol even when the value is already a binary (the common case). An is_binary/1 clause now uses the value directly; to_string/1 on a binary returns it unchanged, so the result is identical.

Benchmark (commit 1)

The merge runs in both the stock and --from-attributes paths, so this win applies regardless of the flag. This script isolates the matching cost (the merge step is reduced to identity so both implementations do the same number of calls) and reports whether both implementations return the same number of messages (equal_count). Run it with mix run, optionally setting BENCH_POT to a real .pot/.po path to add a real-catalog case:

alias Expo.{Message, Messages, PO}

defmodule OldMerge do
  def run(existing, new) do
    old_and_merged =
      Enum.flat_map(existing.messages, fn message ->
        cond do
          same = Messages.find(new, message) -> [keep(message, same)]
          true -> [message]
        end
      end)

    old_and_merged ++ Enum.reject(new.messages, &Messages.find(existing, &1))
  end

  # The merge step is reduced to identity so only the matching cost is measured.
  defp keep(old, _new), do: old
end

defmodule NewMerge do
  def run(existing, new) do
    new_by_key =
      Enum.reduce(new.messages, %{}, fn message, acc ->
        Map.put_new(acc, Message.key(message), message)
      end)

    existing_keys = MapSet.new(existing.messages, &Message.key/1)

    old_and_merged =
      Enum.flat_map(existing.messages, fn message ->
        cond do
          same = Map.get(new_by_key, Message.key(message)) -> [keep(message, same)]
          true -> [message]
        end
      end)

    old_and_merged ++
      Enum.reject(new.messages, &MapSet.member?(existing_keys, Message.key(&1)))
  end

  # Same identity merge step as OldMerge, so both do equivalent work per match.
  defp keep(old, _new), do: old
end

# Returns the average wall-clock time per call, in milliseconds.
time = fn iters, fun ->
  # Warmup run, not timed.
  fun.()
  {us, _} = :timer.tc(fn -> Enum.each(1..iters, fn _ -> fun.() end) end)
  us / iters / 1000.0
end

synthetic = fn n ->
  %Messages{messages: for(i <- 1..n, do: %Message.Singular{msgid: ["msgid number #{i}"], msgstr: [""]})}
end

bench = fn label, existing, new, iters ->
  old_ms = time.(iters, fn -> OldMerge.run(existing, new) end)
  new_ms = time.(iters, fn -> NewMerge.run(existing, new) end)
  # Sanity check on result size only; message contents are not compared here.
  equal? = length(OldMerge.run(existing, new)) == length(NewMerge.run(existing, new))
  IO.puts("#{String.pad_trailing(label, 22)} old=#{Float.round(old_ms, 2)} ms  " <>
            "new=#{Float.round(new_ms, 3)} ms  equal_count=#{equal?}")
end

case System.get_env("BENCH_POT") do
  nil -> :ok
  path ->
    pot = PO.parse_file!(Path.expand(path))
    bench.("real (all match)", pot, pot, 20)
    fresh = %Messages{messages: for(m <- pot.messages, do: %{m | msgid: ["NEW " | m.msgid]})}
    bench.("real (all new)", pot, fresh, 20)
end

for n <- [500, 1000, 2000, 4000, 8000], do: bench.("n=#{n}", synthetic.(n), synthetic.(n), max(3, div(40_000, n)))

Results on a real default.pot (~4,200 msgids), 3 fresh-VM runs, warmup + 20-iteration average, equal_count true throughout:

input	old	new
real catalog, all match (steady-state re-extract)	~1.0 s	~3 ms
real catalog, all new (worst case)	~3.7 s	~3 ms
synthetic n=500 / 1k / 2k / 4k / 8k	14 / 51 / 230 / 800 / 3235 ms	0.23 / 0.46 / 1.0 / 2.1 / 4.3 ms

The old time roughly quadruples each time n doubles (O(n²)); the new one only doubles (O(n)). End-to-end, the stock (no-flag) extract is dominated by the force-recompile, so the merge saving is small there in absolute terms but grows with catalog size; on the recompile-free --from-attributes path, where no recompile masks it, the saving is a visible share of total run time.
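The growth claim can be checked directly from the synthetic row of the table. The snippet below is only a quick arithmetic check on the published numbers, not part of the benchmark script; it computes the ratio between consecutive timings as n doubles.

```elixir
# Synthetic timings in ms for n = 500, 1k, 2k, 4k, 8k, copied from the table.
old_ms = [14, 51, 230, 800, 3235]
new_ms = [0.23, 0.46, 1.0, 2.1, 4.3]

# Ratio between each pair of consecutive measurements (n doubles each step).
ratios = fn timings ->
  timings
  |> Enum.chunk_every(2, 1, :discard)
  |> Enum.map(fn [a, b] -> Float.round(b / a, 2) end)
end

old_ratios = ratios.(old_ms)
new_ratios = ratios.(new_ms)

IO.inspect(old_ratios, label: "old")
IO.inspect(new_ratios, label: "new")
```

The old ratios land between roughly 3.5 and 4.5, which is what quadratic growth predicts (a factor of 4 per doubling), while the new ratios stay close to 2, as expected for linear growth.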

Validation

Full mix test passes apart from the pre-existing order-dependent gettext.extract_test.exs cases under Elixir 1.20+, which fail on the base branch as well.
Against a large production app: it recompiles cleanly through the changed codegen, and running mix gettext.extract followed by mix gettext.merge leaves the committed priv/gettext tree byte-identical; repeated runs produce no further changes.

stage-review · 2026-06-16T07:16:03Z

Ready to review this PR? Stage has broken it down into 5 individual chapters for you:

#	Title
1	Optimize POT template merge to linear time
2	Avoid redundant file reads during extraction
3	Deduplicate plural info computation during compilation
4	Optimize runtime interpolation pattern matching
5	Fast-path binary values in interpolation

Chapters generated by Stage for commit 574dcba on Jun 16, 2026 3:47pm UTC.

`Gettext.Extractor.merge_template/3` matched messages between the existing and newly extracted templates with `Expo.Messages.find/2`, a linear scan run once per existing message and once per new message. For a domain with N existing and M extracted messages this is O(N*M) comparisons, and it runs on every `mix gettext.extract` (and again in `prune_unmerged/2`), dominating extraction time on large catalogs. Index both sides by `Expo.Message.key/1` first, then look matches up in a map / `MapSet`. `Message.key/1` is exactly what `Message.same?/2` (and thus `Messages.find/2`) compares on, so the result is unchanged; `Map.put_new/3` preserves the first-match-wins behavior of `Enum.find/2`. Output is byte-identical, verified against a large production app's catalog.
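The equivalence argument (index once, keep the first message per key) can be illustrated without Expo. This is a minimal sketch, not the code from the commit: messages are plain maps, and `MergeSketch.key/1` is a hypothetical stand-in for `Expo.Message.key/1` that assumes the key is the `{msgctxt, msgid}` pair.

```elixir
defmodule MergeSketch do
  # Hypothetical stand-in for Expo.Message.key/1: here a message is a plain
  # map and its key is assumed to be the {msgctxt, msgid} pair.
  def key(%{msgctxt: ctx, msgid: id}), do: {ctx, id}

  # Quadratic building block: scan the list for the first message with the same key.
  def find_by_scan(messages, target) do
    Enum.find(messages, fn m -> key(m) == key(target) end)
  end

  # Build the index once. Map.put_new/3 keeps the first message seen for a
  # key, which mirrors Enum.find/2 returning the first match.
  def index(messages) do
    Enum.reduce(messages, %{}, fn m, acc -> Map.put_new(acc, key(m), m) end)
  end

  def find_by_index(index, target), do: Map.get(index, key(target))
end

messages = [
  %{msgctxt: nil, msgid: "hello", ref: 1},
  %{msgctxt: nil, msgid: "bye", ref: 2},
  # Duplicate key: both lookups should still return the ref: 1 entry.
  %{msgctxt: nil, msgid: "hello", ref: 3}
]

target = %{msgctxt: nil, msgid: "hello"}
index = MergeSketch.index(messages)

scan_result = MergeSketch.find_by_scan(messages, target)
index_result = MergeSketch.find_by_index(index, target)
IO.inspect(scan_result == index_result)
```

Using `Map.put/3` instead of `Map.put_new/3` would keep the last duplicate rather than the first, which is the one place where the indexed version could diverge from the scan.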

`read_contents_and_parse/1` read the file with `File.read!/1` and then called `PO.parse_file!/2`, which reads the very same file again internally. Pass the contents we already have to `PO.parse_string!/2` instead. `PO.parse_file!(path, file: path)` is defined as `File.read` followed by `parse_string(contents, file: path)`, so the parsed result is identical; the existing `File.read!/1` still raises on a missing file. This removes one disk read per existing POT file on every `mix gettext.extract`.
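The read-once shape might look like the following sketch. `SingleReadSketch.parse_string!/2` is a hypothetical, dependency-free stand-in for `PO.parse_string!/2` that only counts `msgid` lines; the file name and contents are made up for the example.

```elixir
defmodule SingleReadSketch do
  # Hypothetical stand-in for a parser such as PO.parse_string!/2; it only
  # counts msgid lines so the example stays dependency-free.
  def parse_string!(contents, opts) do
    count =
      contents
      |> String.split("\n")
      |> Enum.count(&String.starts_with?(&1, "msgid "))

    %{file: Keyword.fetch!(opts, :file), msgids: count}
  end

  # Read once, then parse the bytes already in memory. File.read!/1 raises
  # File.Error if the path does not exist, so a missing file is still caught.
  def read_contents_and_parse(path) do
    contents = File.read!(path)
    {contents, parse_string!(contents, file: path)}
  end
end

path = Path.join(System.tmp_dir!(), "single_read_sketch.pot")
File.write!(path, "msgid \"hello\"\nmsgstr \"\"\n\nmsgid \"bye\"\nmsgstr \"\"\n")

{_contents, parsed} = SingleReadSketch.read_contents_and_parse(path)
IO.inspect(parsed.msgids)
File.rm!(path)
```

Passing `file: path` along with the contents keeps the path available to the parser, for example for error messages, even though the parser no longer touches the disk.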

`compile_po_file/5` derived the plural data twice: once in `nplurals/3` and once in `compile_plural_forms/4`, each calling `Plural.plural_info/3`. That function runs `Code.ensure_compiled!/1` and parses the `Plural-Forms` header, so it is needless work repeated for every PO file across all locales and domains. Compute `plural_info` once in `compile_po_file/5` and pass it to both helpers. The generated code is identical (the same value is escaped into the plural dispatcher and fed to `nplurals/1`); only the duplicate computation is removed.
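A minimal sketch of the compute-once-and-pass-down pattern follows. All names are simplified stand-ins with reduced arities, not the library's functions; `plural_info/1` here just parses `nplurals` from a header string and counts its own invocations so the saving is observable.

```elixir
defmodule PluralOnceSketch do
  # Hypothetical stand-in for the expensive Plural.plural_info/3 call: it
  # parses nplurals out of a Plural-Forms header and counts its invocations
  # in the process dictionary.
  def plural_info(header) do
    Process.put(:plural_info_calls, Process.get(:plural_info_calls, 0) + 1)
    [_, n] = Regex.run(~r/nplurals=(\d+)/, header)
    %{nplurals: String.to_integer(n), header: header}
  end

  # Both helpers take the precomputed value instead of recomputing it.
  def nplurals(info), do: info.nplurals
  def compile_plural_forms(info), do: {:plural_forms, info.header}

  def compile_po_file(header) do
    info = plural_info(header)
    {nplurals(info), compile_plural_forms(info)}
  end
end

{n, _forms} = PluralOnceSketch.compile_po_file("nplurals=2; plural=(n != 1);")
IO.inspect({n, Process.get(:plural_info_calls)})
```

Because both helpers receive the same value, their results cannot drift apart, which is also why the generated code stays identical.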

`to_interpolatable/1` called `:binary.compile_pattern/1` for both `"%{"` and `"}"` on every invocation, then threaded the results through the recursion. This runs on every runtime interpolation, including the common case where the current locale has no translation for a msgid and falls back to interpolating the msgid itself. Match on the literal `"%{"` / `"}"` patterns directly in `:binary.split/2` instead. The split result is identical, and these two-byte patterns gain nothing from being precompiled, so this just removes the per-call work (a compiled pattern can't be hoisted to a module attribute - it's a reference).
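To make the literal-pattern approach concrete, here is a simplified sketch. `InterpolationSketch.segments/1` is a hypothetical analogue of `to_interpolatable/1`, not the library's implementation; it also checks that a precompiled pattern and the literal pattern split a string identically.

```elixir
defmodule InterpolationSketch do
  # Hypothetical, simplified analogue of to_interpolatable/1: splits a string
  # into literal segments and atom keys, passing literal patterns straight to
  # :binary.split/2 instead of precompiling them.
  def segments(string) do
    case :binary.split(string, "%{") do
      [literal] ->
        keep(literal)

      [literal, rest] ->
        case :binary.split(rest, "}") do
          # String.to_atom/1 is acceptable here only because the input is a
          # trusted msgid in this sketch.
          [key, tail] -> keep(literal) ++ [String.to_atom(key) | segments(tail)]
          # No closing brace: treat the whole remainder as literal text.
          [_unterminated] -> keep(string)
        end
    end
  end

  defp keep(""), do: []
  defp keep(literal), do: [literal]
end

string = "Hello %{name}, you have %{count} messages"

# A precompiled pattern and the literal pattern split identically.
compiled = :binary.compile_pattern("%{")
same? = :binary.split(string, compiled) == :binary.split(string, "%{")

IO.inspect(same?)
IO.inspect(InterpolationSketch.segments(string))
```

Without the `:global` option, `:binary.split/2` splits at the first occurrence only, so the recursion consumes one placeholder per step.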

In the runtime interpolation path, binding values were always run through `to_string/1`, dispatching the `String.Chars` protocol even when the value is already a binary - the common case (names, labels, preformatted strings). Add an `is_binary(value)` clause that uses the value directly. `to_string/1` on a binary returns it unchanged, so the result is identical; this only skips the protocol dispatch.
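The fast path amounts to one extra guarded clause. A minimal sketch, with `BindingSketch.stringify/1` as a hypothetical name for the conversion step:

```elixir
defmodule BindingSketch do
  # Fast path: a binary is already a string, so return it untouched and skip
  # the String.Chars protocol dispatch.
  def stringify(value) when is_binary(value), do: value

  # Everything else still goes through the String.Chars protocol.
  def stringify(value), do: to_string(value)
end

results = Enum.map(["Ada", 42, :ok, 1.5], &BindingSketch.stringify/1)
IO.inspect(results)
```

Clause order matters: the guarded clause has to come first, otherwise the catch-all clause would match binaries too and the fast path would never run.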

oliver-kriska force-pushed the extraction-merge-perf branch from d0d2e15 to 4e11de9 on June 16, 2026 08:14

oliver-kriska changed the title ~~Extraction/merge compile-time perf bundle (linear merge, single read, plural dedup)~~ Speed up mix gettext.extract: linear template merge, single POT read, one plural-info computation Jun 16, 2026

oliver-kriska force-pushed the extraction-merge-perf branch from 4e11de9 to 0eb3655 on June 16, 2026 08:26

oliver-kriska changed the title ~~Speed up mix gettext.extract: linear template merge, single POT read, one plural-info computation~~ Speed up gettext: linear template merge, single POT read, plural-info dedup, and runtime interpolation fast paths Jun 16, 2026

oliver-kriska force-pushed the extract-from-attributes branch from d27c40c to abd4bf7 on June 16, 2026 12:58

oliver-kriska force-pushed the extraction-merge-perf branch from 6ae2799 to bc57759 on June 16, 2026 13:07

oliver-kriska force-pushed the extract-from-attributes branch from abd4bf7 to 11ae3cd on June 16, 2026 15:33

oliver-kriska force-pushed the extraction-merge-perf branch from bc57759 to 066379a on June 16, 2026 15:33

oliver-kriska added 5 commits on June 16, 2026 17:46

oliver-kriska force-pushed the extract-from-attributes branch from 11ae3cd to c357fca on June 16, 2026 15:46

oliver-kriska force-pushed the extraction-merge-perf branch from 066379a to 574dcba on June 16, 2026 15:46

oliver-kriska mentioned this pull request Jun 16, 2026

Add recompile-free extraction via persisted module attributes elixir-gettext/gettext#437

Open

oliver-kriska wants to merge 5 commits into extract-from-attributes from extraction-merge-perf
