Skip to content

perf(rendering): batch node-label measurement + construction#7872

Draft
knsv-bot wants to merge 16 commits into
developfrom
perf/batch-label-measure
Draft

perf(rendering): batch node-label measurement + construction#7872
knsv-bot wants to merge 16 commits into
developfrom
perf/batch-label-measure

Conversation

@knsv-bot

Copy link
Copy Markdown
Collaborator

Summary

The steady-state (warm) win: batch the two most expensive per-node operations in
the measure phase — size measurement and label DOM construction — plus the
profiling that pinpointed them. No visual change (cypress snapshots clean).
Stacked on the surgical-reflow PR.

Validated results (dagre folder benchmark, trimmed mean)

baseline this PR gain
set measure 3010 ms 2491 ms −17%
huge1 measure 190 ms 104 ms −45%
huge1 total render 427 ms ~340 ms ~−20%

Flat / label-heavy flowcharts (the common case) gain −39 to −49% of measure.
Subgraph-heavy diagrams gain less — they're bound by nested dagre layout, not DOM.

Changes

  1. Batched measurement (prebuildNodeLabels in shapes/util.ts, wired into the
    common createGraphWithElements and dagre measureDagreGraph): build every label
    first (deferring its getBoundingClientRect), then read all sizes back-to-back so
    only the first forces a reflow.
  2. Batched construction (createText.ts htmlLabelMarkup + util.ts
    buildNodeLabelsBatched): assemble the markup for all HTML labels into one string
    and parse it with a single insertAdjacentHTML — the HTML parser builds the
    foreignObject/div/span sub-trees ~2× faster than per-element createElement
    (measured: labelBuild −33% set, −49% on flat diagrams).
    • Safety: a one-time probe + per-render structure check; if SVG-markup parsing
      ever misbehaves (e.g. jsdom), it falls back to per-element construction, so
      correctness never depends on the parser.
  3. Measure-phase profiling buckets (sanitize/markdown/decode/labelBuild)
    that decompose the phase — how each win above was found and proven. Guarded by
    injected.profiling, tree-shaken from production.

Note on the dagre→common-pipeline migration

This rebases cleanly onto that migration: the dagre measure span is dropped
(redundant — the common renderer's measure span now wraps measureDagreLayout),
while the dagre batching stays and runs inside that measure phase via the new
measureDagreGraph.

Testing

Unit tests pass (the probe + fallback keep jsdom green); cypress visual snapshots
clean
; zero new type errors.

🤖 Generated with Claude Code

knsv and others added 8 commits June 17, 2026 09:09
The per-node measure loop builds a label's DOM and immediately reads its size
(getBoundingClientRect / getBBox), interleaved across all nodes — so each read
forces a synchronous reflow over the growing tree. On a large dagre flowchart
this was ~2,216 Layout events / 173 ms (45% of the measure phase).

Split label handling into build -> measure -> finalize, and add
prebuildNodeLabels: build every label first (DOM writes only; the HTML path now
defers its internal getBoundingClientRect via createText's `deferMeasure`), then
read all sizes back-to-back so only the first read forces a reflow, then write
the positions. labelHelper returns these prebuilt labels; anything not prebuilt
(groups, linked nodes, shapes that build their own label) falls back to the
inline build+measure path unchanged, so this is purely an optimization.

Wired into createGraphWithElements (the common/ELK measure path). The dagre
recursiveRender path is a separate follow-up.

NOTE: jsdom returns 0 for getBoundingClientRect/getBBox, so unit tests cover
structure only — the actual reflow reduction and measured-size correctness must
be validated in real Chrome (cypress visual snapshots + the Dev Explorer
profiler).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apply the same build-all-then-measure-all batching to the dagre layout path,
which measures nodes inside recursiveRender (its measureLayout hook is a no-op).
Before each recursion level's node loop, prebuild + batch-measure the level's
leaf-node labels; the loop's insertNode calls then consume the prebuilt labels,
collapsing the per-node label reflow to roughly one per recursion level.

clearPrebuiltLabels now takes the specific nodes a prebuild produced, so nested
concurrent subgraph renders clear only their own pending entries.

Same validation caveat as the common-path change: jsdom can't measure, so the
reflow reduction and size correctness need real-Chrome (cypress + profiler)
verification. This is the path the original trace profiled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The warm benchmark shows getBBox/getBoundingClientRect (the reflow reads) are
only ~4% of the measure phase — the rest is unattributed DOM construction. Add
two buckets to find it:
 - sanitize: sanitizeMore + DOMPurify cost (per sanitizeText call).
 - labelBuild: the full label DOM construction (createText) in the prebuild loop.

Both guarded by injected.profiling. sanitize uses tickSync (sync, so accurate
under concurrency); labelBuild uses the async tick in the sequential prebuild
loop, where wall-clocks don't overlap. Diagnostic instrumentation to decide
where the warm measure cost actually is (DOMPurify vs DOM building vs shapes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a `markdown` bucket around markdownToHTML/nonMarkdownToHTML so labelBuild
can be split into markdown parsing vs element creation:
  element-creation ≈ labelBuild − markdown − sanitize.
This decides whether a plain-text markdown fast-path is worthwhile or the cost
is in DOM element construction. tickSync (sync) so it's accurate under the
concurrent label builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add labelPrep (decodeEntities + replaceIconSubstring regex) and addHtmlSpan
(foreignObject/div/span DOM construction) buckets so the ~90%-of-labelBuild
"element-creation" cost can be attributed to regex prep vs actual DOM building —
deciding between a trivial plain-label fast-path and a DOM-construction rewrite.
Async tick, accurate in the sequential prebuild loop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ync decode

The labelPrep/addHtmlSpan buckets used the async `tick`, which over-counts when
createText runs concurrently (recursion/clusters/fallback) — they reported
physically impossible values (150k+ ms). Drop them. decodeEntities is sync, so
measure it with tickSync (accurate under concurrency); replaceIconSubstring is a
no-op scan for plain labels (negligible). Confirms regex prep is not the cost —
element-creation is DOM construction in addHtmlSpan.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…HTML

Label element-creation (foreignObject/div/span per node) is ~50% of the warm
measure phase on large flowcharts, and a console A/B showed the HTML parser
builds these sub-trees ~2x faster than the DOM API. When every prebuilt label is
HTML, assemble the markup for all of them into one string and parse it with a
single insertAdjacentHTML instead of per-element createElement/setAttribute.

- createText: extract prepareHtmlLabelNode (shared text pipeline) and add
  htmlLabelMarkup (string form of addHtmlSpan, identical output) + escapeAttr +
  registerDeferredHtmlLabel.
- util.ts: prebuildNodeLabels routes through buildNodeLabelsBatched when a
  one-time probe confirms the environment parses batched SVG/xhtml markup with
  correct namespaces; otherwise (e.g. jsdom) falls back to per-element
  construction. A per-render structure check undoes and falls back if the parse
  is ever unexpected, so correctness never depends on the parser.

Validation: jsdom unit tests green via the fallback/probe; the batched path's
visual correctness must be confirmed by the cypress snapshot suite, and the win
by re-benchmarking huge1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@changeset-bot

changeset-bot Bot commented Jun 17, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 501fb49

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
mermaid Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@codecov

codecov Bot commented Jun 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 3.01205% with 322 lines in your changes missing coverage. Please review.
✅ Project coverage is 2.97%. Comparing base (35d0761) to head (501fb49).

Files with missing lines Patch % Lines
...c/rendering-util/rendering-elements/shapes/util.ts 0.00% 198 Missing ⚠️
packages/mermaid/src/rendering-util/createText.ts 0.00% 95 Missing ⚠️
...rc/rendering-util/layout-algorithms/dagre/index.js 0.00% 17 Missing ⚠️
packages/mermaid/src/rendering-util/createGraph.ts 0.00% 9 Missing ⚠️
packages/mermaid/src/diagrams/common/common.ts 75.00% 3 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           develop   #7872      +/-   ##
==========================================
+ Coverage     2.91%   2.97%   +0.05%     
==========================================
  Files          656     656              
  Lines        70441   70715     +274     
  Branches       979     982       +3     
==========================================
+ Hits          2055    2102      +47     
- Misses       68386   68613     +227     
Flag Coverage Δ
unit 2.97% <3.01%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
packages/mermaid/src/diagram-api/regexes.ts 100.00% <100.00%> (ø)
packages/mermaid/src/diagrams/common/common.ts 32.20% <75.00%> (+0.30%) ⬆️
packages/mermaid/src/rendering-util/createGraph.ts 1.05% <0.00%> (-0.10%) ⬇️
...rc/rendering-util/layout-algorithms/dagre/index.js 0.14% <0.00%> (-0.01%) ⬇️
packages/mermaid/src/rendering-util/createText.ts 0.00% <0.00%> (ø)
...c/rendering-util/rendering-elements/shapes/util.ts 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@argos-ci

argos-ci Bot commented Jun 17, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build Status Details Updated (UTC)
default (Inspect) 👍 Changes approved 1 changed Jun 18, 2026, 10:59 AM

knsv and others added 3 commits June 17, 2026 15:09
# Conflicts:
#	packages/mermaid/src/rendering-util/rendering-elements/shapes/util.ts
… color)

prebuildNodeLabels builds and measures leaf-node labels up front, but icon/image
shape handlers set node.labelStyle (the classDef-derived label color), override
node.width, and pass a shape-specific label class ('icon-shape'/'image-shape') to
labelHelper *after* the prebuild pass. A prebuilt icon/image label was therefore
built and measured with the wrong class, stale style and stale width — dropping
the label's padding and its classDef color (e.g. white text rendered black).

Exclude node.icon/node.img from the prebuild filter so they fall back to the
inline build path inside their handler. Regular shapes keep the batched prebuild:
they pass getNodeClasses(node) and don't override width, matching what prebuild
uses, so they're unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- updateNodeBounds: guard against degenerate (zero/negative) knownBounds so it
  falls back to a real getBBox() measurement instead of committing the node to
  0×0.
- Downgrade the stray `log.warn('Returning from recursive render XAX', …)` debug
  trace to `log.debug` (the expensive 'Graph after XAX' graph-serialization log
  was already commented out).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Base automatically changed from perf/measure-phase-reflows to develop June 18, 2026 07:54
Two whitespace-related regex pathologies made a deeply-indented flowchart (e.g.
the perf fixture huge3, with 1.6k-space lines) cost ~1.4s to parse — both
O(whitespace²):

- anyCommentRegex (diagram-api/regexes.ts), used by detectType + preprocess: the
  leading greedy \s* + global flag re-scanned each indent run from every position.
  Guard the start with (?:^|(?<=\S)) so \s* is attempted once per run — O(n), and
  byte-identical output (new regexes.spec.ts asserts equivalence vs the legacy
  pattern over a corpus + an O(n) regression guard).
- flow.jison SPACE rule matched a single whitespace char (\s), emitting one SPACE
  token per space — hundreds of thousands on deep indentation. Match a run
  ([^\S\n\r]+) instead; the grammar only uses SPACE as a separator, so parsing is
  unchanged (full flowchart suite green, 991 tests).

huge3 Diagram.fromText: ~1410ms -> 27ms.
@netlify

netlify Bot commented Jun 18, 2026

Copy link
Copy Markdown

Deploy Preview for mermaid-js ready!

Name Link
🔨 Latest commit 501fb49
🔍 Latest deploy log https://app.netlify.com/projects/mermaid-js/deploys/6a33cc213dd3020008319298
😎 Deploy Preview https://deploy-preview-7872--mermaid-js.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

@pkg-pr-new

pkg-pr-new Bot commented Jun 18, 2026

Copy link
Copy Markdown

Open in StackBlitz

@mermaid-js/examples

npm i https://pkg.pr.new/@mermaid-js/examples@7872

mermaid

npm i https://pkg.pr.new/mermaid@7872

@mermaid-js/layout-elk

npm i https://pkg.pr.new/@mermaid-js/layout-elk@7872

@mermaid-js/layout-tidy-tree

npm i https://pkg.pr.new/@mermaid-js/layout-tidy-tree@7872

@mermaid-js/mermaid-zenuml

npm i https://pkg.pr.new/@mermaid-js/mermaid-zenuml@7872

@mermaid-js/parser

npm i https://pkg.pr.new/@mermaid-js/parser@7872

@mermaid-js/tiny

npm i https://pkg.pr.new/@mermaid-js/tiny@7872

commit: 501fb49

knsv and others added 3 commits June 18, 2026 10:18
…rebuild

prebuildNodeLabels builds and measures each leaf node's label up front, then
labelHelper hands that prebuilt label back to the shape. But several shapes
mutate the node *after* the prebuild pass runs:

- hourglass blanks node.label -> the prebuilt label still showed the text
- erBox rewrites node.label to the entity alias and bumps node.width -> the
  prebuilt label showed the id and was sized/positioned wrong (title missing)
- shapes apply a classDef-derived node.labelStyle via styles2String -> the
  prebuilt label was built with the un-styled labelStyle, dropping label colour

labelHelper now stamps each prebuilt label with a cheap signature of the node
state it was built from (label, width, markdown, labelStyle, html-labels,
centerLabel, icon/img) and only reuses it when the node still matches. When a
shape has customised the node the signature differs, so we drop the stale
prebuilt DOM and rebuild inline -- exactly the pre-perf path. Unchanged plain
nodes (the bulk of large diagrams) still take the fast prebuilt path.

Also run configureLabelImages on the batched markup build path: like the
per-element path it must wait for <img> labels to load and apply their sizing
styles before measuring, otherwise image nodes are mis-sized and the images
don't show.

Repro fixtures added under cypress/platform/dev-diagrams/render-regressions/.
A prebuilt leaf-node label is inserted into the nodes group during the prebuild
pass, before any shape runs. erBox draws a hand-drawn background rect as a
sibling g first and then builds the entity, relying on the entity's g being
appended after the background so it paints on top. When labelHelper reused a
prebuilt label in place, it stayed at its earlier position, so the background
painted over the entity and hand-drawn ER entities lost their background fill.

labelHelper now raises the reused g to the end of its parent, matching the
inline path (which appends the label at shape-render time). Short-named ER
entities were already correct because their width bump flips the label
signature and forces an inline rebuild; this covers the rest.

Repro fixture: er-handdrawn-entity-background.mmd.

@knsv-bot knsv-bot left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[sisyphus-bot]

Thanks for this — the build-all-then-measure-all batching is a genuinely smart attack on the per-node forced-reflow cost, and the engineering care shows. Since it's a draft, this is early-direction feedback, not a merge gate; nothing below is blocking. I read the full diff at head 501fb49, traced the new string-HTML path, ran an XSS pass, and read the regression-fixtures README. (Heads-up: the pre-fetch flagged this as a "new diagram" — false positive; it's a perf refactor, so the new-diagram checklist doesn't apply.)

What's working well

  • 🎉 The batching architecture is the right shape. Splitting labelHelper into buildNodeLabel (writes) → measureNodeLabel (the single forced read) → finalizeNodeLabel (writes), so a batch runs every read back-to-back, is exactly how you collapse the reflow storm. The validated numbers (−17% / −45% measure) back it up.
  • 🎉 The correctness guards are thoughtful, not hand-wavy. The labelSignature fingerprint to detect post-prebuild node mutation (hourglass blanking the label, erBox rewriting width, classDef-derived labelStyle) and fall back to the exact inline path; the canBatchSvgHtml() probe with per-element fallback so correctness never depends on the parser; clearPrebuiltLabels(nodes) scoped to avoid clobbering concurrent nested subgraph renders; the raise() to preserve paint order; and the degenerate-bounds guard in updateNodeBounds — these are the subtle pitfalls of caching+reordering DOM, and you've handled each. The commit history (icon/image padding, post-prebuild rebuild, paint order) shows they were found and fixed deliberately.
  • 🎉 regexes.spec.ts is a model equivalence test — keeping the legacy pattern as a byte-for-byte oracle over a whitespace-edge corpus plus a backtracking-perf assertion is exactly how to make a regex micro-opt reviewable.
  • 🎉 The render-regressions README is honest and useful — mapping each repro to the specific Cypress/Argos test that catches it is great traceability.

Things to consider (for when you take it out of draft)

🟡 "No visual change" vs. a folder of 8 known regressions — please confirm each is resolved on head

This is the whole correctness bar for the PR, so it's worth being explicit. The README documents 8 rendering regressions this perf work introduced (note text wrapping differently, hourglass rendering a label it shouldn't, ER entity titles vanishing, <img> icons not showing, box-label alignment flips, SVG markdown wrapping at different words). The commits fix several — but the README frames them as "compare against develop to confirm the regression," which reads as open. Before un-drafting, it'd be great to confirm every listed Argos test is green on head (especially imageshape-markdown-svg-wrap and class-notes-simple-alignment, which are wrap/alignment shifts the batched-markup path can reintroduce), and reword the README so it's clear these are fixed repros, not outstanding ones.

🟡 Scope: two unrelated perf concerns are bundled

The headline (label batching) and the flowchart parser O(n²) fix (flow.jison SPACE token + anyCommentRegex, with the flowchart-deep-indent-parse-perf changeset and regexes.spec.ts) are independent — different subsystem, different changeset, different tests. The parser fix is cleanly separable and would be much easier to land and bisect on its own. Worth splitting it into its own PR (it also looks ready/lower-risk than the batching, so it could land first).

🟡 The jison SPACE change needs the full flowchart parser suite green

\s → [^\S\n\r]+ (one token per whitespace run instead of per char) is a lexer grammar change. The "grammar only uses SPACE as a separator, so parsing is unchanged" claim is plausible, but jison changes are regression-prone and don't hot-reload (dev server restart needed). Please confirm the flowchart .spec suite passes. One edge to sanity-check: a lone \r not followed by \n used to match \s as SPACE and now matches nothing in that rule — almost certainly irrelevant, but worth a thought.

💡 Changeset coverage

The only changeset is for the bundled parser fix. The headline label-batching change has none — for a user-affecting perf improvement touching shared rendering-util/ I'd add a patch changeset (or, if you split per the point above, one per PR).

🟢 anyCommentRegex lookbehind

(?:^|(?<=\S)) uses a lookbehind — fine on current Node/browsers, just confirm it clears the project's minimum-supported-runtime matrix.

Security

Ran a dedicated XSS pass over the new string-built HTML path. No XSS or injection issues identified. htmlLabelMarkup routes all attribute values through escapeAttr (sufficient for the double-quoted attrs used) and all label content through sanitizeText/renderKatexSanitized — the same sanitization the inline addHtmlSpan path already applied (the "double sanitize" is pre-existing and harmless). The canBatchSvgHtml() probe verifies namespace resolution and bails to per-element construction otherwise, so insertAdjacentHTML adds no mXSS surface, and the final DOMPurify.sanitize in mermaidAPI is untouched. The sanitizeText refactor preserves the exact DOMPurify config branches.


Really strong work — the hard part (making batched DOM caching correct) is well handled, and the perf payoff is real. The main ask before this leaves draft is nailing down that "no visual change" claim against those 8 repros; splitting out the parser fix would make both halves easier to land. Happy to re-review. 🙏

Severity tally: 🔴 0 · 🟡 3 · 💡 1 · 🟢 1 · 🎉 4 — draft PR, posted as COMMENT. Reviewed the full diff at head 501fb49; XSS pass clean; did not run the suites locally (working tree is on an unrelated branch).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants