perf(rendering): batch node-label measurement + construction#7872
perf(rendering): batch node-label measurement + construction#7872knsv-bot wants to merge 16 commits into
Conversation
The per-node measure loop builds a label's DOM and immediately reads its size (getBoundingClientRect / getBBox), interleaved across all nodes — so each read forces a synchronous reflow over the growing tree. On a large dagre flowchart this was ~2,216 Layout events / 173 ms (45% of the measure phase). Split label handling into build -> measure -> finalize, and add prebuildNodeLabels: build every label first (DOM writes only; the HTML path now defers its internal getBoundingClientRect via createText's `deferMeasure`), then read all sizes back-to-back so only the first read forces a reflow, then write the positions. labelHelper returns these prebuilt labels; anything not prebuilt (groups, linked nodes, shapes that build their own label) falls back to the inline build+measure path unchanged, so this is purely an optimization. Wired into createGraphWithElements (the common/ELK measure path). The dagre recursiveRender path is a separate follow-up. NOTE: jsdom returns 0 for getBoundingClientRect/getBBox, so unit tests cover structure only — the actual reflow reduction and measured-size correctness must be validated in real Chrome (cypress visual snapshots + the Dev Explorer profiler). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apply the same build-all-then-measure-all batching to the dagre layout path, which measures nodes inside recursiveRender (its measureLayout hook is a no-op). Before each recursion level's node loop, prebuild + batch-measure the level's leaf-node labels; the loop's insertNode calls then consume the prebuilt labels, collapsing the per-node label reflow to roughly one per recursion level. clearPrebuiltLabels now takes the specific nodes a prebuild produced, so nested concurrent subgraph renders clear only their own pending entries. Same validation caveat as the common-path change: jsdom can't measure, so the reflow reduction and size correctness need real-Chrome (cypress + profiler) verification. This is the path the original trace profiled. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The warm benchmark shows getBBox/getBoundingClientRect (the reflow reads) are only ~4% of the measure phase — the rest is unattributed DOM construction. Add two buckets to find it: - sanitize: sanitizeMore + DOMPurify cost (per sanitizeText call). - labelBuild: the full label DOM construction (createText) in the prebuild loop. Both guarded by injected.profiling. sanitize uses tickSync (sync, so accurate under concurrency); labelBuild uses the async tick in the sequential prebuild loop, where wall-clocks don't overlap. Diagnostic instrumentation to decide where the warm measure cost actually is (DOMPurify vs DOM building vs shapes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a `markdown` bucket around markdownToHTML/nonMarkdownToHTML so labelBuild can be split into markdown parsing vs element creation: element-creation ≈ labelBuild − markdown − sanitize. This decides whether a plain-text markdown fast-path is worthwhile or the cost is in DOM element construction. tickSync (sync) so it's accurate under the concurrent label builds. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add labelPrep (decodeEntities + replaceIconSubstring regex) and addHtmlSpan (foreignObject/div/span DOM construction) buckets so the ~90%-of-labelBuild "element-creation" cost can be attributed to regex prep vs actual DOM building — deciding between a trivial plain-label fast-path and a DOM-construction rewrite. Async tick, accurate in the sequential prebuild loop. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ync decode The labelPrep/addHtmlSpan buckets used the async `tick`, which over-counts when createText runs concurrently (recursion/clusters/fallback) — they reported physically impossible values (150k+ ms). Drop them. decodeEntities is sync, so measure it with tickSync (accurate under concurrency); replaceIconSubstring is a no-op scan for plain labels (negligible). Confirms regex prep is not the cost — element-creation is DOM construction in addHtmlSpan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…HTML Label element-creation (foreignObject/div/span per node) is ~50% of the warm measure phase on large flowcharts, and a console A/B showed the HTML parser builds these sub-trees ~2x faster than the DOM API. When every prebuilt label is HTML, assemble the markup for all of them into one string and parse it with a single insertAdjacentHTML instead of per-element createElement/setAttribute. - createText: extract prepareHtmlLabelNode (shared text pipeline) and add htmlLabelMarkup (string form of addHtmlSpan, identical output) + escapeAttr + registerDeferredHtmlLabel. - util.ts: prebuildNodeLabels routes through buildNodeLabelsBatched when a one-time probe confirms the environment parses batched SVG/xhtml markup with correct namespaces; otherwise (e.g. jsdom) falls back to per-element construction. A per-render structure check undoes and falls back if the parse is ever unexpected, so correctness never depends on the parser. Validation: jsdom unit tests green via the fallback/probe; the batched path's visual correctness must be confirmed by the cypress snapshot suite, and the win by re-benchmarking huge1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🦋 Changeset detectedLatest commit: 501fb49 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #7872 +/- ##
==========================================
+ Coverage 2.91% 2.97% +0.05%
==========================================
Files 656 656
Lines 70441 70715 +274
Branches 979 982 +3
==========================================
+ Hits 2055 2102 +47
- Misses 68386 68613 +227
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
The latest updates on your projects. Learn more about Argos notifications ↗︎
|
# Conflicts: # packages/mermaid/src/rendering-util/rendering-elements/shapes/util.ts
… color)
prebuildNodeLabels builds and measures leaf-node labels up front, but icon/image
shape handlers set node.labelStyle (the classDef-derived label color), override
node.width, and pass a shape-specific label class ('icon-shape'/'image-shape') to
labelHelper *after* the prebuild pass. A prebuilt icon/image label was therefore
built and measured with the wrong class, stale style and stale width — dropping
the label's padding and its classDef color (e.g. white text rendered black).
Exclude node.icon/node.img from the prebuild filter so they fall back to the
inline build path inside their handler. Regular shapes keep the batched prebuild:
they pass getNodeClasses(node) and don't override width, matching what prebuild
uses, so they're unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- updateNodeBounds: guard against degenerate (zero/negative) knownBounds so it
falls back to a real getBBox() measurement instead of committing the node to
0×0.
- Downgrade the stray `log.warn('Returning from recursive render XAX', …)` debug
trace to `log.debug` (the expensive 'Graph after XAX' graph-serialization log
was already commented out).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two whitespace-related regex pathologies made a deeply-indented flowchart (e.g. the perf fixture huge3, with 1.6k-space lines) cost ~1.4s to parse — both O(whitespace²): - anyCommentRegex (diagram-api/regexes.ts), used by detectType + preprocess: the leading greedy \s* + global flag re-scanned each indent run from every position. Guard the start with (?:^|(?<=\S)) so \s* is attempted once per run — O(n), and byte-identical output (new regexes.spec.ts asserts equivalence vs the legacy pattern over a corpus + an O(n) regression guard). - flow.jison SPACE rule matched a single whitespace char (\s), emitting one SPACE token per space — hundreds of thousands on deep indentation. Match a run ([^\S\n\r]+) instead; the grammar only uses SPACE as a separator, so parsing is unchanged (full flowchart suite green, 991 tests). huge3 Diagram.fromText: ~1410ms -> 27ms.
✅ Deploy Preview for mermaid-js ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
@mermaid-js/examples
mermaid
@mermaid-js/layout-elk
@mermaid-js/layout-tidy-tree
@mermaid-js/mermaid-zenuml
@mermaid-js/parser
@mermaid-js/tiny
commit: |
…rebuild prebuildNodeLabels builds and measures each leaf node's label up front, then labelHelper hands that prebuilt label back to the shape. But several shapes mutate the node *after* the prebuild pass runs: - hourglass blanks node.label -> the prebuilt label still showed the text - erBox rewrites node.label to the entity alias and bumps node.width -> the prebuilt label showed the id and was sized/positioned wrong (title missing) - shapes apply a classDef-derived node.labelStyle via styles2String -> the prebuilt label was built with the un-styled labelStyle, dropping label colour labelHelper now stamps each prebuilt label with a cheap signature of the node state it was built from (label, width, markdown, labelStyle, html-labels, centerLabel, icon/img) and only reuses it when the node still matches. When a shape has customised the node the signature differs, so we drop the stale prebuilt DOM and rebuild inline -- exactly the pre-perf path. Unchanged plain nodes (the bulk of large diagrams) still take the fast prebuilt path. Also run configureLabelImages on the batched markup build path: like the per-element path it must wait for <img> labels to load and apply their sizing styles before measuring, otherwise image nodes are mis-sized and the images don't show. Repro fixtures added under cypress/platform/dev-diagrams/render-regressions/.
A prebuilt leaf-node label is inserted into the nodes group during the prebuild pass, before any shape runs. erBox draws a hand-drawn background rect as a sibling g first and then builds the entity, relying on the entity's g being appended after the background so it paints on top. When labelHelper reused a prebuilt label in place, it stayed at its earlier position, so the background painted over the entity and hand-drawn ER entities lost their background fill. labelHelper now raises the reused g to the end of its parent, matching the inline path (which appends the label at shape-render time). Short-named ER entities were already correct because their width bump flips the label signature and forces an inline rebuild; this covers the rest. Repro fixture: er-handdrawn-entity-background.mmd.
knsv-bot
left a comment
There was a problem hiding this comment.
[sisyphus-bot]
Thanks for this — the build-all-then-measure-all batching is a genuinely smart attack on the per-node forced-reflow cost, and the engineering care shows. Since it's a draft, this is early-direction feedback, not a merge gate; nothing below is blocking. I read the full diff at head 501fb49, traced the new string-HTML path, ran an XSS pass, and read the regression-fixtures README. (Heads-up: the pre-fetch flagged this as a "new diagram" — false positive; it's a perf refactor, so the new-diagram checklist doesn't apply.)
What's working well
- 🎉 The batching architecture is the right shape. Splitting
labelHelperintobuildNodeLabel(writes) →measureNodeLabel(the single forced read) →finalizeNodeLabel(writes), so a batch runs every read back-to-back, is exactly how you collapse the reflow storm. The validated numbers (−17% / −45% measure) back it up. - 🎉 The correctness guards are thoughtful, not hand-wavy. The
labelSignaturefingerprint to detect post-prebuild node mutation (hourglass blanking the label, erBox rewriting width, classDef-derivedlabelStyle) and fall back to the exact inline path; thecanBatchSvgHtml()probe with per-element fallback so correctness never depends on the parser;clearPrebuiltLabels(nodes)scoped to avoid clobbering concurrent nested subgraph renders; theraise()to preserve paint order; and the degenerate-bounds guard inupdateNodeBounds— these are the subtle pitfalls of caching+reordering DOM, and you've handled each. The commit history (icon/image padding, post-prebuild rebuild, paint order) shows they were found and fixed deliberately. - 🎉
regexes.spec.tsis a model equivalence test — keeping the legacy pattern as a byte-for-byte oracle over a whitespace-edge corpus plus a backtracking-perf assertion is exactly how to make a regex micro-opt reviewable. - 🎉 The render-regressions README is honest and useful — mapping each repro to the specific Cypress/Argos test that catches it is great traceability.
Things to consider (for when you take it out of draft)
🟡 "No visual change" vs. a folder of 8 known regressions — please confirm each is resolved on head
This is the whole correctness bar for the PR, so it's worth being explicit. The README documents 8 rendering regressions this perf work introduced (note text wrapping differently, hourglass rendering a label it shouldn't, ER entity titles vanishing, <img> icons not showing, box-label alignment flips, SVG markdown wrapping at different words). The commits fix several — but the README frames them as "compare against develop to confirm the regression," which reads as open. Before un-drafting, it'd be great to confirm every listed Argos test is green on head (especially imageshape-markdown-svg-wrap and class-notes-simple-alignment, which are wrap/alignment shifts the batched-markup path can reintroduce), and reword the README so it's clear these are fixed repros, not outstanding ones.
🟡 Scope: two unrelated perf concerns are bundled
The headline (label batching) and the flowchart parser O(n²) fix (flow.jison SPACE token + anyCommentRegex, with the flowchart-deep-indent-parse-perf changeset and regexes.spec.ts) are independent — different subsystem, different changeset, different tests. The parser fix is cleanly separable and would be much easier to land and bisect on its own. Worth splitting it into its own PR (it also looks ready/lower-risk than the batching, so it could land first).
🟡 The jison SPACE change needs the full flowchart parser suite green
\s → [^\S\n\r]+ (one token per whitespace run instead of per char) is a lexer grammar change. The "grammar only uses SPACE as a separator, so parsing is unchanged" claim is plausible, but jison changes are regression-prone and don't hot-reload (dev server restart needed). Please confirm the flowchart .spec suite passes. One edge to sanity-check: a lone \r not followed by \n used to match \s as SPACE and now matches nothing in that rule — almost certainly irrelevant, but worth a thought.
💡 Changeset coverage
The only changeset is for the bundled parser fix. The headline label-batching change has none — for a user-affecting perf improvement touching shared rendering-util/ I'd add a patch changeset (or, if you split per the point above, one per PR).
🟢 anyCommentRegex lookbehind
(?:^|(?<=\S)) uses a lookbehind — fine on current Node/browsers, just confirm it clears the project's minimum-supported-runtime matrix.
Security
Ran a dedicated XSS pass over the new string-built HTML path. No XSS or injection issues identified. htmlLabelMarkup routes all attribute values through escapeAttr (sufficient for the double-quoted attrs used) and all label content through sanitizeText/renderKatexSanitized — the same sanitization the inline addHtmlSpan path already applied (the "double sanitize" is pre-existing and harmless). The canBatchSvgHtml() probe verifies namespace resolution and bails to per-element construction otherwise, so insertAdjacentHTML adds no mXSS surface, and the final DOMPurify.sanitize in mermaidAPI is untouched. The sanitizeText refactor preserves the exact DOMPurify config branches.
Really strong work — the hard part (making batched DOM caching correct) is well handled, and the perf payoff is real. The main ask before this leaves draft is nailing down that "no visual change" claim against those 8 repros; splitting out the parser fix would make both halves easier to land. Happy to re-review. 🙏
Severity tally: 🔴 0 · 🟡 3 · 💡 1 · 🟢 1 · 🎉 4 — draft PR, posted as COMMENT. Reviewed the full diff at head 501fb49; XSS pass clean; did not run the suites locally (working tree is on an unrelated branch).
Summary
The steady-state (warm) win: batch the two most expensive per-node operations in
the measure phase — size measurement and label DOM construction — plus the
profiling that pinpointed them. No visual change (cypress snapshots clean).
Stacked on the surgical-reflow PR.
Validated results (dagre folder benchmark, trimmed mean)
measuremeasureFlat / label-heavy flowcharts (the common case) gain −39 to −49% of measure.
Subgraph-heavy diagrams gain less — they're bound by nested dagre layout, not DOM.
Changes
prebuildNodeLabelsinshapes/util.ts, wired into thecommon
createGraphWithElementsand dagremeasureDagreGraph): build every labelfirst (deferring its
getBoundingClientRect), then read all sizes back-to-back soonly the first forces a reflow.
createText.tshtmlLabelMarkup+util.tsbuildNodeLabelsBatched): assemble the markup for all HTML labels into one stringand parse it with a single
insertAdjacentHTML— the HTML parser builds theforeignObject/div/span sub-trees ~2× faster than per-element
createElement(measured:
labelBuild−33% set, −49% on flat diagrams).ever misbehaves (e.g. jsdom), it falls back to per-element construction, so
correctness never depends on the parser.
sanitize/markdown/decode/labelBuild)that decompose the phase — how each win above was found and proven. Guarded by
injected.profiling, tree-shaken from production.Note on the dagre→common-pipeline migration
This rebases cleanly onto that migration: the dagre measure span is dropped
(redundant — the common renderer's
measurespan now wrapsmeasureDagreLayout),while the dagre batching stays and runs inside that measure phase via the new
measureDagreGraph.Testing
Unit tests pass (the probe + fallback keep jsdom green); cypress visual snapshots
clean; zero new type errors.
🤖 Generated with Claude Code