fix #7091: Ensure only one word is allowed between 'state' and '{' #7570

Merged
knsv merged 9 commits into
mermaid-js:develop from
PinguinsRule:bug/7091_fix_parsing_bug
Jun 4, 2026
Conversation

@PinguinsRule PinguinsRule commented Apr 3, 2026

Contributor

The parser allowed multiple words between the 'state' keyword and the '{' character, leading to incorrect parsing of state diagrams.

📑 Summary

Added a new rule to the lexer to enforce a single-word constraint between 'state' and '{'. This ensures invalid syntax is rejected with an appropriate error message.

Resolves #7091

📏 Design Decisions

This new rule checks if at least two words are present before a '{'. If so, it throws an error.
Created a new test to verify the fix works.

📋 Tasks

  • 📖 have read the contribution guidelines
  • 💻 have added necessary unit/e2e tests.
  • 📓 have added documentation. Make sure MERMAID_RELEASE_VERSION is used for all new features.
  • 🦋 If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.

changeset-bot Bot commented Apr 3, 2026

🦋 Changeset detected

Latest commit: 8810f62

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
mermaid Patch

netlify Bot commented Apr 3, 2026

Deploy Preview for mermaid-js ready!

Name Link
🔨 Latest commit 8810f62
🔍 Latest deploy log https://app.netlify.com/projects/mermaid-js/deploys/6a2131a584b0cb00085ced83
😎 Deploy Preview https://deploy-preview-7570--mermaid-js.netlify.app

@github-actions github-actions Bot added the "Type: Bug / Error" label (Something isn't working or is incorrect) Apr 3, 2026

pkg-pr-new Bot commented Apr 3, 2026

@mermaid-js/examples

npm i https://pkg.pr.new/@mermaid-js/examples@7570

mermaid

npm i https://pkg.pr.new/mermaid@7570

@mermaid-js/layout-elk

npm i https://pkg.pr.new/@mermaid-js/layout-elk@7570

@mermaid-js/layout-tidy-tree

npm i https://pkg.pr.new/@mermaid-js/layout-tidy-tree@7570

@mermaid-js/mermaid-zenuml

npm i https://pkg.pr.new/@mermaid-js/mermaid-zenuml@7570

@mermaid-js/parser

npm i https://pkg.pr.new/@mermaid-js/parser@7570

@mermaid-js/tiny

npm i https://pkg.pr.new/@mermaid-js/tiny@7570

commit: 8810f62

codecov Bot commented Apr 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 2.83%. Comparing base (9030126) to head (8810f62).

@@           Coverage Diff           @@
##           develop   #7570   +/-   ##
=======================================
  Coverage     2.83%   2.83%           
=======================================
  Files          667     667           
  Lines        72262   72262           
  Branches       989     989           
=======================================
  Hits          2047    2047           
  Misses       70215   70215           
Flag Coverage Δ
unit 2.83% <ø> (ø)

argos-ci Bot commented Apr 3, 2026

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build | Status | Details | Updated (UTC)
default (Inspect) | ✅ No changes detected | - | Jun 4, 2026, 8:11 AM

@knsv knsv left a comment

Collaborator

[sisyphus-bot]

Thanks for tackling this, @PinguinsRule — it's a real bug that's been confirmed and approved, and it's great to see it addressed with a test. Let's get this across the finish line!

File Triage

Tier | Count | Files
Tier 2 (diff + context) | 2 | stateDiagram.jison, state-parser.spec.js

What's working well

🎉 [praise] Good issue identification — the fix correctly targets the root cause: the JISON lexer matches multiple COMPOSIT_STATE tokens when several words appear between state and {, and the old grammar silently used only the last one.

🎉 [praise] The test verifies the error path clearly and the error message is helpful for users.

🎉 [praise] The new explicit action block on the COMPOSIT_STATE NL rule ($$={ stmt: 'state', id: $1, ... }) is actually an improvement over the old bare | COMPOSIT_STATE rule, which had no action and defaulted to $$=$1 (a plain string). The extract() method in stateDb.ts:233 switches on item.stmt — a plain string wouldn't match any case, so standalone state myState declarations were silently dropped. Your change fixes this too.
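
To make the point about dropped declarations concrete, here is a minimal sketch of a dispatch on `item.stmt`. The function `collectStates` and the item shapes are hypothetical stand-ins, not the actual `extract()` in stateDb.ts; only the idea of switching on `stmt` comes from the comment above.

```javascript
// Hypothetical, simplified stand-in for a dispatch that switches on item.stmt.
function collectStates(doc) {
  const ids = [];
  for (const item of doc) {
    switch (item.stmt) {
      case 'state':
        ids.push(item.id);
        break;
      default:
        // A plain string has no `stmt` property, so it ends up here
        // and is silently ignored.
        break;
    }
  }
  return ids;
}

// Old rule result (bare string) versus new rule result (object with stmt).
console.log(collectStates(['myState']));                         // []
console.log(collectStates([{ stmt: 'state', id: 'myState' }]));  // [ 'myState' ]
```

Under these assumptions, the bare string produced by the old rule never reaches the `'state'` branch, which matches the silent drop described above.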

Things to address

🟡 [important] — Error rule only catches exactly 2 words before {

stateDiagram.jison: The new grammar rule COMPOSIT_STATE COMPOSIT_STATE STRUCT_START document STRUCT_STOP catches exactly two words (e.g., state foo bar { ... }). But the original issue example has eight words between state and {: state only the last word is taken into account { X }.

With 3+ words, the parser will produce a generic JISON parse error instead of your friendly "State name must be a single word." message. The fix still rejects the invalid input (good!), but the error message is worse for the exact case reported in the issue.

A more robust approach would be to catch this in the lexer rather than the grammar. For example, in the <STATE> lexer state, you could detect multiple non-whitespace tokens before { and throw there — the lexer sees the full remaining input and can regex-match the multi-word pattern. Alternatively, you could accumulate words in the grammar using a recursive rule. Worth considering which approach gives the best user experience.
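
The "accumulate words" alternative can be sketched outside the grammar as follows. This is only an illustration: the `{ type, text }` token objects and the function `readStateName` are assumptions made for the example, and the real JISON token stream may look different. The token names and the error message are the ones quoted in this review.

```javascript
// Sketch only: tokens shaped as { type, text } are an assumption for illustration.
function readStateName(tokens) {
  const words = [];
  let i = 0;
  // Accumulate every consecutive name token, however many there are.
  while (i < tokens.length && tokens[i].type === 'COMPOSIT_STATE') {
    words.push(tokens[i].text);
    i += 1;
  }
  const opensBlock = i < tokens.length && tokens[i].type === 'STRUCT_START';
  if (opensBlock && words.length > 1) {
    // Same message for 2, 3, or 8 words, unlike a fixed two-token rule.
    throw new Error('State name must be a single word.');
  }
  return words[0];
}

const tok = (type, text) => ({ type, text });
console.log(readStateName([tok('COMPOSIT_STATE', 'foo'), tok('STRUCT_START', '{')])); // foo
```

Because the loop consumes any number of name tokens before checking for `STRUCT_START`, the friendly error does not depend on the word count.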

🟡 [important] — Missing changeset

The PR checklist shows the changeset box is unchecked. This is a user-facing bug fix, so it needs a changeset:

pnpm changeset
# Select packages/mermaid, patch bump, prefix with fix:

🟡 [important] — <STATE>\n now returns NL — potential side effects need test coverage

stateDiagram.jison:117 (on develop): Previously <STATE>\n just popped the state without returning a token. Now it returns NL. This changes the token stream for every state <name> declaration that ends with a newline (not just the multi-word case). Combined with changing | COMPOSIT_STATE → | COMPOSIT_STATE NL, this alters parsing of all standalone state declarations.

I believe this is actually safe (and the action block improvement noted above makes it more correct), but it needs regression tests to prove it. Please add tests for:

  • state myState on its own line (standalone declaration, no {)
  • state myState { X } still works (composite state, single word — the happy path)
  • state "Name" as id { X } still works (quoted name composite)

🟢 [nit] — Consider testing the 3+ word case too

It would be valuable to have a test showing what happens with state a b c { X } — even if the error message differs, confirming it rejects is useful for documenting the behavior.

🟢 [nit] — The <STATE>\s+"state"\s+ lexer rule

stateDiagram.jison — This new rule handles the edge case of state appearing as a keyword inside the STATE lexer state (e.g., state foo state bar {). It re-enters STATE and returns NL, effectively splitting this into two separate statements. This is reasonable, but it would be good to add a brief comment explaining why this rule exists, since the interaction between lexer states is non-obvious.

Security

No XSS or injection issues identified. The changes are confined to the JISON parser grammar — no DOM sinks, no SVG output changes, no sanitization modifications. Error messages use a static string, not user input.

Self-Check

  • At least one 🎉 [praise] item exists
  • No duplicate comments
  • Severity tally: 0 🔴 blocking / 3 🟡 important / 2 🟢 nit / 0 💡 suggestion / 3 🎉 praise
  • Verdict matches criteria: REQUEST_CHANGES (3 🟡)
  • Not a draft PR — REQUEST_CHANGES is appropriate
  • No inline comments used
  • Tone check: collaborative and constructive ✓

knsv commented Apr 7, 2026

Collaborator

Thanks @PinguinsRule! Looking forward to getting this merged!

@PinguinsRule

Contributor Author

I managed to fix the bug by moving the solution from the grammar to the lexer, as advised. I apologize for the failing checks, however. I believe they may be caused by the changeset: this is my first time creating one, and I may have done something wrong. I will now update the PR message to a more fitting one.

@knsv-bot knsv-bot left a comment

Collaborator

[sisyphus-bot]

Thanks for tackling this — the lexer-level approach is the right instinct for catching the bad syntax early with a clear error message. 🎉 [praise] The single-rule addition is minimal and the regex correctly handles the cases I tested (single-word IDs, hyphenated names, and quoted descriptions with as all still work).

That said, there's one critical issue that needs to be sorted before this can land:

🔴 [blocking] Branch is severely out of date — would revert recent fixes to this file

This PR was branched before three other state-diagram fixes landed on develop (#7508, #7520, plus a couple of follow-ups: end-note detection, classDef in composite states, single % parsing). Because the branch wasn't rebased, the diff on develop now includes accidental reverts of all of them.

Concretely, when applied to current develop, this PR reverts:

  • processId() helper and its call sites in stateDiagram.jison (handles inline %% comments split from IDs)
  • <INITIAL,ID,STATE,struct,LINE>\%\%(?!\{)[^\n]* → degraded to \%%[^\n]* (no longer skips %% comments inside STATE/struct)
  • <NOTE_TEXT>[\s\S]*?\n\s*"end note" → <NOTE_TEXT>[\s\S]*?"end note" (broken end-note detection inside text)
  • <INITIAL,struct>":::" → ":::" (state restriction lost)
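
The effect of two of these pattern changes can be reproduced with plain JavaScript regexes. This is a sketch under assumptions: the `^` anchors stand in for the lexer matching at its current position, the quoted JISON string "end note" is written as literal text, and the sample inputs are invented for illustration. Lexer start conditions such as `<STATE>` are not modelled.

```javascript
// Comment rule: with and without the negative lookahead.
const COMMENT_WITH_LOOKAHEAD = /^%%(?!\{)[^\n]*/; // leaves %%{ ... }%% directives alone
const COMMENT_DEGRADED = /^%%[^\n]*/;             // also swallows directives

// End-note rule: terminator on its own line versus anywhere in the text.
const END_NOTE_OWN_LINE = /^[\s\S]*?\n\s*end note/;
const END_NOTE_DEGRADED = /^[\s\S]*?end note/;

// Invented sample inputs.
const directive = '%%{init: {"theme": "dark"}}%%';
const comment = '%% just a comment';
const noteBody = 'see the end note marker\nend note';

console.log(COMMENT_WITH_LOOKAHEAD.test(comment));   // true
console.log(COMMENT_WITH_LOOKAHEAD.test(directive)); // false
console.log(COMMENT_DEGRADED.test(directive));       // true

console.log(JSON.stringify(END_NOTE_OWN_LINE.exec(noteBody)[0])); // "see the end note marker\nend note"
console.log(JSON.stringify(END_NOTE_DEGRADED.exec(noteBody)[0])); // "see the end note"
```

In this sketch the degraded end-note pattern stops at the first "end note" inside the note text, which is the kind of truncation the stricter pattern avoids.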

The result is 3 failing tests in state-style.spec.js:

  • ::: syntax inside composite states > can be applied to a state inside a composite state
  • ::: syntax inside composite states > can be applied to a [*] state inside a composite state
  • comments parsing > should parse single % as normal syntax, not a comment

Good news: I tested applying just your new <STATE>\w+\s+\w+.*?\{ rule on top of current develop (no other changes) and all 134 state-parser/style tests pass, plus your new test passes. So a clean rebase should resolve everything. Could you git rebase develop and force-push? Happy to re-review immediately.
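
For readers who want to check the pattern by hand, the following sketch runs the quoted regex against a few inputs. The helper `hasMultiWordName` is hypothetical, and the `^` anchor is an assumption standing in for the lexer matching right after `state` and its trailing whitespace; the actual rule lives in the `<STATE>` lexer state of stateDiagram.jison.

```javascript
// Pattern quoted from the review; the ^ anchor is an assumption (see lead-in).
const MULTI_WORD_BEFORE_BRACE = /^\w+\s+\w+.*?\{/;

// Hypothetical helper: `rest` is the input that follows `state` and its whitespace.
function hasMultiWordName(rest) {
  return MULTI_WORD_BEFORE_BRACE.test(rest);
}

console.log(hasMultiWordName('foo { X }'));           // false: single word
console.log(hasMultiWordName('foo bar { X }'));       // true: two words
console.log(hasMultiWordName('only the last word is taken into account { X }')); // true
console.log(hasMultiWordName('my-state { X }'));      // false: hyphenated name
console.log(hasMultiWordName('"Name" as id { X }'));  // false: quoted description
```

The hyphenated and quoted cases fall through because the input at that position does not start with `\w+\s+\w+`, so other lexer rules are free to handle them.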

🟡 [important] Test coverage could be tighter

The new test is a great start. Two additions would harden it:

  • A "still works" case for valid single-word composite states (e.g., state foo { X }) — guards against accidental future regression of the new rule.
  • The 3+ word case mentioned in your commit "Added test case for 3+ word case": actually exercise state foo bar baz { X } as its own focused assertion (the existing test mixes it in but doesn't isolate it).

packages/mermaid/src/diagrams/state/parser/state-parser.spec.js:17-25

🟡 [important] Changeset description has a typo

.changeset/tired-rockets-rule.md: "Fix invalid syntax between state and '}'" — should be '{' (opening brace, not closing). Worth fixing for the release notes.

🟢 [nit] Trailing newline removed from stateDiagram.jison

The diff drops the final newline (\ No newline at end of file). The repo's other JISON files end with a newline; restoring it keeps things consistent and avoids a small Prettier/lint annoyance.


Once rebased, this is a straightforward improvement. Thanks for sticking with it!

…and '{'

The parser allowed multiple words between the 'state' keyword and the '{' character, leading to incorrect parsing of state diagrams.

Added a new rule to the parser to enforce a single-word constraint between 'state' and '{'. This ensures invalid syntax is rejected with an appropriate error message.
Added test case for 3+ word case
@PinguinsRule PinguinsRule force-pushed the bug/7091_fix_parsing_bug branch from 2ef92f1 to c2305df on April 24, 2026 13:58
@PinguinsRule

Contributor Author

Seems like the PR is failing due to a quota limit on screenshots. I have rebased as advised, fixed the typo in the changeset, added the "still-works" case and split the already existing test into two separate tests, also as advised.

@PinguinsRule PinguinsRule requested a review from knsv-bot April 30, 2026 15:26
@ashishjain0512

Collaborator

@PinguinsRule The Argos quota has been renewed; pulling the latest from develop and re-running the tests.

@PinguinsRule PinguinsRule requested a review from knsv May 24, 2026 12:21
@PinguinsRule

Contributor Author

@knsv-bot Sorry to bother you, I would like to know if it's possible to review the changes.

@pbrolin47

Collaborator

Hi @PinguinsRule,
Thanks for addressing the comments by @knsv-bot and @knsv.

[sisyphus-bot]

Addressing knsv-bot's four items

✅ 🔴 Branch out of date / reverted fixes — resolved

The rebase was clean. All four fixes that were accidentally reverted are confirmed present in the current file:

  • processId() function: line 49 ✓
  • <INITIAL,ID,STATE,struct,LINE>\%\%(?!\{)[^\n]*: line 84 ✓
  • <NOTE_TEXT>[\s\S]*?\n\s*"end note": line 149 ✓
  • <INITIAL,struct>":::": line 163 ✓

The current diff is clean and minimal: only the new lexer rule + 3 tests + changeset. No unintended changes.

✅ 🟡 Test coverage — resolved

The restructured test suite is well-organized:

describe('valid syntax')
  it('should only accept 1 word') ← new "still works" guard ✓

describe('invalid syntax')
  it('should throw error with 2 words') ← exact issue example ✓
  it('should also throw with more than 2 words') ← 3+ word case ✓

The states.get('valid') assertion in the "still works" test proves the valid path still populates the DB correctly — not just that it doesn't throw.

✅ 🟡 Changeset typo — resolved

Fix invalid syntax between state and '{' — { is correct now. ✓


Remaining minor item

🟢 [nit] Trailing newline still removed

stateDiagram.jison:352: The diff still shows \ No newline at end of file on the final %% line. The original develop file ends with a newline; the PR removes it. This was called out as a nit in the last review and wasn't addressed. Not a blocker, but worth restoring for consistency, since most JISON editors expect a trailing newline. (echo "" >> stateDiagram.jison or add a newline at the end of the file in your editor.)


Security

The new lexer rule throws an Error whose fixed message embeds yytext.trim() (e.g., "foo bar {", the matched text). This propagates as a JavaScript exception through the render pipeline, never reaching any DOM sink or SVG attribute. No XSS concern.


@pbrolin47 pbrolin47 added this pull request to the merge queue May 28, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks May 28, 2026
@PinguinsRule

Contributor Author

Seems like adding the PR to the merge queue failed.

@knsv knsv enabled auto-merge June 4, 2026 08:05
@knsv knsv added this pull request to the merge queue Jun 4, 2026
Merged via the queue into mermaid-js:develop with commit 0c00846 Jun 4, 2026
21 checks passed
mermaid-bot Bot commented Jun 4, 2026

@PinguinsRule, Thank you for the contribution!
You are now eligible for a year of Premium account on MermaidChart.
Sign up with your GitHub account to activate.

Labels

Type: Bug / Error Something isn't working or is incorrect

Development

Successfully merging this pull request may close these issues.

State diagram: syntax errors in the state keyword are not checked properly.

5 participants