Skip to content

feat: Rebuild parser from first principles#3

Merged
bmazzarol merged 10 commits into
mainfrom
feat/new-parser
Aug 2, 2025
Merged

feat: Rebuild parser from first principles#3
bmazzarol merged 10 commits into
mainfrom
feat/new-parser

Conversation

@bmazzarol

@bmazzarol bmazzarol commented Aug 2, 2025

Copy link
Copy Markdown
Owner

Rebuilds the parser to remove all external dependencies and move closer to liquid.

Summary by CodeRabbit

  • New Features

    • Introduced a new lexer and parser for template processing with {% ... %} control flow syntax and whitespace suppression.
    • Added extensive unit tests covering lexer tokenization and parser syntax validation.
    • Enhanced error reporting with detailed parse exceptions.
  • Bug Fixes

    • Improved whitespace handling and suppression in template rendering.
  • Refactor

    • Replaced previous parsing and tokenization system with a new architecture.
    • Updated internal data structures and namespaces for tokens and syntax nodes.
    • Changed accessibility of internal classes and consolidated parsing logic.
  • Documentation

    • Updated README and documentation to reflect new {% ... %} template syntax and usage examples.
  • Chores

    • Upgraded multiple package dependencies and reformatted project files for consistency.
    • Added new diagnostic rules for code style and test assertions.
  • Tests

    • Added comprehensive lexer and parser unit tests.
    • Removed legacy parser test suite in favor of updated tests.

@bmazzarol bmazzarol self-assigned this Aug 2, 2025
@bmazzarol bmazzarol added the enhancement New feature or request label Aug 2, 2025
@coderabbitai

coderabbitai Bot commented Aug 2, 2025

Copy link
Copy Markdown

Walkthrough

This change refactors the Cutout templating engine by replacing its lexer and parser with new implementations, introducing a token-based architecture. Template syntax delimiters are updated from {{ ... }} to {% ... %} for control flow statements. The renderer, source generator, and tests are revised to align with the new tokenization and parsing model. Documentation and configuration files are updated accordingly.

Changes

Cohort / File(s) Change Summary
Lexer & Token Infrastructure
Cutout/Lexer/Lexer.cs, Cutout/Lexer/Lexer.Context.cs, Cutout/Lexer/Lexer.ApplyWhitespaceSuppression.cs, Cutout/Lexer/Token.cs, Cutout/Extensions/TokenListExtensions.cs, Cutout/GlobalUsings.cs
Introduced a new token-based lexer, token structures, whitespace suppression logic, token list extensions, and global type aliases for tokens.
Parser & Syntax Tree
Cutout/Parser/Parser.cs, Cutout/Parser/Parser.Context.cs, Cutout/Parser/Parser.ParseBlock.cs, Cutout/Parser/Syntax.cs
Implemented a new parser with recursive descent logic, block parsing context, and updated internal syntax tree structures for tokens.
Removed Legacy Parser
Cutout/Parser/TemplateParser.cs, Cutout/Parser/TemplateParser.CallStatement.cs, Cutout/Parser/TemplateParser.EndBreakPredicate.cs, Cutout/Parser/TemplateParser.ForStatements.cs, Cutout/Parser/TemplateParser.IParseContext.cs, Cutout/Parser/TemplateParser.IfStatements.cs, Cutout/Parser/TemplateParser.KeywordStatement.cs, Cutout/Parser/TemplateParser.VarStatement.cs, Cutout/Parser/Identifiers.cs, Cutout/Extensions/TokenExtensions.cs
Deleted the old template parser, parsing helpers, identifier constants, and token extension methods.
Renderer Refactor
Cutout/Renderer/Renderer.cs, Cutout/Extensions/IndentedTextWriterExtensions.cs, Cutout/Extensions/SyntaxExtensions.cs
Updated the renderer to use token-based syntax, adjusted method signatures, improved whitespace and context handling, and changed namespaces for extension classes.
Source Generation & Attribute Handling
Cutout/TemplateSourceGenerator.TemplateMethodImpl.cs, Cutout/TemplateSourceGenerator.cs, Cutout/TemplateAttributeParts.cs
Refactored template source generation to use the new lexer/parser, updated method calls, improved whitespace handling, and added XML documentation.
Project & Build Configurations
.globalconfig, Common.Benchmark.props, Common.Docs.props, Common.Release.props, Common.Test.props, Cutout/Cutout.csproj, Cutout.Sample/Cutout.Sample.csproj, Cutout.Tests/Cutout.Tests.csproj, Directory.Packages.props, Parent.Directory.Packages.props
Reformatted XML, updated package references and versions, added diagnostic rules, and removed obsolete dependencies.
Documentation
README.md, Cutout/README.md
Updated documentation to reflect new template delimiters, whitespace controls, and syntax changes.
Test Suite Overhaul
Cutout.Tests/LexerTests.cs, Cutout.Tests/ParserTests.cs
Added comprehensive new lexer and parser tests for the token-based architecture.
Removed Legacy Tests
Cutout.Tests/TemplateParserTests.cs
Deleted the old parser test suite.
Test Syntax & Formatting Updates
Cutout.Tests/CallStatementTests.cs, Cutout.Tests/ForStatementTests.cs, Cutout.Tests/IfStatementTests.cs, Cutout.Tests/TemplateTests.cs, and multiple *.g.verified.cs/*.wsr.g.verified.cs files
Updated test templates and expected outputs to use new {% ... %} delimiters and adjusted whitespace formatting throughout.

Sequence Diagram(s)

sequenceDiagram
    participant TemplateSourceGenerator
    participant Lexer
    participant Parser
    participant Renderer

    TemplateSourceGenerator->>Lexer: Tokenize(template)
    Lexer-->>TemplateSourceGenerator: TokenList
    TemplateSourceGenerator->>Lexer: ApplyWhitespaceSuppression(TokenList)
    Lexer-->>TemplateSourceGenerator: TokenList (whitespace suppressed)
    TemplateSourceGenerator->>Parser: Parse(TokenList, template)
    Parser-->>TemplateSourceGenerator: SyntaxList
    TemplateSourceGenerator->>Renderer: WriteSyntax(template, Syntax, lastSyntax, includeWhitespaceReceiver)
    Renderer-->>TemplateSourceGenerator: Generated code
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

  • feat: use a parser and simplify the syntax #2: Introduced the initial Cutout template engine, including early parsing and testing infrastructure. Related as both PRs overhaul template parsing, but this PR fully replaces the lexer, parser, and rendering logic.

Poem

A rabbit hopped through fields of code,
Swapping curly braces for percent signs in its abode.
Tokens now march in tidy rows,
Lexer and parser—how swiftly it goes!
Old tests retired, new ones bloom,
In whitespace-trimmed, template-syntax room.
🐇✨ The engine’s reborn—let’s render and zoom!

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/new-parser

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 24

🔭 Outside diff range comments (3)
Cutout/Extensions/SyntaxExtensions.cs (1)

9-23: TryGetUsings misses using directives declared on the node itself

Because it walks node.Ancestors() (excluding the node), calling this on a NamespaceDeclarationSyntax or CompilationUnitSyntax will omit the usings declared directly on that node.

-return node.Ancestors(ascendOutOfTrivia: false)
+return node.AncestorsAndSelf(ascendOutOfTrivia: false)

Switching to AncestorsAndSelf captures the current scope as well and costs nothing.

Cutout/Renderer/Renderer.cs (1)

121-144: Template parameter correctly integrated

The implementation properly uses the template for string conversion. Note that ToString(template) is called twice (lines 129 and 135) - consider caching the result if performance becomes a concern.

Cutout/TemplateSourceGenerator.cs (1)

1-8: Removed using Cutout.Extensions; may break compilation

TryGetUsings() (Line 63) is an extension method that appears to live in Cutout.Extensions.
With the directive removed, the method will be out of scope unless a global using or namespace relocation was added elsewhere. This will surface as a compile-time error (e.g., CS1061).

+using Cutout.Extensions; // required for TryGetUsings extension method

Re-add the import or ensure a project-wide global using keeps the extension visible.

♻️ Duplicate comments (1)
Cutout.Tests/TemplateTests.Case4a#Test.TestWithModel.wsr.g.verified.cs (1)

22-24: Same optimisation opportunity as above for model.Text

See comment on Lines 16-18.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eba327e and 42d85f5.

📒 Files selected for processing (68)
  • .globalconfig (1 hunks)
  • Common.Benchmark.props (1 hunks)
  • Common.Docs.props (1 hunks)
  • Common.Release.props (1 hunks)
  • Common.Test.props (1 hunks)
  • Cutout.Sample/Cutout.Sample.csproj (1 hunks)
  • Cutout.Tests/CallStatementTests.Case1a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/CallStatementTests.Case1a#Test.Test.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/CallStatementTests.cs (3 hunks)
  • Cutout.Tests/Cutout.Tests.csproj (1 hunks)
  • Cutout.Tests/ForStatementTests.Case1a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/ForStatementTests.Case1a#Test.Test.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/ForStatementTests.Case2a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/ForStatementTests.Case2a#Test.Test.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/ForStatementTests.Case3a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/ForStatementTests.Case3a#Test.Test.wsr.g.verified.cs (2 hunks)
  • Cutout.Tests/ForStatementTests.cs (5 hunks)
  • Cutout.Tests/IfStatementTests.Case1a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/IfStatementTests.Case1a#Test.Test.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/IfStatementTests.Case2a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/IfStatementTests.Case2a#Test.Test.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/IfStatementTests.Case3a#Test.Test.g.verified.cs (1 hunks)
  • Cutout.Tests/IfStatementTests.Case3a#Test.Test.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/IfStatementTests.cs (4 hunks)
  • Cutout.Tests/LexerTests.cs (1 hunks)
  • Cutout.Tests/ParserTests.cs (1 hunks)
  • Cutout.Tests/TemplateParserTests.cs (0 hunks)
  • Cutout.Tests/TemplateTests.Case2a#Test.TestWithParameter.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case2a#Test.TestWithParameter.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case3a#Test.TestWithTwoParameters.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case3a#Test.TestWithTwoParameters.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case4a#Test.TestWithModel.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case4a#Test.TestWithModel.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case5a#Test.TestWithConstantTemplate.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.Case5a#Test.TestWithConstantTemplate.wsr.g.verified.cs (1 hunks)
  • Cutout.Tests/TemplateTests.cs (2 hunks)
  • Cutout/Cutout.csproj (1 hunks)
  • Cutout/Exceptions/ParseException.cs (2 hunks)
  • Cutout/Extensions/IndentedTextWriterExtensions.cs (1 hunks)
  • Cutout/Extensions/SyntaxExtensions.cs (1 hunks)
  • Cutout/Extensions/TokenExtensions.cs (0 hunks)
  • Cutout/Extensions/TokenListExtensions.cs (1 hunks)
  • Cutout/GlobalUsings.cs (1 hunks)
  • Cutout/Lexer/Lexer.ApplyWhitespaceSuppression.cs (1 hunks)
  • Cutout/Lexer/Lexer.Context.cs (1 hunks)
  • Cutout/Lexer/Lexer.cs (1 hunks)
  • Cutout/Lexer/Token.cs (1 hunks)
  • Cutout/Parser/Identifiers.cs (0 hunks)
  • Cutout/Parser/Parser.Context.cs (1 hunks)
  • Cutout/Parser/Parser.ParseBlock.cs (1 hunks)
  • Cutout/Parser/Parser.cs (1 hunks)
  • Cutout/Parser/Syntax.cs (1 hunks)
  • Cutout/Parser/TemplateParser.CallStatement.cs (0 hunks)
  • Cutout/Parser/TemplateParser.EndBreakPredicate.cs (0 hunks)
  • Cutout/Parser/TemplateParser.ForStatements.cs (0 hunks)
  • Cutout/Parser/TemplateParser.IParseContext.cs (0 hunks)
  • Cutout/Parser/TemplateParser.IfStatements.cs (0 hunks)
  • Cutout/Parser/TemplateParser.KeywordStatement.cs (0 hunks)
  • Cutout/Parser/TemplateParser.VarStatement.cs (0 hunks)
  • Cutout/Parser/TemplateParser.cs (0 hunks)
  • Cutout/README.md (5 hunks)
  • Cutout/Renderer/Renderer.cs (4 hunks)
  • Cutout/TemplateAttributeParts.cs (2 hunks)
  • Cutout/TemplateSourceGenerator.TemplateMethodImpl.cs (1 hunks)
  • Cutout/TemplateSourceGenerator.cs (1 hunks)
  • Directory.Packages.props (1 hunks)
  • Parent.Directory.Packages.props (1 hunks)
  • README.md (5 hunks)
💤 Files with no reviewable changes (11)
  • Cutout/Parser/Identifiers.cs
  • Cutout/Parser/TemplateParser.EndBreakPredicate.cs
  • Cutout/Parser/TemplateParser.ForStatements.cs
  • Cutout/Parser/TemplateParser.VarStatement.cs
  • Cutout/Parser/TemplateParser.KeywordStatement.cs
  • Cutout/Extensions/TokenExtensions.cs
  • Cutout.Tests/TemplateParserTests.cs
  • Cutout/Parser/TemplateParser.CallStatement.cs
  • Cutout/Parser/TemplateParser.IParseContext.cs
  • Cutout/Parser/TemplateParser.IfStatements.cs
  • Cutout/Parser/TemplateParser.cs
🧰 Additional context used
🧬 Code Graph Analysis (3)
Cutout/Extensions/TokenListExtensions.cs (3)
Cutout/Lexer/Token.cs (1)
  • ToString (132-135)
Cutout/Lexer/Lexer.cs (1)
  • TokenList (8-66)
Cutout/Parser/Parser.ParseBlock.cs (1)
  • TokenList (34-38)
Cutout/Lexer/Lexer.cs (3)
Cutout/Lexer/Lexer.ApplyWhitespaceSuppression.cs (2)
  • Lexer (3-64)
  • TokenList (5-36)
Cutout/Lexer/Lexer.Context.cs (8)
  • Lexer (6-90)
  • Context (8-89)
  • Reset (59-64)
  • Reset (66-70)
  • MoveNext (54-57)
  • Token (79-88)
  • AdvanceNewline (28-37)
  • TryAdvance (14-24)
Cutout/Lexer/Token.cs (1)
  • Token (74-81)
Cutout/Renderer/Renderer.cs (2)
Cutout/Parser/Syntax.cs (2)
  • Syntax (5-5)
  • TryGetLeadingWhitespace (16-33)
Cutout/Extensions/TokenListExtensions.cs (1)
  • ToString (7-28)
🪛 markdownlint-cli2 (0.17.2)
README.md

70-70: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

Cutout/README.md

54-54: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

🪛 LanguageTool
Cutout/README.md

[grammar] ~54-~54: There might be a problem here.
Context: ... or {@ and }} or %} is treated as a string literal. Any valid C# expression can be used in t...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~58-~58: Use correct spacing
Context: ... is not working, the compiler will tell you. It supports the whitespace control chara...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~60-~60: There might be a mistake here.
Context: ...pace control characters - the same as liquid. One deviation from liquid is that ther...

(QB_NEW_EN_OTHER)


[grammar] ~62-~62: Use correct spacing
Context: ... is that there is only one end keyword, {% end %}. The following keywords are supported, #...

(QB_NEW_EN_OTHER_ERROR_IDS_5)

🔇 Additional comments (79)
Cutout.Tests/IfStatementTests.Case3a#Test.Test.wsr.g.verified.cs (1)

15-15: LGTM! Consistent formatting from new parser.

The spacing inside parentheses aligns with the new parser's code generation style. Since this is an auto-generated verification file, these changes correctly reflect the expected output format from the rebuilt parser.

Also applies to: 19-19

Common.Release.props (1)

2-34: LGTM! Excellent formatting improvements.

The comprehensive formatting cleanup improves readability and consistency across the MSBuild properties file. The XML comment correction and consistent indentation enhance maintainability without any functional impact.

Common.Benchmark.props (2)

6-23: LGTM! Consistent formatting improvements.

The formatting cleanup enhances readability and aligns with the standardization effort across the MSBuild files.


16-16: Microsoft.CodeAnalysis.CSharp version 4.14.0 confirmed up-to-date and secure

Verified on NuGet that 4.14.0 is the latest stable release as of August 2025 and there are no known security vulnerabilities in this version. Approving the addition.

  • File: Common.Benchmark.props, Line 16
Common.Docs.props (2)

6-8: Good modernization to .NET 8.0.

Setting the target framework to net8.0 modernizes the documentation build process and aligns with current .NET standards.


9-48: LGTM! Improved formatting and structure.

The formatting improvements to the BuildDocs and ExecuteBuildDocs targets enhance readability while preserving all functional logic and error handling.

Cutout.Tests/IfStatementTests.Case3a#Test.Test.g.verified.cs (1)

15-15: LGTM! Consistent formatting across verification files.

The spacing inside parentheses matches the pattern in other verification files, demonstrating consistent code generation from the rebuilt parser.

Also applies to: 19-19

Cutout.Tests/IfStatementTests.Case2a#Test.Test.g.verified.cs (1)

15-15: Whitespace-only update looks good

Only spacing around the comparison operator changed. Behaviour is unaffected.

Cutout.Tests/TemplateTests.Case5a#Test.TestWithConstantTemplate.g.verified.cs (1)

18-18: Consistent spacing – no issues

Matches the new code-style adopted across the regenerated templates.

Cutout.Tests/TemplateTests.Case2a#Test.TestWithParameter.g.verified.cs (1)

16-16: Style update acknowledged

Extra spaces around the argument keep formatting consistent with other generated files.

Cutout.Tests/IfStatementTests.Case1a#Test.Test.wsr.g.verified.cs (1)

15-15: Whitespace tweak accepted

No functional change; snapshot remains correct.

Cutout/Extensions/SyntaxExtensions.cs (1)

5-5: No remaining using Cutout.Extensions directives found
Ran a project-wide search for using Cutout.Extensions and got zero matches, so the move of SyntaxExtensions into the root Cutout namespace won’t break any internal references. No further action needed.

Cutout.Tests/ForStatementTests.Case2a#Test.Test.g.verified.cs (1)

16-20: Whitespace normalization is OK

The added spaces around i and the argument list simply improve readability and match the new style used across the regenerated templates.

Cutout.Tests/IfStatementTests.Case2a#Test.Test.wsr.g.verified.cs (1)

15-15: Consistent spacing applied

The conditional now follows the updated spacing convention; no functional impact.

Cutout.Tests/TemplateTests.Case5a#Test.TestWithConstantTemplate.wsr.g.verified.cs (1)

19-19: Style-only tweak acknowledged

Spacing inside the if parentheses has been aligned with the rest of the regenerated code.

Cutout/Extensions/IndentedTextWriterExtensions.cs (1)

3-3: Namespace consolidation – verify call-sites

IndentedTextWriterExtensions moved from Cutout.Extensions to the root Cutout namespace.
Because the type is internal, only in-assembly references are affected, but make sure all using Cutout.Extensions; statements (if any) were removed or updated to prevent dead usings and compile errors.

Cutout.Tests/CallStatementTests.Case1a#Test.Test.g.verified.cs (1)

16-16: Whitespace change looks fine

The added space after the comma is harmless and matches the formatter’s output across other generated tests.

Cutout.Tests/CallStatementTests.Case1a#Test.Test.wsr.g.verified.cs (1)

16-16: Whitespace-only update looks good

No functional impact; spacing now follows the prevailing formatting used elsewhere in the generated sources.

Cutout.Tests/ForStatementTests.Case1a#Test.Test.g.verified.cs (1)

16-16: Whitespace formatting change in generated code.

The extra spaces around the loop variable (var tag) and before the closing parenthesis are consistent with the new parser's formatting approach. Since this is a verified test file, this formatting represents the expected output of the rebuilt parser.

Cutout.Tests/ForStatementTests.Case1a#Test.Test.wsr.g.verified.cs (1)

16-16: Consistent whitespace formatting across test variants.

The spacing change matches the previous verification file, confirming the new parser's consistent formatting approach across different test scenarios (including whitespace-sensitive cases).

Cutout.Tests/Cutout.Tests.csproj (1)

2-21: XML formatting standardization.

The changes are purely formatting improvements for consistent indentation and readability. No functional modifications to project references, packages, or properties.

Cutout.Tests/TemplateTests.Case4a#Test.TestWithModel.g.verified.cs (1)

16-18: Consistent spacing in method calls.

The extra spaces inside parentheses (builder.Append( model.Value )) align with the new parser's formatting conventions seen across other verification files. This consistent spacing pattern indicates deliberate formatting choices in the rebuilt parser.

Directory.Packages.props (2)

5-8: Improved XML element formatting.

The multi-line formatting of the Import element with each attribute on its own line enhances readability and maintainability without changing functionality.


10-10: Minor formatting cleanup.

Whitespace normalization at the end of the file.

Cutout.Sample/Cutout.Sample.csproj (1)

2-13: Formatting-only change looks good

Indentation is now consistent with other project files.

Common.Test.props (1)

2-36: Props file re-indentation approved

Whitespace normalisation only; no behavioural impact.

Cutout.Tests/ForStatementTests.Case3a#Test.Test.g.verified.cs (1)

16-29: No functional changes detected

Spaces inside parentheses/operators improve readability; generated snapshot still serves its purpose.

Cutout.Tests/CallStatementTests.cs (3)

10-10: LGTM! Syntax migration from {{ }} to {% %} delimiters.

The template syntax update for call statements aligns with the broader templating system refactor to use Liquid-style delimiters.


23-23: LGTM! Consistent syntax update for call statements.

The change maintains consistency with the new {% %} delimiter format for call statements throughout the codebase.


53-53: LGTM! Test verification string updated correctly.

The test string properly reflects the new template syntax, ensuring test coverage remains valid after the syntax migration.

Cutout/GlobalUsings.cs (1)

1-2: LGTM! Clean global using aliases for improved code readability.

The SyntaxList and TokenList aliases provide a clean abstraction that will reduce verbosity throughout the codebase when working with these commonly used generic list types. This aligns well with the token-based parsing architecture refactor.

Cutout.Tests/ForStatementTests.Case2a#Test.Test.wsr.g.verified.cs (1)

16-16: Minor formatting improvements in auto-generated code.

The added spaces inside parentheses improve readability slightly. Since this is a verified test output file, these changes likely reflect updates to the code generation formatting logic.

Also applies to: 24-25

Cutout.Tests/ForStatementTests.Case3a#Test.Test.wsr.g.verified.cs (1)

16-16: Consistent formatting improvements in auto-generated code.

The formatting changes (added spaces inside parentheses and around operators) are consistent with the other verified test files and improve code readability without affecting the logic.

Also applies to: 18-18, 22-22, 32-33

Cutout/TemplateSourceGenerator.TemplateMethodImpl.cs (1)

96-101: LGTM! Enhanced context-aware syntax rendering.

The introduction of lastSyntax tracking and passing it to WriteSyntax along with the template context enables better whitespace handling and context-aware rendering. This is a clean implementation that maintains state properly during iteration over syntax nodes.

Cutout/Exceptions/ParseException.cs (2)

5-6: LGTM: Appropriate suppression of analyzer warnings.

The suppression attributes are justified since this is an internal exception class that doesn't need the standard exception constructor patterns, and the non-public visibility is intentional for the internal API.


21-24: Good enhancement for EOF error handling.

The special case for EOF tokens provides cleaner error messages when parsing reaches the end of input unexpectedly. This improves the user experience by avoiding confusing position information for EOF tokens.

Cutout.Tests/TemplateTests.cs (2)

35-39: LGTM: Consistent syntax migration to Liquid-style delimiters.

The template syntax has been properly updated from {{ ... }} to {% ... %} delimiters with appropriate whitespace suppression markers (-). The changes are consistent and maintain the same logical structure while aligning with the new parser implementation.


149-153: LGTM: Test expectations updated to match new syntax.

The verification test correctly reflects the new template syntax in the expected generated source code, maintaining consistency with the actual template changes.

Cutout/TemplateAttributeParts.cs (1)

25-27: LGTM: Improved tokenization pipeline with whitespace suppression.

The changes implement a clean three-stage pipeline:

  1. Tokenization using static method (more appropriate for stateless operation)
  2. Whitespace suppression processing
  3. Parsing with both tokens and original template context

This architecture provides better separation of concerns and supports the new whitespace control features.

Cutout.Tests/IfStatementTests.cs (2)

11-13: LGTM: Comprehensive conditional syntax migration.

All conditional statement templates have been consistently updated to use the new {% ... %} syntax with proper whitespace suppression markers. The changes maintain logical structure while adopting the Liquid-style delimiter convention.

Also applies to: 20-24, 31-37


57-57: LGTM: Test expectations align with new syntax.

The verification test strings have been properly updated to reflect the new template syntax, ensuring test consistency with the actual implementation changes.

Also applies to: 76-76, 99-99

README.md (1)

56-129: LGTM! Documentation accurately reflects the new template syntax.

The syntax change from {{ ... }} to {% ... %} for control flow statements is well-documented and consistent throughout all examples. This aligns perfectly with the PR objective of rebuilding the parser to be more Liquid-like.

Cutout/TemplateSourceGenerator.cs (2)

11-17: LGTM! Good documentation additions.

The XML documentation comments properly describe the source generator class and its Initialize method, improving code maintainability and developer experience.


11-13: Good addition of XML summary

The summary comment improves IntelliSense and overall documentation quality. ✔️

Cutout/README.md (1)

40-113: LGTM! Documentation accurately reflects the new template syntax.

The documentation properly shows the updated syntax using {% ... %} delimiters for control flow statements, which aligns with the parser refactoring objectives. All examples are consistent and clear.

Cutout.Tests/ForStatementTests.cs (2)

17-59: LGTM! Test templates properly updated for new syntax.

The template strings have been consistently updated to use the new {% ... %} delimiters with whitespace trimming markers ({%- ... -%}). This provides good test coverage for the new Liquid-like syntax and whitespace control features.


104-146: LGTM! Template verification tests updated correctly.

The test templates in the verification methods have been properly updated to use the new {% ... %} syntax, ensuring the source generation produces correct output with the new parser.

Parent.Directory.Packages.props (1)

15-80: Compatibility verified – no direct references to updated packages found

A repository-wide search for common usage patterns of the bumped packages turned up no hits:

  • No [Fact] or [Theory] attributes (xunit.v3)
  • No [Benchmark] attributes (BenchmarkDotNet)
  • No using System.Composition imports
  • No using Polyfill imports

These version upgrades appear safe as there are no direct code dependencies on those APIs. Feel free to merge.

Cutout/Lexer/Lexer.ApplyWhitespaceSuppression.cs (3)

5-36: LGTM! Well-structured whitespace suppression logic.

The method correctly handles whitespace suppression tokens by removing preceding whitespace/newline when entering suppression and skipping trailing whitespace/newline when exiting.


38-49: Clean implementation of leading whitespace removal.

The method correctly handles the removal order (whitespace first, then newline) and properly manages the index decrement.


51-63: Correct implementation of trailing whitespace skipping.

The method properly advances the index past whitespace and newline tokens with appropriate bounds checking.

Cutout/Parser/Parser.Context.cs (4)

36-36: Standard enumerator pattern implementation.

The direct array access without bounds checking is acceptable here as it follows the standard enumerator pattern where callers must check MoveNext() before accessing Current.


38-46: Well-designed error handling methods.

The failure methods provide good context for parse errors by leveraging token position information.


60-65: Efficient context reuse implementation.

The Reset method enables object reuse which can improve performance by reducing allocations during parsing.


22-31: Consider adding bounds checking in MoveNext().

While the logic is correct, consider validating that Index + 1 < Tokens.Count before incrementing to provide clearer error messages if iteration goes beyond bounds.

 public bool MoveNext()
 {
     if (Index + 1 >= Tokens.Count)
     {
         return false;
     }
 
     Index++;
     return true;
 }

Likely an incorrect or invalid review comment.

Cutout.Tests/LexerTests.cs (3)

5-42: Comprehensive whitespace tokenization test.

The test thoroughly validates whitespace and newline tokenization including positions and spans.


44-108: Well-structured render block tokenization test.

The test provides good coverage of token types, positions, and content for render blocks.


459-486: Excellent test coverage for whitespace suppression scenarios.

The tests effectively validate the lexer's whitespace suppression behavior across different scenarios including leading/trailing spaces and mixed code/render blocks.

Cutout/Parser/Parser.ParseBlock.cs (1)

51-102: Well-implemented block parsing with comprehensive validation.

The method correctly validates block boundaries, prevents mixing of render/code tokens, disallows nested blocks, and ensures required tokens are present with clear error messages.

Cutout/Lexer/Lexer.Context.cs (5)

14-24: TryAdvance bounds checking is correct.

The method properly validates bounds before advancing the cursor position.


28-37: Correct newline handling for both Unix and Windows.

The method properly handles different newline conventions and correctly updates line/column positions.


41-46: Safe peek implementation with proper bounds handling.

The method correctly returns available characters without exceeding template bounds.


59-70: Proper initialization for text position tracking.

The reset methods correctly initialize positions following standard conventions (1-based lines, 0-based columns, -1 index for pre-iteration state).


72-72: Standard enumerator pattern for Current property.

Direct array access is appropriate here as callers are expected to check MoveNext() before accessing Current.

Cutout/Lexer/Token.cs (2)

137-148: No changes needed to ToSpan length calculation
The End.Offset is constructed as an inclusive position (Offset + count - 1), so the + 1 in endToken.End.Offset - Start.Offset + 1 correctly computes the span length. Any negative result can only occur for the EOF token (which is handled by the Type == TokenType.Eof check), so the Math.Max(..., 0) safeguard will not mask real bugs in normal token construction.


74-81: Potential off-by-one error in column calculation

The ToToken method adds count - 1 to both column and offset for the end position, but this appears incorrect for multi-character tokens. For a token starting at column 0 with count 3, the end column should be 2 (covering columns 0, 1, 2), but the current calculation would only cover columns 0-1.

Apply this fix:

 public Token ToToken(TokenType type, int count)
 {
     return new Token(
         this,
-        new CharPosition(Line, Column + count - 1, Offset + count - 1),
+        new CharPosition(Line, Column + count, Offset + count),
         type
     );
 }

However, verify this against the lexer's usage pattern in Lexer.Context.cs to ensure consistency.

Likely an incorrect or invalid review comment.

Cutout/Lexer/Lexer.cs (1)

43-59: LGTM! Clean token collection logic

The FlushCollected method correctly tracks the start position and count of collected characters, creating appropriate whitespace or raw tokens based on the isWhitespace flag.

Cutout.Tests/ParserTests.cs (1)

776-815: Well-structured nested conditional test

This test thoroughly validates complex nested if-else structures with proper assertions on the syntax tree. The use of multi-line strings and structured assertions makes the test readable and maintainable.

Cutout/Parser/Parser.cs (1)

364-377: No off-by-one here – backtracking is intentional
The context.Index-- on line 375 deliberately steps back so the non-raw token that broke out of the loop can be reprocessed by the caller. This is a common recursive-descent parsing pattern, and in this context it correctly yields all raw tokens from start up to (but not including) the first non-raw token. No change required.

Cutout/Parser/Syntax.cs (1)

36-71: Clean syntax tree design with proper encapsulation

The refactoring to use TokenList throughout the syntax tree provides better integration with the lexer/parser pipeline. The record-based approach with proper inheritance hierarchy makes the code maintainable and type-safe.

Cutout/Renderer/Renderer.cs (9)

7-13: Method signature updates align with token-based architecture

The addition of template and lastSyntax parameters supports the new token-based rendering system, enabling context-aware string conversion and improved whitespace handling.


17-19: Correct usage of template-aware ToString method

The change to varStatement.Assignment.ToString(template) properly utilizes the new token-based string conversion.


20-28: Enhanced whitespace preservation logic

The new implementation correctly extracts trailing whitespace from the preceding RawText node to maintain formatting. The use of pattern matching and null-conditional operators is appropriate.


29-56: Comprehensive if-else chain generation

The implementation correctly handles the full if-elseif-else chain structure. Good use of null-coalescing operator for defensive programming with ifStatement.ElseIfs ?? [].


58-81: Consistent template parameter propagation

All conditional statement types (for, foreach, while) consistently receive the template parameter for token-based rendering.


91-101: Proper template parameter handling for content nodes

RawText and RenderableExpression correctly receive the template parameter for content rendering.


104-119: Improved raw text rendering with newline detection

The method correctly uses the template-aware ToString and leverages the ContainsNewLine property for cleaner whitespace handling.


146-169: Correct implementation with forward-looking syntax logic

The template parameter is properly propagated. The lastSyntax parameter cleverly looks ahead to the next syntax node (line 163) to enable whitespace extraction from subsequent RawText nodes.


186-207: Refactored call statement generation with explicit parameter handling

The new implementation clearly separates the builder parameter from user parameters and properly handles optional whitespace. The conditional whitespace concatenation logic (line 202) correctly adapts based on the includeWhitespaceReceiver flag.

Comment thread .globalconfig
Comment thread Cutout.Tests/IfStatementTests.Case1a#Test.Test.g.verified.cs
Comment thread Cutout.Tests/LexerTests.cs Outdated
Comment thread Cutout.Tests/ParserTests.cs
Comment thread Cutout/Parser/Parser.ParseBlock.cs
Comment thread Cutout/Parser/Syntax.cs
Comment thread Cutout/README.md
Comment thread Cutout/TemplateSourceGenerator.cs
Comment thread README.md

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (6)
.globalconfig (1)

26-27: Severity for S2699 is still too low
Same concern raised previously: missing assertions indicate broken tests and should not be a mere suggestion.

Cutout/TemplateSourceGenerator.cs (1)

19-19: Nit: <inheritdoc /> alone adds little value.

Either omit the tag (the interface already documents the method) or add a brief sentence on what the generator registers to make the comment more informative.

Cutout/Parser/Parser.ParseBlock.cs (1)

14-32: Previous concerns have been properly addressed.

The implementation now includes:

  • Proper bounds checking in the Identifier property to prevent index out of range exceptions
  • Comprehensive validation in RemainingTokens() method checking all edge cases

The BlockContext class is well-designed for tracking block parsing state.

Also applies to: 43-61

Cutout/Parser/Parser.cs (1)

250-275: Call statement parsing significantly improved.

The implementation now properly handles:

  • Nested parentheses with depth tracking
  • Commas within quoted strings
  • Escaped quotes in string literals

This addresses the previous concern about fragile parsing logic.

Also applies to: 277-337

Cutout.Tests/ParserTests.cs (1)

816-842: Test coverage for unmatched delimiters has been added as suggested.

Cases 40 and 41 now test unmatched opening delimiters '{{-' and '{%-', addressing the previous review comment. The test suite provides comprehensive coverage of parser functionality.

Cutout/Parser/Syntax.cs (1)

16-32: Array bounds validation properly implemented.

The TryGetLeadingWhitespace method now correctly validates count < 2 before accessing Value[count - 2], addressing the previous review concern about potential out-of-bounds access.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 42d85f5 and 8144e29.

📒 Files selected for processing (10)
  • .globalconfig (2 hunks)
  • Cutout.Tests/LexerTests.cs (1 hunks)
  • Cutout.Tests/ParserTests.cs (1 hunks)
  • Cutout/Cutout.csproj (1 hunks)
  • Cutout/Extensions/TokenListExtensions.cs (1 hunks)
  • Cutout/Parser/Parser.Context.cs (1 hunks)
  • Cutout/Parser/Parser.ParseBlock.cs (1 hunks)
  • Cutout/Parser/Parser.cs (1 hunks)
  • Cutout/Parser/Syntax.cs (1 hunks)
  • Cutout/TemplateSourceGenerator.cs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: stringbuilder.append(readonlyspan) is not available in .net standard 2.0, only in .net standar...
Learnt from: bmazzarol
PR: bmazzarol/Cutout#3
File: Cutout/Extensions/TokenListExtensions.cs:7-28
Timestamp: 2025-08-02T14:41:43.246Z
Learning: StringBuilder.Append(ReadOnlySpan<char>) is not available in .NET Standard 2.0, only in .NET Standard 2.1 and later. For .NET Standard 2.0 projects, character-by-character appending or converting spans to strings first are the available options.

Applied to files:

  • Cutout/Extensions/TokenListExtensions.cs
📚 Learning: in the cutout lexer (cutout/lexer/lexer.cs), when processing windows newlines ('\r\n'), the token is...
Learnt from: bmazzarol
PR: bmazzarol/Cutout#3
File: Cutout/Lexer/Lexer.cs:75-76
Timestamp: 2025-08-02T14:42:30.024Z
Learning: In the Cutout lexer (Cutout/Lexer/Lexer.cs), when processing Windows newlines ('\r\n'), the token is correctly created with count=1 because the AdvanceNewline method handles advancing past both characters internally, and the main loop's MoveNext() continues from the proper position after the newline sequence.

Applied to files:

  • Cutout/Extensions/TokenListExtensions.cs
  • Cutout/Parser/Parser.Context.cs
  • Cutout/Parser/Parser.cs
  • Cutout/Parser/Parser.ParseBlock.cs
  • Cutout.Tests/LexerTests.cs
  • Cutout/Parser/Syntax.cs
🧬 Code Graph Analysis (3)
Cutout/Extensions/TokenListExtensions.cs (2)
Cutout/Lexer/Token.cs (1)
  • ToString (132-135)
Cutout/Lexer/Lexer.cs (1)
  • TokenList (8-66)
Cutout.Tests/ParserTests.cs (8)
Cutout/Lexer/Lexer.ApplyWhitespaceSuppression.cs (1)
  • Lexer (3-64)
Cutout/Lexer/Lexer.cs (1)
  • Lexer (3-164)
Cutout/Parser/Parser.ParseBlock.cs (1)
  • Parser (3-126)
Cutout/Parser/Parser.cs (2)
  • Parser (5-447)
  • Syntax (433-446)
Cutout/Parser/Syntax.cs (4)
  • Syntax (5-5)
  • BreakStatement (73-73)
  • ContinueStatement (80-80)
  • ReturnStatement (87-87)
Cutout/Extensions/TokenListExtensions.cs (1)
  • ToString (7-36)
Cutout/Lexer/Token.cs (3)
  • ToString (132-135)
  • ParseException (150-153)
  • Token (74-81)
Cutout/Exceptions/ParseException.cs (1)
  • ParseException (12-17)
Cutout/Parser/Syntax.cs (3)
Cutout/Parser/Parser.cs (1)
  • Syntax (433-446)
Cutout/Parser/Parser.ParseBlock.cs (1)
  • TokenList (43-61)
Cutout/Lexer/Lexer.cs (1)
  • TokenList (8-66)
🔇 Additional comments (18)
Cutout/Cutout.csproj (2)

2-16: LGTM! Past feedback has been properly addressed.

The project file changes look good:

  • Proper formatting and structure
  • TreatWarningsAsErrors correctly set to true for maintaining code quality
  • Obsolete PackageScribanIncludeSource property has been removed
  • All build-related properties are properly configured

30-32: Excellent! Dependency management issues resolved.

The package references have been properly configured:

  • Microsoft.CodeAnalysis.CSharp.Workspaces now includes PrivateAssets="all" to prevent dependency leakage
  • New Microsoft.CodeAnalysis.CSharp reference added with proper private assets configuration

This addresses the previous feedback about marking build-time dependencies as private.

Cutout/TemplateSourceGenerator.cs (1)

11-16: Good documentation and pragma directive additions.

The XML documentation for the class and the pragma directives around the [Generator] attribute are appropriate improvements for code clarity and warning management.

Cutout/Extensions/TokenListExtensions.cs (2)

7-15: Good parameter validation added.

The null/empty check for the template parameter addresses the previous feedback and provides appropriate error handling with a descriptive exception message.


22-36: Correct implementation for .NET Standard 2.0.

The character-by-character appending approach is the appropriate solution for .NET Standard 2.0, as StringBuilder.Append(ReadOnlySpan<char>) is not available in this framework version. The implementation efficiently converts the token list to a string representation using the template context.

Cutout/Parser/Parser.Context.cs (3)

9-21: Well-designed Context class with proper encapsulation.

The class properly encapsulates parser state with appropriate property access levels and initializes the enumerator index correctly to -1.


22-41: Excellent enumerator implementation with contextual error handling.

The MoveNext() method correctly implements bounds checking, and the Failure methods provide contextual error reporting by leveraging the stored template string. This will greatly help with debugging parser issues.


48-66: Appropriate code coverage exclusions and reset functionality.

The ExcludeFromCodeCoverage attributes are correctly applied to standard interface boilerplate methods, and the reset functionality enables context reuse efficiently.

Cutout.Tests/LexerTests.cs (7)

3-42: Excellent comprehensive test for whitespace tokenization.

The test thoroughly validates whitespace handling including tabs, newlines, and carriage returns with proper position tracking. The assertions verify both token types and exact character spans.


44-108: Well-structured test for render block tokenization.

The test properly validates the tokenization of {{ ... }} blocks, checking token types, positions, and content extraction. The detailed assertions ensure correct parsing of render delimiters and internal content.


110-187: Thorough validation of raw text tokenization.

The test comprehensively checks tokenization of plain text with spaces, ensuring proper separation of words and whitespace tokens with accurate position tracking.


189-406: Comprehensive mixed content test with excellent coverage.

This test validates complex scenarios combining raw text, render blocks, and various newline types (both \n and \r\n). The detailed token sequence verification ensures the lexer handles real-world template content correctly.


408-455: Good validation of whitespace suppression tokens.

The test properly verifies tokenization of {{- ... -}} syntax, ensuring whitespace suppression markers are correctly identified as distinct token types.


457-477: Excellent integration test for whitespace suppression functionality.

These tests not only verify tokenization but also validate the ApplyWhitespaceSuppression method's behavior, ensuring the complete whitespace suppression pipeline works correctly with both leading and trailing whitespace scenarios.


479-486: Good coverage of mixed block types.

The final test ensures proper tokenization when code blocks ({% ... %}) and render blocks ({{ ... }}) are used together, validating the lexer's ability to distinguish between different block types.

Cutout/Parser/Parser.ParseBlock.cs (1)

74-125: Well-structured block parsing implementation.

The method correctly:

  • Validates matching block enter/exit token types
  • Prevents nested code/render blocks
  • Tracks the identifier token and provides clear error messages
  • Handles all edge cases (empty blocks, missing exit tokens)
Cutout.Tests/ParserTests.cs (1)

844-872: Excellent test coverage for complex call statement parsing.

The tests properly validate the improved call statement parser's ability to handle:

  • Commas within quoted strings (Case42)
  • Nested parentheses with quoted parameters (Case43)

This confirms the robustness of the new parsing implementation.

Cutout/Parser/Syntax.cs (1)

7-98: Well-designed token-based syntax model.

The refactoring to use TokenList instead of strings throughout the syntax tree is a significant improvement:

  • Preserves original token information and positions
  • Enables more precise error reporting and source mapping
  • Clean inheritance hierarchy with proper use of record types
  • Good encapsulation with internal visibility

Comment thread .globalconfig
Comment thread Cutout/Parser/Parser.cs
@bmazzarol bmazzarol merged commit 79f0f40 into main Aug 2, 2025
1 check passed
@bmazzarol bmazzarol deleted the feat/new-parser branch August 2, 2025 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant