Sandbox Branch Bug Fix Report

Branch: sandbox (flink-reactor-dsl)
Period: March 15–16, 2026
Author: Automated analysis from 22 commits

1. Executive Summary

The sandbox branch contains 22 commits that introduced a two-tier SQL testing model and fixed 14 numbered bugs (BUG-001 through BUG-014) discovered by running generated SQL through Flink's EXPLAIN endpoint against a live cluster.

Metric	Count
Total commits	22
Numbered bug fixes	14
Supporting commits (infra, reports, schema)	8
Primary files modified	`sql-generator.ts`, `schema-introspect.ts`
Examples fixed	25+ (out of 34 total)

Risk distribution:

Risk Level	Bugs	Description
🔴 High	BUG-004, BUG-008, BUG-009, BUG-010	Major codegen rewrites affecting multiple node types
🟡 Medium	BUG-001, BUG-002, BUG-003, BUG-006, BUG-007, BUG-012, BUG-014	Targeted fixes with moderate blast radius
🟢 Low	BUG-005, BUG-011, BUG-013	Example-level or narrow fixes

2. Test Infrastructure

Two-Tier Testing Model

Introduced in commit 85016b8:

Tier	Type	Infrastructure	Purpose
Tier 1	Snapshot tests	None (pure unit)	Fast regression on SQL output shape
Tier 2	EXPLAIN integration	Live Flink cluster via SQL Gateway	Validates SQL is syntactically and semantically valid

How EXPLAIN Tests Work

DSL examples (34 total: 25 existing plus 9 sandbox-derived, numbered 30–38) are synthesized to SQL
Each generated SQL statement is sent to the Flink SQL Gateway EXPLAIN endpoint
Flink parses the statement, validates types, and returns either an execution plan or an error
Tier 2 skips gracefully when no cluster is available
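
The flow above could be sketched roughly as follows. The helper names (`toExplainStatement`, `tier2Mode`) are hypothetical. Using the `SQL_GATEWAY_URL` environment variable as the availability signal is also an assumption; the real harness may instead probe the cluster directly.

```typescript
// Hypothetical helpers; names and skip logic are illustrative assumptions.

/** Wrap one generated query or INSERT so Flink plans it without executing it. */
function toExplainStatement(sql: string): string {
  const trimmed = sql.trim().replace(/;\s*$/, "");
  return "EXPLAIN " + trimmed;
}

type Tier2Mode =
  | { run: true; gatewayUrl: string }
  | { run: false; reason: string };

/** Decide whether Tier 2 should run; skip when no gateway is configured. */
function tier2Mode(env: Record<string, string | undefined>): Tier2Mode {
  const url = env["SQL_GATEWAY_URL"];
  if (!url) {
    return { run: false, reason: "SQL_GATEWAY_URL not set; skipping EXPLAIN tests" };
  }
  // Normalize trailing slashes so request paths can be appended safely.
  return { run: true, gatewayUrl: url.replace(/\/+$/, "") };
}
```

Returning a reason string alongside the skip decision lets the test runner report why Tier 2 did not run instead of silently passing.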

Docker Cluster Requirements

The EXPLAIN tests require a running Flink cluster with SQL Gateway. The flink-reactor-dsl local dev cluster provides:

Flink JobManager + TaskManager
SQL Gateway endpoint
Kafka (for connector validation)
PostgreSQL with TimescaleDB extension

Example Coverage

34 examples total (directories 01–28, 30–38):

Examples 09, 10, 17, 29 are absent (removed or never created)
Examples 30–38 are new sandbox-derived examples covering: FlatMap/UNNEST, TopN ranking, Union streams, Rename fields, Drop fields, Cast types, Coalesce defaults, Add computed field, Dedup aggregate

3. Bug Fix Details

BUG-001: Sink schema type mismatch for windowed aggregates

Field	Value
Commit	`d9d29f1`
Risk	🟡 Medium
Affected examples	05, 12, 14
Root cause	Backward walk in sink schema resolution treated Window nodes with Aggregate children as starting points. Since Aggregate has no source children in JSX patterns, type inference fell back to `STRING` for all columns.
Fix	Use `findDeepestSource()` to skip intermediate Windows. Recognize Join nodes as valid starting points. Add `FIRST_VALUE`/`LAST_VALUE` to type-preserving aggregate list.
Files	`schema-introspect.ts`, `sql-generator.ts`

BUG-002: Joins generating `SELECT * FROM unknown` in DML

Field	Value
Commit	`45262b8`
Risk	🟡 Medium
Affected examples	06, 08, 15, 16, 18, 21, 25
Root cause	`buildSiblingChainQuery` and `resolveRouteUpstream` only recognized `Source`, `Transform`, and `Window` kinds. Join nodes have their own `"Join"` kind and were silently skipped, causing downstream transforms to emit `SELECT * FROM unknown`.
Fix	Add `"Join"` to all kind-check arrays in the sibling chain and route upstream logic.
Files	`sql-generator.ts`

BUG-003: Malformed `TABLE()` clause in windowed queries

Field	Value
Commit	`e926fab`
Risk	🟡 Medium
Affected examples	11, 19, 38
Root cause	Flink TVF syntax requires `TABLE <identifier>`, not `TABLE (<subquery>)`. When the window's upstream is a subquery (from Deduplicate or Union), the generated SQL was invalid.
Fix	Detect subquery upstreams and lift them into a CTE: `WITH _windowed_input AS (<subquery>) SELECT ... FROM TABLE(TUMBLE(TABLE _windowed_input, ...))`.
Files	`sql-generator.ts`
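
A minimal sketch of the CTE lift described in this fix, assuming the upstream is available as a SQL string. The function names and the identifier check are hypothetical and are not taken from `sql-generator.ts`.

```typescript
// Hypothetical helper; the real generator's names and signatures may differ.

/** True when the upstream is a bare table identifier rather than a subquery. */
function isBareIdentifier(upstream: string): boolean {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(upstream.trim());
}

/** Build a TUMBLE query, lifting subquery upstreams into a CTE first. */
function buildTumbleQuery(upstream: string, timeCol: string, size: string): string {
  const tvf = (table: string): string =>
    "SELECT * FROM TABLE(TUMBLE(TABLE " + table + ", DESCRIPTOR(" + timeCol + "), " + size + "))";
  if (isBareIdentifier(upstream)) {
    return tvf(upstream.trim());
  }
  // Subquery upstream: TABLE needs an identifier, so name the subquery via a CTE.
  return "WITH _windowed_input AS (" + upstream.trim() + ") " + tvf("_windowed_input");
}
```

For example, passing `SELECT * FROM a UNION ALL SELECT * FROM b` as the upstream yields a statement that starts with `WITH _windowed_input AS (...)` and references `TABLE _windowed_input` inside the TVF call.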

BUG-004: Filesystem sink partition syntax and missing schemas

Field	Value
Commit	`f946809`
Risk	🔴 High
Affected examples	23, 24, 25
Root cause	Three separate issues: (1) `findRouteUpstreamNode` didn't recognize `"Join"` kind, (2) function expressions in `partitionBy` (e.g. `DATE(col)`) were emitted as raw expressions in `PARTITIONED BY`, (3) DML lacked computed partition projections.
Fix	(1) Add `"Join"` to route upstream. (2) Generate physical partition columns in DDL for function-based partitions. (3) Wrap DML with computed partition column projections. Largest single fix at 126+ lines added to `sql-generator.ts`.
Files	`sql-generator.ts`, `schema-introspect.ts`
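
Parts (2) and (3) of this fix can be illustrated with a small sketch. It assumes a `partitionBy` entry is either a plain column or a single-argument function call. The `planPartition` name, the `<column>_<function>` naming scheme, and the `DATE` mapping are assumptions for illustration. The `BIGINT` typing of time-part functions follows the later BUG-014 correction.

```typescript
// Hypothetical sketch; the naming scheme and DATE -> DATE mapping are assumptions.
interface PartitionPlan {
  column: string;       // name to use in PARTITIONED BY
  columnDdl?: string;   // extra physical column for the sink DDL, if needed
  projection?: string;  // extra SELECT item for the DML, if needed
}

const PARTITION_FN_TYPES: Record<string, string> = {
  DATE: "DATE",
  YEAR: "BIGINT",
  MONTH: "BIGINT",
  DAY: "BIGINT",
  HOUR: "BIGINT",
};

function planPartition(expr: string): PartitionPlan {
  const m = /^(\w+)\(\s*(\w+)\s*\)$/.exec(expr.trim());
  if (!m) {
    return { column: expr.trim() }; // plain column: usable as-is
  }
  const fn = String(m[1]).toUpperCase();
  const arg = String(m[2]);
  const type = PARTITION_FN_TYPES[fn];
  if (type === undefined) {
    throw new Error("Unsupported partition function: " + fn);
  }
  const column = arg + "_" + fn.toLowerCase();
  return {
    column,
    columnDdl: column + " " + type,               // goes into CREATE TABLE
    projection: fn + "(" + arg + ") AS " + column, // goes into INSERT ... SELECT
  };
}
```

Throwing on an unknown function, rather than guessing a type, keeps a mistyped partition column from reaching the EXPLAIN step unnoticed.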

BUG-005: FlatMap UNNEST alias mismatch

Field	Value
Commit	`ef718c7`
Risk	🟢 Low
Affected examples	30
Root cause	Source schema used `ARRAY<STRING>` but FlatMap declared 3 output columns. `UNNEST` of `ARRAY<STRING>` produces only 1 column.
Fix	Change `line_items` type to `ARRAY<ROW<product_id STRING, quantity INT, price DECIMAL(10,2)>>` so `UNNEST` produces a structured row matching the 3 aliases.
Files	`examples/30-flatmap-unnest/after.tsx`

BUG-006: Session window timestamp arithmetic and changelog mode

Field	Value
Commit	`af3413e`
Risk	🟡 Medium
Affected examples	12
Root cause	(1) Direct `TIMESTAMP` subtraction not supported in Flink, (2) session windows can merge on late events (retract), but `propagateChangelogMode` didn't mark them, (3) `propagatePrimaryKey` didn't extract `groupBy` from Aggregate children inside Window nodes.
Fix	(1) Use `TIMESTAMPDIFF(SECOND, ...)`. (2) Mark SessionWindow as retract so JdbcSink gets `PRIMARY KEY`. (3) Extract `groupBy` for PK propagation. (4) Add `TIMESTAMPDIFF → INT` to `inferExpressionType`.
Files	`sql-generator.ts`, `schema-introspect.ts`, `examples/12-session-window/after.tsx`

BUG-007: MatchRecognize excluded from sibling chain

Field	Value
Commit	`f14be33`
Risk	🟡 Medium
Affected examples	15
Root cause	The `"CEP"` kind (used by MatchRecognize nodes) was missing from all sibling chain filters, backward walk conditions, and route upstream resolution in both `sql-generator.ts` and `schema-introspect.ts`. Separately, Flink does not support `COUNT(B.*)` syntax in `MATCH_RECOGNIZE`.
Fix	Add `"CEP"` kind to all node-type checks. Fix `COUNT(B.*)` → `COUNT(B.transaction_id)` in example 15.
Files	`sql-generator.ts`, `schema-introspect.ts`, `examples/15-cep-fraud-detection/after.tsx`

BUG-008: Window inside Route branch can't find upstream

Field	Value
Commit	`d7bf08d`
Risk	🔴 High
Affected examples	21, 28
Root cause	(1) `buildBranchQuery` replaced Window children instead of appending VirtualRef alongside them. (2) `resolveRouteUpstream` only referenced the nearest sibling, not the full chain. (3) Route handler didn't do a backward walk for schema resolution.
Fix	Major refactor of route branch handling: VirtualRef injection, full sibling chain build for route upstream, backward walk pattern for schema resolution. 113+ lines changed in `sql-generator.ts`.
Files	`sql-generator.ts`

BUG-009: Chained joins referencing intermediate node IDs as tables

Field	Value
Commit	`dfdae6a`
Risk	🔴 High
Affected examples	08
Root cause	Join operands used `resolveRef` which only works for Source nodes with table names. Intermediate join results have node IDs, not table names, producing invalid SQL like `FROM join_abc123`.
Fix	Add `resolveJoinOperand()` helper that detects non-Source join operands and inlines their SQL as CTEs. Updated all 4 join builders (`buildJoinQuery`, `buildTemporalJoinQuery`, `buildLookupJoinQuery`, `buildIntervalJoinQuery`).
Files	`sql-generator.ts`, `examples/08-multi-stream-join/after.tsx`
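
The idea behind the operand helper can be sketched as below. The `Operand` shape and the `resolveOperandSketch` and `buildJoinSketch` names are hypothetical; the report does not show the real node model or the signature of `resolveJoinOperand()`.

```typescript
// Hypothetical shapes; the real node model is not shown in this report.
interface Operand {
  id: string;
  kind: "Source" | "Join";
  tableName?: string; // set for Source nodes
  sql?: string;       // generated SQL for intermediate results
}

interface ResolvedOperand {
  ref: string;   // name to use in the FROM/JOIN clause
  cte?: string;  // "name AS (sql)" when the operand must be inlined
}

function resolveOperandSketch(op: Operand): ResolvedOperand {
  if (op.kind === "Source" && op.tableName) {
    return { ref: op.tableName };
  }
  if (!op.sql) {
    throw new Error("No SQL available for operand " + op.id);
  }
  // Intermediate result: expose its SQL under the node ID via a CTE.
  return { ref: op.id, cte: op.id + " AS (" + op.sql + ")" };
}

function buildJoinSketch(left: Operand, right: Operand, on: string): string {
  const l = resolveOperandSketch(left);
  const r = resolveOperandSketch(right);
  const ctes: string[] = [];
  if (l.cte !== undefined) ctes.push(l.cte);
  if (r.cte !== undefined) ctes.push(r.cte);
  const withClause = ctes.length > 0 ? "WITH " + ctes.join(", ") + " " : "";
  return withClause + "SELECT * FROM " + l.ref + " JOIN " + r.ref + " ON " + on;
}
```

With this approach a reference such as `FROM join_abc123` becomes valid, because `join_abc123` is defined as a CTE in the same statement.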

BUG-010: KafkaSource with PK requires upsert-kafka connector

Field	Value
Commit	`5d68dcd`
Risk	🔴 High
Affected examples	16, 18
Root cause	KafkaSource DDL always used `'connector' = 'kafka'`, regardless of whether a primary key was defined. Flink requires the `upsert-kafka` connector when a PK is present with a non-changelog format.
Fix	Detect PK + non-changelog format (json, avro, csv) and switch to `upsert-kafka` with `key.format`/`value.format` properties. Exclude changelog formats (debezium-json, canal-json) from the switch.
Files	`sql-generator.ts`

BUG-011: Anti-join retract with KafkaSink missing PK

Field	Value
Commit	`535ef6a`
Risk	🟢 Low
Affected examples	18
Root cause	Anti-join detected as retract, PK propagated from left source, but sink DDL lacked a primary key, causing upsert-kafka to fail.
Fix	Add `primaryKey` to `EventSchema` so the full chain works: Source (PK + json → upsert-kafka) → Anti-join (retract, PK propagated) → Sink (retract + PK → upsert-kafka with PRIMARY KEY).
Files	`examples/18-broadcast-join/after.tsx`

BUG-012: Source nodes ignoring child transforms in `buildQuery`

Field	Value
Commit	`c2c67d0`
Risk	🟡 Medium
Affected examples	32 (Union with nested transforms)
Root cause	Source nodes (`KafkaSource`, `JdbcSource`, `GenericSource`) with children returned a bare `SELECT * FROM table`, ignoring child transforms like `<KafkaSource><Map/></KafkaSource>`.
Fix	Chain through child transforms using VirtualRef injection when source nodes have children, matching the pattern used elsewhere in the generator.
Files	`sql-generator.ts`

BUG-013: `SUM(CASE ...)` type inferred as STRING

Field	Value
Commit	`f86010f`
Risk	🟢 Low
Affected examples	20
Root cause	`inferExpressionType` used a regex that only matched `SUM(column_name)`. Complex expressions like `SUM(CASE WHEN ... END)` didn't match, falling back to `STRING`.
Fix	Add broader `/^SUM\s*\(/` test that returns `BIGINT` as a safe default when the column-name regex doesn't match (SUM always produces a numeric type).
Files	`schema-introspect.ts`
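
A minimal sketch of the two-step matching described in this fix. The `inferSumType` name, the `columnTypes` lookup, and the case-insensitive flag are assumptions. Keeping the column's declared type for `SUM(column)` is a simplification of whatever the real `inferExpressionType` does.

```typescript
// Hypothetical sketch, not the real inferExpressionType.
function inferSumType(
  expr: string,
  columnTypes: Record<string, string>
): string | undefined {
  const text = expr.trim();
  // Narrow rule: SUM(column_name) keeps the column's declared type.
  const simple = /^SUM\s*\(\s*(\w+)\s*\)$/i.exec(text);
  if (simple) {
    const known = columnTypes[String(simple[1])];
    if (known !== undefined) return known;
  }
  // Broad rule: any other SUM(...) is numeric, so default to BIGINT.
  if (/^SUM\s*\(/i.test(text)) return "BIGINT";
  // Not a SUM expression: leave the decision to other inference rules.
  return undefined;
}
```

Returning `undefined` for non-SUM expressions, rather than `STRING`, makes the fallback decision explicit at the call site.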

BUG-014: Partition column types and `scan.startup.mode` regression

Field	Value
Commit	`f97bd81`
Risk	🟡 Medium
Affected examples	21, 23, 25
Root cause	(1) `HOUR()`/`YEAR()`/`MONTH()`/`DAY()` functions return `BIGINT` in Flink, but `resolvePartitionExpression` typed them as `INT`. (2) BUG-010 fix regressed `scan.startup.mode` for changelog-format Kafka sources with PK.
Fix	Use `BIGINT` for all time-part partition functions. Only skip `scan.startup.mode` for actual `upsert-kafka` sources, not changelog-format sources with PK.
Files	`sql-generator.ts`

4. Supporting Commits

Commit	Description
`6008985`	SQL comment blocks: Added self-documenting comment blocks to generated SQL with type labels, separators, and metadata (PIPELINE, SOURCE TABLE, SINK TABLE, etc.)
`2666cf2`	Schema introspection fix: Window nodes append `window_start`/`window_end`, Deduplicate/TopN append `rownum BIGINT`, FlatMap appends unnested columns — prerequisite for EXPLAIN validation
`85016b8`	Two-tier test infrastructure: Snapshot tests (Tier 1) + EXPLAIN integration tests (Tier 2) with Docker cluster, 9 new examples (30–38)
`8145403`	Bug report batch 1: Documented 17 codegen issues found by EXPLAIN tests, categorized into 6 root causes (BUG-001 through BUG-006)
`db3a36a`	Bug report batch 2: Documented 8 remaining issues (BUG-007 through BUG-014)
`ae5354b`	TSX example fixes: Fixed 8 examples — table-qualified ON conditions, `TIMESTAMPDIFF` usage, replaced s3:// with /tmp/ and parquet with json for test environments
`242928c`	Join schema resolution: `resolveNodeSchema` now merges both sides' schemas for Join/IntervalJoin/TemporalJoin, deduplicating join key columns
`3e066b2`	MatchRecognize schema + final fixes: `resolveNodeSchema` builds output schema from PARTITION BY + MEASURES for CEP nodes; fixed example 25 join `SELECT *` issue
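
The join schema merge from commit `242928c` might look roughly like this sketch. The `Column` shape and the `mergeJoinSchemas` name are hypothetical. The rule that the left side's copy of a join key is the one kept is an assumption.

```typescript
// Hypothetical sketch of merging two join sides into one output schema.
interface Column {
  name: string;
  type: string;
}

function mergeJoinSchemas(left: Column[], right: Column[], joinKeys: string[]): Column[] {
  const merged = left.slice();
  for (const col of right) {
    const isJoinKey = joinKeys.indexOf(col.name) !== -1;
    const alreadyOnLeft = left.some((l) => l.name === col.name);
    // Drop the right-hand copy of a join key that the left side already provides.
    if (!(isJoinKey && alreadyOnLeft)) {
      merged.push(col);
    }
  }
  return merged;
}
```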

5. Cross-Cutting Regression Analysis

Pattern 1: Node-Kind Exhaustiveness

Bugs affected: BUG-002, BUG-004, BUG-007

Multiple places in the codebase filter on node kinds (`Source`, `Transform`, `Window`) but missed `Join` and `CEP`. Each new node kind requires updating every kind-check across `sql-generator.ts` and `schema-introspect.ts`.

Regression risk: 🔴 High. A node kind added in the future will be silently skipped by any kind-check that isn't updated. Consider an enum-exhaustiveness approach or a centralized `isChainableKind()` predicate.
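
One way the centralized predicate could combine with compile-time exhaustiveness is sketched below. The `NodeKind` union is an assumption built from the kinds named in this report plus a hypothetical `"Sink"`; the real kind list may be larger.

```typescript
// Hypothetical kind list; the real union in the codebase may differ.
type NodeKind = "Source" | "Transform" | "Window" | "Join" | "CEP" | "Sink";

function assertNever(value: never): never {
  throw new Error("Unhandled node kind: " + String(value));
}

/** Single source of truth for "can this node take part in a sibling chain?" */
function isChainableKind(kind: NodeKind): boolean {
  switch (kind) {
    case "Source":
    case "Transform":
    case "Window":
    case "Join":
    case "CEP":
      return true;
    case "Sink":
      return false;
    default:
      // Adding a member to NodeKind without a case above makes this line
      // a compile error, because `kind` is no longer of type `never`.
      return assertNever(kind);
  }
}
```

Call sites then use `isChainableKind(node.kind)` instead of their own kind arrays, so a new kind has to be classified once, and the compiler flags the omission.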

Pattern 2: VirtualRef Injection Consistency

Bugs affected: BUG-008, BUG-012

The VirtualRef pattern (injecting a virtual table reference into child nodes) is used in `buildSiblingChainQuery`, `buildBranchQuery`, and now source nodes. Each builder had its own variant, and BUG-008 and BUG-012 showed that some were incomplete.

Regression risk: 🟡 Medium. The pattern is now applied in 3+ places. Refactoring to a shared `injectVirtualRef()` helper would reduce drift.
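
A shared helper along these lines might look like the sketch below. The `TreeNode` shape is hypothetical. The choice to append the reference after existing children, rather than replace them, mirrors the BUG-008 description but is otherwise an assumption.

```typescript
// Hypothetical node shape; the real construct tree is not shown in this report.
interface TreeNode {
  kind: string;
  id: string;
  children: TreeNode[];
}

/**
 * Return a copy of `node` whose children also include a VirtualRef pointing
 * at `upstreamId`. Existing children are kept, not replaced.
 */
function injectVirtualRef(node: TreeNode, upstreamId: string): TreeNode {
  const ref: TreeNode = { kind: "VirtualRef", id: upstreamId, children: [] };
  return {
    kind: node.kind,
    id: node.id,
    children: node.children.concat([ref]),
  };
}
```

Returning a copy instead of mutating the input keeps the original tree reusable when the same node is rendered in more than one branch.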

Pattern 3: Connector Property Logic

Bugs affected: BUG-010, BUG-011, BUG-014

The `kafka`/`upsert-kafka` connector selection interacts with PK detection, changelog mode propagation, and `scan.startup.mode`. BUG-014 was a direct regression from the BUG-010 fix.

Regression risk: 🔴 High. Connector property generation has complex conditional logic. A truth-table test covering all combinations (PK × format × changelog × connector) would catch future regressions.
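
A truth-table test of this kind could be sketched as follows. The decision function restates the rules given for BUG-010 and BUG-014. Its name, the `Decision` shape, and the table rows are assumptions for illustration, not the generator's actual code.

```typescript
// Hypothetical sketch of the BUG-010 / BUG-014 rules and a table-driven check.
type Connector = "kafka" | "upsert-kafka";

interface Decision {
  connector: Connector;
  emitScanStartupMode: boolean;
}

const CHANGELOG_FORMATS = ["debezium-json", "canal-json"];

function decideKafkaSource(hasPrimaryKey: boolean, format: string): Decision {
  const isChangelog = CHANGELOG_FORMATS.indexOf(format) !== -1;
  // Only PK + non-changelog format switches to upsert-kafka.
  const upsert = hasPrimaryKey && !isChangelog;
  // scan.startup.mode is skipped only for actual upsert-kafka sources.
  return { connector: upsert ? "upsert-kafka" : "kafka", emitScanStartupMode: !upsert };
}

interface Row {
  pk: boolean;
  format: string;
  expected: Decision;
}

const TRUTH_TABLE: Row[] = [
  { pk: false, format: "json", expected: { connector: "kafka", emitScanStartupMode: true } },
  { pk: true, format: "json", expected: { connector: "upsert-kafka", emitScanStartupMode: false } },
  { pk: false, format: "debezium-json", expected: { connector: "kafka", emitScanStartupMode: true } },
  { pk: true, format: "debezium-json", expected: { connector: "kafka", emitScanStartupMode: true } },
];

/** Rows whose actual decision differs from the expected one. */
function failedRows(): Row[] {
  return TRUTH_TABLE.filter((row) => {
    const actual = decideKafkaSource(row.pk, row.format);
    return (
      actual.connector !== row.expected.connector ||
      actual.emitScanStartupMode !== row.expected.emitScanStartupMode
    );
  });
}
```

The fourth row is the case that regressed in BUG-014: a changelog-format source with a PK stays on `kafka` and still emits `scan.startup.mode`.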

Pattern 4: Type Inference Fallback

Bugs affected: BUG-001, BUG-006, BUG-013, BUG-014

Type inference in `schema-introspect.ts` uses regex matching and has multiple fallback paths. Several bugs were caused by the final fallback returning `STRING` for unrecognized patterns.

Regression risk: 🟡 Medium. New aggregate functions or expressions may still fall through to `STRING`. Consider a stricter default (fail loudly) or a more comprehensive expression parser.

Pattern 5: Route Branch Complexity

Bugs affected: BUG-004, BUG-008, BUG-014

Route branches (multi-sink patterns) interact with window placement, partition generation, and upstream resolution. Three separate bugs surfaced in this area, each requiring different fixes.

Regression risk: 🟡 Medium. Route branch code paths are exercised by examples 21, 23, 24, 25, and 28, so maintaining good example coverage for multi-sink patterns is essential.

6. CI Recommendations for `test:explain`

Option A: Nightly EXPLAIN Tests

Best for: Teams that want low CI cost and can tolerate overnight bug detection.

```yaml
# .github/workflows/nightly-explain.yml
name: EXPLAIN Integration Tests
on:
  schedule:
    - cron: '0 3 * * *'  # 3 AM UTC daily
  workflow_dispatch: {}

jobs:
  explain-tests:
    runs-on: ubuntu-latest
    services:
      flink-jobmanager:
        image: flink:1.19-scala_2.12-java17
        # ... SQL Gateway config
      kafka:
        image: confluentinc/cp-kafka:7.6.0
    steps:
      - uses: actions/checkout@v4
      # pnpm is not preinstalled on hosted runners. This step assumes a
      # "packageManager" field in package.json pins the pnpm version.
      - uses: pnpm/action-setup@v4
      - run: pnpm install
      - run: pnpm test:explain
        env:
          SQL_GATEWAY_URL: http://localhost:8083
```

Pros: No PR latency impact, catches regressions within 24h
Cons: Bugs discovered after merge, not before

Option B: Per-PR with Docker Layer Caching

Best for: Teams that want pre-merge validation and can afford ~3-5 min CI overhead.

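```yaml
# .github/workflows/pr-explain.yml
name: EXPLAIN Tests (PR)
on:
  pull_request:
    paths:
      - 'src/codegen/**'
      - 'src/examples/**'

jobs:
  explain-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      # Only effective if image layers are actually written to this path.
      - uses: actions/cache@v4
        with:
          path: /tmp/.docker-cache
          key: flink-cluster-${{ hashFiles('docker-compose.test.yml') }}
      - run: docker compose -f docker-compose.test.yml up -d
      # pnpm is not preinstalled on hosted runners. This step assumes a
      # "packageManager" field in package.json pins the pnpm version.
      - uses: pnpm/action-setup@v4
      - run: pnpm install
      - run: pnpm test:explain
        env:
          SQL_GATEWAY_URL: http://localhost:8083
      # Tear the cluster down even when the tests fail.
      - if: always()
        run: docker compose -f docker-compose.test.yml down
```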

Pros: Catches bugs before merge, path-scoped to minimize unnecessary runs
Cons: ~3-5 min added to affected PRs

Image Optimization

Strategy	Savings
Pre-built test image with connectors	Avoid downloading JARs on every run
Slim Flink image (no UI, no metrics)	~200MB smaller
Parallel startup (Flink + Kafka)	~30s faster boot
Cached pnpm store (`actions/cache`)	~15s faster install

Recommendation

Start with Option A (nightly) to validate the approach with zero PR impact, then graduate to Option B (per-PR, path-scoped) once the cluster startup is optimized below 2 minutes.

7. Verification Checklist

Use this checklist when reviewing the sandbox branch before merge:
