Track negotiated Markdown asset fetches with shared GA4 edge analytics#9624
Conversation
e18d305 to
792c4e0
Compare
5572cc7 to
e7c5b40
Compare
e7c5b40 to
d64e6a7
Compare
chalin
left a comment
There was a problem hiding this comment.
Copilot (Opus 4.6) review (FYI):
Summary
This PR does three main things:
- Extracts shared GA4 analytics into a reusable library (
lib/ga4-asset-fetch.ts) — refactoring the inline GA4 Measurement Protocol code previously embedded inschema-analytics.tsinto a shared module. - Adds
asset_fetchGA4 event tracking to themarkdown-negotiationedge function — for negotiated Markdown responses (GET 2xx only). - Housekeeping — makes
index.htmlcase-sensitivity consistent, adds test scaffolding, cleans up cSpell dictionaries, and updates the analytics plan document.
Consistency & Accuracy Observations
✅ Plan-to-code alignment is strong
The asset-fetch-analytics.plan.md specifies event parameters (asset_group, asset_path, asset_ext, content_type, status_code, original_path) and the code in both markdown-negotiation/index.ts and schema-analytics/index.ts faithfully sends exactly these. The plan's gating rules (GET-only, 2xx for Markdown, 2xx+3xx for schemas) are correctly implemented.
✅ Shared library extraction is clean
The refactored lib/ga4-asset-fetch.ts correctly reproduces the same logic that was previously inlined in schema-analytics.ts (cookie parsing, client ID resolution, GA4 payload construction, waitUntil-based sending). The old schema-analytics.ts on main directly called Netlify.env.get(); the new shared module wraps this in netlifyEnvGet() with a graceful fallback + warning when globalThis.Netlify is absent — making it testable under Node without the Netlify runtime.
✅ isIndexHtmlPath case-sensitivity change is intentional and correct
On main, isIndexHtmlPath used .toLowerCase().endsWith('/index.html') (case-insensitive). The PR changes it to exact-match pathname.endsWith('/index.html') (case-sensitive). The resolveMarkdownPath regex was similarly narrowed from /index\.html$/i to /index\.html$/. This aligns with the plan's statement that "path resolution is case-sensitive by design" and normal URL-path semantics. Both functions are consistent with each other.
✅ original_path conditional inclusion is correct
...(url.pathname !== assetPath ? { original_path: url.pathname } : {}),This correctly omits original_path when the request path already matches the resolved .md path (e.g., a direct /foo/index.md request — though those are filtered out earlier by shouldConsiderRequest). The compactStringParams function in the library also strips undefined values as a belt-and-suspenders measure. Test coverage confirms both cases.
✅ Removed netlify.toml headers block is justified
The removed /schemas/:version content-type header block was a fallback in case the edge function was removed. Since ensureSchemaContentType in the refactored schema-analytics/index.ts still explicitly sets application/yaml for 2xx schema responses, removing the TOML fallback is consistent. The code comment explaining the rationale has been preserved in the function itself.
⚠️ Minor observation: content_type for Markdown events will always be text/markdown
In markdown-negotiation/index.ts lines 57 and 68:
headers.set('content-type', 'text/markdown; charset=utf-8');
// ...
content_type: normalizeContentType(headers.get('content-type') ?? ''),The content-type header is set to text/markdown; charset=utf-8 on line 57, and normalizeContentType on line 68 reads from the same headers object, so content_type in the event will always be text/markdown. This is accurate — but it means the dimension is effectively constant for this asset group. Not a bug, just worth noting that this parameter carries no additional signal for Markdown events specifically. The plan doesn't require it to vary, so this is fine.
⚠️ No GA4 integration test for schema-analytics handler
The schema-analytics/index.test.ts handler integration tests don't verify that enqueueAssetFetchEvent is actually called (no waitUntil spy or mocked Netlify.env). The tests verify content-type fixup and the tracking gate thoroughly, but the actual GA4 send path is only integration-tested via live checks. This is an existing gap that predates this PR (the old code had no unit tests at all), so it's strictly an improvement — but worth noting for completeness.
✅ Test coverage is thorough and well-structured
lib/ga4-asset-fetch.test.ts: CoversnormalizeContentType,resolveClientId(no cookie, standard GA cookie, non-standard, multiple cookies),enqueueAssetFetchEvent(no-op withoutwaitUntil, full payload verification includingundefinedstripping).markdown-negotiation/analytics.test.ts: Covers GET withoriginal_path, GETindex.html, direct.mdpass-through, HEAD (no event), and markdown-unavailable fallback (no event).schema-analytics/index.test.ts: CoversensureSchemaContentType,shouldTrackSchemaFetchfor various methods/statuses/content types, and handler integration.negotiation.test.ts: CoversprefersMarkdownOverHtmlpreference logic.
✅ netlify.toml edge function registration is correct
The markdown-negotiation function on /* runs first (for all paths), and schema-analytics runs on /schemas/*. Both are correctly registered. The schema function now imports from the shared lib via relative path (../lib/ga4-asset-fetch.ts), and the re-export shim schema-analytics.ts at the edge-functions root is preserved.
✅ cSpell changes are self-consistent
chalin is moved to .cspell.yml (global word list) and removed from per-file cSpell:ignore in multiple localized sig-practices.md files. llms is added globally. The # patched comments on default_lang_commit are a known convention in this repo for localized pages.
Verdict
The PR is well-structured and internally consistent. The code faithfully implements the analytics plan. The shared library extraction is clean, test coverage is comprehensive, and the case-sensitivity change is intentional and documented. The only minor observations are that content_type is constant for Markdown events (by design) and that the schema handler integration tests don't verify the GA4 call path (an existing gap, not a regression). Overall this looks solid.

/schemas/:versionNetlify header rule so the schema Edge Function owns YAML content-type preservationasset_fetchfor negotiated Markdown responses withasset_pathand sparseoriginal_path.mdpass-through tracking out of scope and documents it as a follow-up stepindex.htmlnegotiation case-sensitive in the implementation✅ Live integration tests