Consolidate www-equivalence into a single isSameSite predicate
Four call sites had grown independent www-handling implementations
(stripWww in to-md-urls, isWwwVariant + isSameOriginIgnoringWww in
get-page-urls, ad-hoc two-origin checks in walkAggregateLinks). Each
inlined its own scheme/port strictness, leaving the rule split across
files with no single source of truth. Any new "same site" tweak
meant remembering to update every call site.
Replace all four with one predicate: isSameSite(url1, url2). Same
canonical-host comparison everywhere, scheme deliberately ignored
(http→https on the same host is a canonical upgrade), port-strict.
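
For illustration, a predicate with these properties might look like the sketch below. It is not the code from this commit. The canonicalHost helper, the handling of unparseable input, and the use of URL.port for port strictness are all assumptions.

```typescript
// Illustrative sketch only; not the code from this commit.

/** Lowercase the hostname and drop a single leading "www." label. */
function canonicalHost(hostname: string): string {
  const lower = hostname.toLowerCase();
  return lower.startsWith("www.") ? lower.slice("www.".length) : lower;
}

/**
 * True when both URLs share a canonical host and the same explicit port.
 * The scheme is deliberately not compared.
 */
function isSameSite(url1: string, url2: string): boolean {
  let a: URL;
  let b: URL;
  try {
    a = new URL(url1);
    b = new URL(url2);
  } catch {
    // Assumption: unparseable input is never treated as "same site".
    return false;
  }
  // URL.port is "" when the port is the scheme's default, so
  // http://example.com and https://example.com both compare as "".
  return (
    canonicalHost(a.hostname) === canonicalHost(b.hostname) &&
    a.port === b.port
  );
}
```

Under these assumptions, http://example.com/a and https://www.example.com/b count as the same site, while example.com:8443 and example.com do not.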
Behavior changes (both correctness improvements):
- getPathFilterBase now preserves the base path when origins differ
only by scheme, not just by www. Previously a scheme mismatch dropped
the base to root.
- shouldInclude / scopeUrls now accept sitemap URLs with mismatched
scheme. Real sitemaps occasionally have stale http entries; they
resolve fine after the redirect.
walkAggregateLinks still applies isSameSite twice — once against
ctx.origin and once against the effective origin — because true
cross-host redirects (e.g. example.com → docs.example.com) leave
content discoverable at two genuinely-different origins.
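
A rough sketch of that double check follows, assuming a hypothetical isLinkInScope helper and a compact stand-in for isSameSite. Neither is taken from the actual walkAggregateLinks code, and the parameter names are illustrative.

```typescript
// Illustrative sketch only; names and structure are assumptions.

/** Compact stand-in for the shared predicate: www-insensitive host, strict port. */
function isSameSite(url1: string, url2: string): boolean {
  try {
    const a = new URL(url1);
    const b = new URL(url2);
    const strip = (h: string) => (h.startsWith("www.") ? h.slice(4) : h);
    return strip(a.hostname) === strip(b.hostname) && a.port === b.port;
  } catch {
    return false; // assumption: unparseable input never matches
  }
}

/**
 * Accept a link if it belongs to either the originally requested origin
 * or the origin reached after following redirects.
 */
function isLinkInScope(
  link: string,
  requestedOrigin: string,
  effectiveOrigin: string,
): boolean {
  return isSameSite(link, requestedOrigin) || isSameSite(link, effectiveOrigin);
}
```

With a requested origin of https://example.com and an effective origin of https://docs.example.com, links on either host pass, and links to any third host are rejected.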
Net: 50 lines removed, one shared module, one rule to update.