Problem
The link checker in links/check.go uses HEAD requests exclusively. Some sites don't handle HEAD correctly and return 404 or 405 even though the page exists:
- crates.io: https://crates.io/crates/ic-vetkeys returns 404 for HEAD (SPA that requires Accept: text/html), but 200 for GET. The crate exists (v0.6.0, 93k downloads).
- This is a known issue across link checkers: tools like lychee and markdown-link-check fall back to GET when HEAD fails.
Reproduction
# HEAD returns 404
curl -sI "https://crates.io/crates/ic-vetkeys" | head -1
# HTTP/2 404
# GET returns 200
curl -s -o /dev/null -w "%{http_code}" "https://crates.io/crates/ic-vetkeys"
# 200
Question
Is the HEAD-only approach intentional (e.g. to avoid bandwidth/rate-limiting concerns), or would a GET fallback when HEAD returns 404/405 be welcome? Happy to submit a PR if the fallback approach seems reasonable.
Context
Discovered while running skill-validator check on dfinity/icskills. The false positive blocks CI deployment.