[editorial] Use canonical link into semconv #4554

Merged
reyang merged 1 commit into open-telemetry:main from
chalin:chalin-im-use-canonical-urls-2025-06-12
on Jun 12, 2025

Conversation

@chalin chalin commented Jun 12, 2025


chalin commented Jun 12, 2025

Can someone rerun the checks, now that NPM is back up?

@reyang reyang enabled auto-merge June 12, 2025 22:26
@reyang reyang added this pull request to the merge queue Jun 12, 2025
Merged via the queue into open-telemetry:main with commit 8f01d12 Jun 12, 2025
8 of 11 checks passed
@chalin chalin deleted the chalin-im-use-canonical-urls-2025-06-12 branch June 12, 2025 22:49
github-merge-queue Bot pushed a commit that referenced this pull request Jun 16, 2025
Related to @chalin's
#4554

We could enforce no redirects via lychee's `max_redirects = 0`
configuration, but we'd need to add a few exclusions for that to work,
and it would only make our link-check failures more common as
third-party sites move things around. A better option to address
@chalin's specific ask would probably be a separate lychee run with
`max_redirects = 0` that only checks https://opentelemetry.io links.
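As a rough sketch, that separate run could use its own config file. `max_redirects` is a lychee config option; the filename and the include pattern below are illustrative, not taken from this repo:

```toml
# lychee-strict.toml (hypothetical) — fail on any redirect,
# but only for links into our own site.
max_redirects = 0
include = ["https://opentelemetry\\.io.*"]
```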

Most of this was done using:

<details>
<summary>python script</summary>

```python
import os
import re
import requests
import concurrent.futures

def update_links_in_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    def replacer(match):
        url = match.group(2)
        new_url = get_redirect_url(url)
        if new_url and new_url != url:
            return f'[{match.group(1)}]({new_url})'
        return match.group(0)

    def replacer_ref(match):
        url = match.group(2)
        new_url = get_redirect_url(url)
        if new_url and new_url != url:
            return f'[{match.group(1)}]: {new_url}'
        return match.group(0)

    def replacer_html(match):
        url = match.group(1)
        new_url = get_redirect_url(url)
        if new_url and new_url != url:
            return f'href="{new_url}"'
        return match.group(0)

    # Markdown link: [text](https://...)
    pattern = re.compile(
        r'\['               # opening square bracket for the text
        r'([^]]+)'          # group 1: text
        r']'                # closing square bracket for the text
        r'\('               # opening parenthesis for the URL
        r'(https://[^)]+)'  # group 2: URL
        r'\)'               # closing parenthesis for the URL
    )
    new_content = pattern.sub(replacer, content)

    # Markdown link: reference-style [label]: https://...
    pattern = re.compile(
        r'\['               # opening square bracket for the ref
        r'([^]]+)'          # group 1: ref
        r']: '              # closing square bracket for the ref
        r'(https://.*)'  # group 2: URL
    )
    new_content = pattern.sub(replacer_ref, new_content)

    # Markdown link: html
    pattern = re.compile(
        r'href="'
        r'(https://[^"]+)'
        r'"'
    )
    new_content = pattern.sub(replacer_html, new_content)

    if new_content != content:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(new_content)
        print(f'Updated: {filepath}')

def get_redirect_url(url):
    if url.startswith('https://cloud-native.slack.com/archives/'):
        # keep these short links as they are
        return None
    try:
        resp = requests.head(url, allow_redirects=True, timeout=5)
        if resp.history and resp.status_code == 200:
            for r in resp.history:
                if r.status_code == 301 or r.status_code == 302:
                    if resp.url.startswith('https://en.wikipedia.org'):
                        return resp.url.replace('https://en.wikipedia.org', 'https://wikipedia.org')
                    if resp.url.startswith('https://github.com/login?return_to') or resp.url.startswith('https://accounts.google.com/v3/signin/'):
                        # this link requires authentication, so we can't do anything with it
                        return None
                    if resp.url.startswith('http://arxiv.org'):
                        return resp.url.replace('http://arxiv.org', 'https://arxiv.org')
                    if resp.url.startswith('https://pkg.go.dev/'):
                        # no need for this query parameter
                        return re.sub(r'\?utm_source=godoc(?=#|$)', '', resp.url)
                    return resp.url
    except Exception:
        pass
    return None


filepaths = []

for dirpath, _, filenames in os.walk('.'):
    if 'node_modules' in dirpath.split(os.sep):
        continue
    for filename in filenames:
        if filename == 'CHANGELOG.md':
            continue
        if filename.endswith('.md'):
            filepaths.append(os.path.join(dirpath, filename))

with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(update_links_in_file, filepaths)
```
</details>
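The inline-link substitution above can be exercised without any network calls by stubbing the redirect lookup; the URLs and the `REDIRECTS` map below are hypothetical stand-ins for live HTTP results:

```python
import re

# Hypothetical redirect map standing in for requests.head() lookups.
REDIRECTS = {
    "https://opentelemetry.io/docs/languages/": "https://opentelemetry.io/docs/platforms/",
}

# Same inline-link pattern as in the script: [text](https://...)
pattern = re.compile(r'\[([^]]+)]\((https://[^)]+)\)')

def replacer(match):
    url = match.group(2)
    new_url = REDIRECTS.get(url)
    if new_url and new_url != url:
        return f'[{match.group(1)}]({new_url})'
    return match.group(0)  # leave non-redirecting links untouched

text = "See the [language guides](https://opentelemetry.io/docs/languages/)."
print(pattern.sub(replacer, text))
# -> See the [language guides](https://opentelemetry.io/docs/platforms/).
```

Links that don't redirect (no entry in the map) are returned unchanged, mirroring the `return None` path in `get_redirect_url`.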

---------

Co-authored-by: Carlos Alberto Cortez <calberto.cortez@gmail.com>
github-merge-queue Bot pushed a commit that referenced this pull request Jun 20, 2025
Resolves @chalin's #4554 "how to avoid this in the future"

---------

Co-authored-by: Patrice Chalin <chalin@users.noreply.github.com>
