Fix IPv6 Zone ID decoding to correctly handle RFC 6874 %25 separator #1653

Open
rodrigobnogueira wants to merge 1 commit into aio-libs:master from rodrigobnogueira:fix-ipv6-zone-id

@rodrigobnogueira
Member

What do these changes do?

Fixes incorrect decoding of IPv6 Zone IDs in URLs containing the RFC 6874 %25-encoded zone separator.

Background

RFC 6874 defines the format for IPv6 Zone IDs in URIs:

IPv6addrz = IPv6address "%25" ZoneID

So in http://[fe80::1%251]/, the zone ID is 1 (not 251), because %25 is the percent-encoding of %.
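
To make the decoding step concrete, here is a minimal sketch that uses only Python's standard library, not yarl itself. It percent-decodes the bracketed host once and shows that the text after the address becomes `%1`, so the zone ID is `1`. The variable names are illustrative.

```python
from urllib.parse import unquote

# Host portion of http://[fe80::1%251]/ with the brackets removed.
raw_host = "fe80::1%251"

# A single round of percent-decoding turns "%25" into "%".
decoded_host = unquote(raw_host)

# Splitting on the RFC 6874 separator yields the address and the zone ID.
address, separator, zone_id = raw_host.partition("%25")

print(decoded_host)  # fe80::1%1
print(zone_id)       # 1
```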

The bug

Two issues in yarl/_url.py:

  1. _encode_host() split the host on bare %, treating the raw-encoded zone string 251 as the zone ID instead of recognising %25 as the delimiter.
  2. .host property returned the raw (percent-encoded) value unchanged for IP addresses, so %25 was never decoded to % for the caller.
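
The first issue can be reproduced in isolation. The sketch below is not the code from yarl/_url.py; `split_zone_naive` is a hypothetical helper that only mimics the described behaviour of splitting on the first bare `%`.

```python
def split_zone_naive(raw_host: str) -> tuple[str, str]:
    """Hypothetical helper: split on the first bare '%', as described in the bug."""
    address, _, zone = raw_host.partition("%")
    return address, zone


# The "25" from the "%25" separator leaks into the zone ID.
print(split_zone_naive("fe80::1%251"))     # ('fe80::1', '251')
print(split_zone_naive("fe80::1%25eth0"))  # ('fe80::1', '25eth0')
```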

The fix

  • _encode_host(): when %25 is present in the host string, partition on %25 (the RFC 6874 separator) instead of bare %. The raw form (raw_host / str(url)) is preserved unchanged.
  • .host property: for IP addresses that contain %25, replace %25 with % before returning, so callers receive the human-readable zone identifier.
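
The two bullets above can be sketched with the standard library alone. This is an approximation of the approach under stated assumptions, not the actual patch: `split_ipv6_zone` and `readable_host` are hypothetical names, and the real change lives in `_encode_host()` and the `.host` property.

```python
import ipaddress

RFC6874_SEPARATOR = "%25"


def split_ipv6_zone(raw_host: str) -> tuple[str, str]:
    """Hypothetical helper: prefer the RFC 6874 separator, fall back to bare '%'."""
    if RFC6874_SEPARATOR in raw_host:
        address, _, zone = raw_host.partition(RFC6874_SEPARATOR)
    else:
        address, _, zone = raw_host.partition("%")
    # Raises ValueError (AddressValueError) if the address part is not valid IPv6.
    ipaddress.IPv6Address(address)
    return address, zone


def readable_host(raw_host: str) -> str:
    """Hypothetical helper: return the host with a literal '%' before the zone ID."""
    address, zone = split_ipv6_zone(raw_host)
    return f"{address}%{zone}" if zone else address


print(readable_host("fe80::1%251"))     # fe80::1%1
print(readable_host("fe80::1%25eth0"))  # fe80::1%eth0
```

The raw string is never modified here, which mirrors the stated goal of leaving raw_host and str(url) unchanged while only the decoded view differs.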

Before / After

| URL | .raw_host | .host (before) | .host (after) |
|---|---|---|---|
| http://[fe80::1%251]/ | fe80::1%251 | fe80::1%251 | fe80::1%1 |
| http://[fe80::1%25eth0]/ | fe80::1%25eth0 | fe80::1%25eth0 | fe80::1%eth0 |

Related

This was identified as part of a security report about URL parsing inconsistencies. While this specific bug is not a practical SSRF vector (zone IDs are local-scope only), the incorrect decoding is a standards-compliance issue that could cause yarl to disagree with other RFC 6874-aware parsers.
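
As an illustration of how such a disagreement can surface, the sketch below passes both forms of the host to Python's `ipaddress` module, which accepts scoped IPv6 addresses (Python 3.9+) but does no percent-decoding of its own. It is only a demonstration of the mismatch, not part of this PR.

```python
import ipaddress

# Still-encoded host string, as returned by .host before the fix.
encoded = ipaddress.IPv6Address("fe80::1%251")

# Decoded host string, as returned by .host after the fix.
decoded = ipaddress.IPv6Address("fe80::1%1")

# The consumer sees two different zones for what is the same URL.
print(encoded.scope_id)  # 251
print(decoded.scope_id)  # 1
```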

Per RFC 6874, an IPv6 Zone ID in a URI is encoded as:
  IPv6addrz = IPv6address "%25" ZoneID

So in 'http://[fe80::1%251]/', the zone ID is '1', not '251'.

Previously, _encode_host() split the host on bare '%', treating '251'
as the zone ID.  The host property also returned the raw (encoded)
value unchanged for IP addresses, so %25 was never decoded.

Fix _encode_host() to partition on '%25' (RFC 6874 separator) when
present, preserving it verbatim in raw_host / str(url), and update
the host property to decode '%25' -> '%' so callers receive the
human-readable zone identifier (e.g. 'fe80::1%1' / 'fe80::1%eth0').

Tests added for:
- Numeric zone ID: http://[fe80::1%251]/  -> host='fe80::1%1'
- String zone ID:  http://[fe80::1%25eth0]/ -> host='fe80::1%eth0'
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided label (There is a change note present in this PR) on Apr 12, 2026
@codspeed-hq

codspeed-hq bot commented Apr 12, 2026

Merging this PR will not alter performance

✅ 99 untouched benchmarks


Comparing rodrigobnogueira:fix-ipv6-zone-id (97ac79d) with master (2f180d1)

@codecov

codecov bot commented Apr 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.47%. Comparing base (2f180d1) to head (97ac79d).

❌ Your project check has failed because the head coverage (97.63%) is below the target coverage (100.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1653   +/-   ##
=======================================
  Coverage   99.47%   99.47%           
=======================================
  Files          30       30           
  Lines        5942     5952   +10     
  Branches      283      285    +2     
=======================================
+ Hits         5911     5921   +10     
  Misses         22       22           
  Partials        9        9           
| Flag | Coverage Δ |
|---|---|
| CI-GHA | 99.47% <100.00%> (+<0.01%) ⬆️ |
| MyPy | 97.63% <100.00%> (+<0.01%) ⬆️ |
| OS-Linux | 99.70% <100.00%> (+<0.01%) ⬆️ |
| OS-Windows | 98.42% <100.00%> (+<0.01%) ⬆️ |
| OS-macOS | 98.57% <100.00%> (+<0.01%) ⬆️ |
| Py-3.10.11 | 98.40% <100.00%> (+<0.01%) ⬆️ |
| Py-3.10.20 | 99.63% <100.00%> (+<0.01%) ⬆️ |
| Py-3.11.15 | 99.63% <100.00%> (+<0.01%) ⬆️ |
| Py-3.11.9 | 98.40% <100.00%> (+<0.01%) ⬆️ |
| Py-3.12.10 | 98.40% <100.00%> (+<0.01%) ⬆️ |
| Py-3.12.13 | 99.63% <100.00%> (+<0.01%) ⬆️ |
| Py-3.13.12 | 99.68% <100.00%> (+<0.01%) ⬆️ |
| Py-3.13.13t | 99.68% <100.00%> (+<0.01%) ⬆️ |
| Py-3.14.3 | 99.68% <100.00%> (+<0.01%) ⬆️ |
| Py-3.14.4t | 99.68% <100.00%> (+<0.01%) ⬆️ |
| Py-pypy3.10.16-7.3.19 | 99.29% <100.00%> (+<0.01%) ⬆️ |
| VM-macos-latest | 98.57% <100.00%> (+<0.01%) ⬆️ |
| VM-ubuntu-latest | 99.70% <100.00%> (+<0.01%) ⬆️ |
| VM-windows-latest | 98.42% <100.00%> (+<0.01%) ⬆️ |
| pytest | 99.73% <100.00%> (+<0.01%) ⬆️ |

@rodrigobnogueira
Member Author

The codecov/project/typing check is failing at 97.63% against a 100% target, but this is a pre-existing issue unrelated to this PR.
