Pull through caching with relative href urls in HTML results in 404 #842

@MichelPoppema

Version

Plugin Versions:
  common: 0.31.1
  maven: 0.4.0

Installed from the Pulp in One Container image (pulp/pulp).

Describe the bug

[notice] A new release of pip is available: 25.0.1 -> 25.1
[notice] To update, run: pip install --upgrade pip
ERROR: Could not install requirement fastapi from http://owsoel39418.ont.belastingdienst.nl:8080/pulp/content/my-pypi/fastapi-0.115.12-py3-none-any.whl?redirect=https://nexus.belastingdienst.nl/nexus/packages/fastapi/0.115.12/fastapi-0.115.12-py3-none-any.whl#sha256=e94613d6c05e27be7ffebdd6ea5f388112e5e430c8f7d6494a9d1d88d43e814d because of HTTP error 404 Client Error: Error while fetching from upstream remote(https://nexus.belastingdienst.nl/nexus/packages/fastapi/0.115.12/fastapi-0.115.12-py3-none-any.whl): Not Found for url: http://owsoel39418.ont.belastingdienst.nl:8080/pulp/content/my-pypi/fastapi-0.115.12-py3-none-any.whl?redirect=https://nexus.belastingdienst.nl/nexus/packages/fastapi/0.115.12/fastapi-0.115.12-py3-none-any.whl for URL http://owsoel39418.ont.belastingdienst.nl:8080/pulp/content/my-pypi/fastapi-0.115.12-py3-none-any.whl?redirect=https://nexus.belastingdienst.nl/nexus/packages/fastapi/0.115.12/fastapi-0.115.12-py3-none-any.whl#sha256=e94613d6c05e27be7ffebdd6ea5f388112e5e430c8f7d6494a9d1d88d43e814d (from http://owsoel39418.ont.belastingdienst.nl:8080/pypi/my-pypi/simple/fastapi/)

To Reproduce
Follow https://pulpproject.org/pulp_python/docs/user/guides/sync/#synchronize-a-repository, but use a Nexus simple repository as the remote instead of https://pypi.org/.

Expected behavior
pip install should succeed, and Pulp should cache the package from the remote if it is not already present.

Additional context
pypi_simple's class method from_html() uses base_url to resolve relative href URLs, since it does not know the actual URL of the page itself. If the page contains a <base href="..."> element, that is also used.
Nexus uses relative href URLs of the form ../../packages/<package>/<version>/<file> but does not include a <base> element in the simple HTML.
Resolving these URLs with base_url=remote.url (line 265 of pulp_python/app/pypi/views.py) leaves the resulting URL two path levels short, which results in a 404. Changing this to base_url=url fixed it.
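
The effect can be sketched with the standard library's urljoin, which follows the usual relative-reference resolution rules. This is only an illustration, not the pulp_python code: the host nexus.example.com and the path /repository/pypi-proxy/ are made-up stand-ins for the real remote, and the two base URLs are assumed to correspond to remote.url and to the URL of the project's simple page.

```python
from urllib.parse import urljoin

# Hypothetical values; the real remote URL is not shown in this issue.
remote_url = "https://nexus.example.com/repository/pypi-proxy/"
page_url = remote_url + "simple/fastapi/"

# Relative href in the style Nexus emits on its simple pages.
href = "../../packages/fastapi/0.115.12/fastapi-0.115.12-py3-none-any.whl"

# Resolving against the remote root climbs two levels too far.
wrong = urljoin(remote_url, href)

# Resolving against the page URL stays inside the repository path.
right = urljoin(page_url, href)

print(wrong)
print(right)
```

With the remote root as base, the two ".." segments strip repository/pypi-proxy from the path; with the page URL as base, they only strip simple/fastapi, which is what the relative link intends.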
