fix(machinery): limit allowed URLs#18684
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens project-level machine translation configuration and runtime HTTP fetching to reduce SSRF risk by blocking private-network targets (unless explicitly allowed) and by restricting what remote error details can be surfaced.
Changes:
- Introduces outbound URL/hostname validation helpers and wires them into machinery form/API validation (with an allowlist via
ALLOWED_MACHINERY_DOMAINS). - Adds runtime URL validation (DNS + peer IP checks) into
http_request, including redirect handling, to prevent private-target access and DNS rebinding. - Adjusts machinery error-detail handling to avoid exposing untrusted provider response bodies; adds
trusted_error_hostsfor known providers and expands tests/docs.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| weblate/utils/outbound.py | New outbound URL/hostname/IP validation helpers (config-time + runtime). |
| weblate/utils/requests.py | Adds URL-validation mode to http_request with redirect + peer-IP checks. |
| weblate/utils/tests/test_requests.py | Adds unit tests covering URL-validation behavior (private targets, redirects, proxy cases). |
| weblate/utils/validators.py | Adds machinery-specific URL/hostname validators using outbound guards + allowlist. |
| weblate/utils/tests/test_validators.py | Tests for the new machinery validators and allowlist behavior. |
| weblate/utils/models.py | Adds ALLOWED_MACHINERY_DOMAINS setting default. |
| weblate/machinery/forms.py | Validates endpoint fields and propagates allow_private_targets into machinery validation. |
| weblate/machinery/views.py | Passes allow_private_targets to forms; disables private targets for project-level edits. |
| weblate/machinery/models.py | Threads allow_private_targets through service configuration validation. |
| weblate/api/views.py | Enforces allow_private_targets=False for project machinery settings API. |
| weblate/api/tests.py | Adds API test ensuring private project targets are rejected. |
| weblate/machinery/base.py | Adds trusted-host logic, runtime URL validation, and revised error-detail extraction. |
| weblate/machinery/openai.py | Adds runtime base URL validation and trusted hosts for OpenAI/Azure OpenAI. |
| weblate/machinery/deepl.py | Declares DeepL trusted error hosts. |
| weblate/machinery/libretranslate.py | Declares LibreTranslate trusted error hosts. |
| weblate/machinery/anthropic.py | Declares Anthropic trusted error hosts. |
| weblate/machinery/microsoft.py | Improves Microsoft host detection for regional endpoints in error handling. |
| weblate/machinery/tests.py | Adds validation/error-detail tests and runtime URL validation coverage. |
| docs/admin/config.rst | Documents ALLOWED_MACHINERY_DOMAINS. |
| docs/changes.rst | Notes SSRF hardening and error-detail behavior change in release notes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9154c9d7fa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 040c2e5080
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 10ccdfaeb7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7aa92e9715
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
weblate/weblate/utils/requests.py
Lines 153 to 156 in e92b16e
This redirect rewrite changes all non-HEAD 301 responses to GET, but the previous requests behavior only rewrites POST on 301. For project-scoped machinery calls (which now go through this path with URL validation), a 301 on a DELETE/PUT request will silently drop the original method and body, so operations like glossary deletion can be sent as GET instead of the intended mutating verb. Please mirror requests 301 semantics here (rewrite only POST) to avoid behavior regressions.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
This grew too much by trying to handle all possible situations conveniently (like keeping useful error messages for trusted endpoints). |
|
I agree 😅 ! |
|
The fix for dns-rebinding is quite complex but I guess necessary. I was wondering if adding dnsmasq (to reject private IP resolution from public DNS servers) to the weblate docker image. It would not help users setting up without the image though, unless they follow specific instructions to add it to their setups. |
|
Yes, some issues might be addressed outside Weblate, but in practice that would leave them unresolved for some part of installations. |
amCap1712
left a comment
There was a problem hiding this comment.
Overall, the changes look good to me. The code in requests.py is a bit complex and handling a lot of different cases. I haven't completely grasped it yet. But I also couldn't find better solutions to apply at weblate level.
|
The latest updates on your projects. Learn more about Argos notifications ↗︎
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c1511746fc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| return | ||
|
|
||
| validate_runtime_url(url, allow_private_targets=allow_private_targets) |
There was a problem hiding this comment.
Validate connected peer for runtime URL checks
validate_request_url only resolves the hostname (validate_runtime_url) for direct traffic and returns before any socket-level verification, so callers like BaseOpenAITranslation can still be routed to a private IP via DNS rebinding between validation and the actual OpenAI/Azure request. This weakens the new SSRF hardening specifically for non-http_request machinery paths, because those requests never run the post-connect peer IP check used in _validate_response_peer.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 738c70c742
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Allow only public URLs by default to reduce risk of SSRF.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c2fcab62ad
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| def get_error_message(self, exc: Exception) -> str: | ||
| if isinstance(exc, RequestException) and exc.response and exc.response.text: | ||
| return f"{exc.__class__.__name__}: {exc}: {exc.response.text}" | ||
| return f"{exc.__class__.__name__}: {exc}" | ||
| message = f"{exc.__class__.__name__}: {exc}" | ||
| if isinstance(exc, RequestException) and exc.response: | ||
| detail = self.get_error_detail(exc.response) | ||
| if detail and detail not in str(exc): | ||
| return f"{message}: {detail}" |
There was a problem hiding this comment.
Route all error formatting through trust-filtered detail extraction
The new sanitization here only helps machinery classes that use the base implementation, but project-scoped backends can bypass it by overriding get_error_message; fresh evidence is DeepLTranslation.get_error_message (weblate/machinery/deepl.py, lines 95-110), which still returns exc.response.json()["message"] directly. In a project using a custom DeepL URL, an untrusted endpoint can therefore surface arbitrary response text to users despite this commit’s untrusted-error redaction logic. Please enforce trust-aware detail extraction at a non-overridable layer (or update overrides to call get_error_detail).
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_runtime_url_validation(self, mocked_getaddrinfo) -> None: | ||
| machine = self.MACHINE_CLS(self.CONFIGURATION.copy()) | ||
| machine.delete_cache() | ||
| machine.settings["_project"] = Mock() | ||
|
|
||
| with ( | ||
| patch.object(machine.client.models, "list") as mocked_list, | ||
| self.assertRaises(ValidationError), | ||
| ): | ||
| machine.get_model() | ||
|
|
||
| mocked_getaddrinfo.assert_called_once() | ||
| mocked_list.assert_not_called() |
There was a problem hiding this comment.
This test expects validate_runtime_url() to perform local DNS validation (getaddrinfo) and raise before any client call. That behavior is skipped when an HTTP(S) proxy is detected from environment variables (validate_request_url() falls back to hostname-only validation), which can make this test fail in proxied environments. Consider patching/clearing proxy env vars (HTTPS_PROXY/HTTP_PROXY/ALL_PROXY/NO_PROXY) within the test to avoid environment-dependent behavior.
| if status_code == 302: | ||
| return True | ||
| return status_code == 301 and normalized_method == "POST" |
There was a problem hiding this comment.
_should_redirect_to_get() currently returns True for all 302 responses, which will change non-POST methods (e.g. DELETE/PUT) into GET on a 302 redirect. This diverges from requests’ redirect semantics and is inconsistent with the special-casing you added for 301; consider only switching to GET for 301/302 when the original method is POST (and for 303 for all non-HEAD methods).
| if status_code == 302: | |
| return True | |
| return status_code == 301 and normalized_method == "POST" | |
| if status_code in (301, 302) and normalized_method == "POST": | |
| return True | |
| return False |
| with self.assertRaises(ValidationError): | ||
| http_request( | ||
| "get", | ||
| "https://public.example.com/source", | ||
| validate_url=True, | ||
| allow_private_targets=False, | ||
| ) | ||
|
|
There was a problem hiding this comment.
This test assumes direct connections (no HTTP(S) proxy) so that runtime DNS/peer-IP validation runs and the request is blocked before any fetch. Since http_request() auto-detects proxies from environment variables, the test becomes environment-dependent (e.g., HTTPS_PROXY/NO_PROXY can make it treat the request as proxied and skip getaddrinfo/peer checks). Consider clearing proxy-related env vars (or explicitly forcing proxies={}) inside the test to keep it deterministic.
Allow only public URLs by default to reduce risk of SSRF.