Summary
Arc's user-SQL validator (internal/api/query.go:ValidateSQLRequest) blocked only read_parquet( and arc_partition_agg( via a regex denylist. The broader DuckDB I/O function family (read_csv_auto, read_csv, read_json, read_json_auto, read_text, read_blob, glob, parquet_metadata, parquet_schema, read_xlsx, etc.) was not blocked. RBAC table-reference extraction inspected only FROM/JOIN clauses, so scalar table functions in the SELECT list slipped past both layers.
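The gap can be illustrated with a minimal sketch. The pattern below is a hypothetical stand-in, not Arc's actual expression; it assumes only that the denylist matched the two function names listed above. The names narrowDenylist and isBlocked are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// narrowDenylist is a HYPOTHETICAL stand-in for the pre-fix pattern. It knows
// only the two functions the advisory names; the real expression may differ.
var narrowDenylist = regexp.MustCompile(`(?i)\b(read_parquet|arc_partition_agg)\s*\(`)

// isBlocked reports whether the denylist would reject the statement.
func isBlocked(sql string) bool {
	return narrowDenylist.MatchString(sql)
}

func main() {
	queries := []string{
		"SELECT * FROM read_parquet('/data/other_tenant/x.parquet')",
		"SELECT * FROM read_csv_auto('/etc/passwd')",
		"SELECT * FROM glob('/var/lib/arc/*')",
	}
	for _, q := range queries {
		// Only the first query matches; the other two pass validation.
		fmt.Printf("blocked=%v  %s\n", isBlocked(q), q)
	}
}
```

Any function name missing from the alternation passes validation, which is why a denylist of this shape cannot keep up with DuckDB's I/O function family.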
Impact
Any authenticated user, including a token with permissions: [], can read arbitrary local files via:
POST /api/v1/query
Authorization: Bearer <token>
{"sql": "SELECT * FROM read_csv_auto('/etc/passwd', header=false, columns={'l':'VARCHAR'}) LIMIT 5"}
Confirmed reachable targets:
- auth.db: bcrypt hashes for every API token, plus legacy SHA-256 rows.
- arc.toml: S3 secrets, TLS keys.
- /proc/self/environ: environment-variable secrets.
- Cross-tenant Parquet files: bypasses RBAC because the tenant scope is enforced at the table layer, not on raw file paths.
- SSRF when httpfs is loaded (any S3-backed deployment): read_csv_auto('http://169.254.169.254/latest/meta-data/...') reaches instance metadata IPs.
Patches
Fixed in 2026.06.1 (PR #442) via a structural sandbox at the DuckDB layer:
- SET GLOBAL allowed_directories = [...] enumerates Arc's legitimate filesystem prefixes (storage roots + tier prefixes + import upload dir + compaction temp).
- SET GLOBAL enable_external_access = false (one-way at runtime).
- Verified by reading back the flag.
After lockdown, DuckDB refuses to open any file outside the allowlist and refuses further INSTALL/LOAD. Already-loaded extensions remain callable.
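A rough sketch of the lockdown sequence, assuming the statements are built as strings and then executed in order on the DuckDB connection. The helper names and directory paths are hypothetical, and the actual code in PR #442 may be structured differently. The allowlist is issued first because the second setting is one-way.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteSQLString wraps s in single quotes, doubling any embedded quote.
func quoteSQLString(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// lockdownStatements (hypothetical helper) returns the statements in the
// order they should run: allowlist, one-way switch, then a readback query.
func lockdownStatements(dirs []string) []string {
	quoted := make([]string, len(dirs))
	for i, d := range dirs {
		quoted[i] = quoteSQLString(d)
	}
	return []string{
		"SET GLOBAL allowed_directories = [" + strings.Join(quoted, ", ") + "]",
		"SET GLOBAL enable_external_access = false",
		// Readback: the caller should check that this returns false.
		"SELECT current_setting('enable_external_access')",
	}
}

func main() {
	// Example paths only; real deployments would pass their storage roots.
	dirs := []string{"/var/lib/arc/data", "/var/lib/arc/tmp"}
	for _, stmt := range lockdownStatements(dirs) {
		fmt.Println(stmt)
	}
}
```

Failing startup when the readback does not report false would keep a misconfigured instance from serving queries without the sandbox.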
Workarounds
- Restrict API access to known-trusted networks via firewall rules.
- Temporary mitigation: add read_csv*/read_json*/glob etc. to dangerousSQLPattern in internal/api/query.go pending 2026.06.1.
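As a stopgap, the temporary mitigation might look like the sketch below. The existing shape of dangerousSQLPattern is not shown in this advisory, so the expression here is an assumption: it widens the match to every read_* and parquet_* function plus glob. The names widenedDenylist and isBlocked are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// widenedDenylist is an ASSUMED replacement pattern, not Arc's actual one.
// It matches any read_* or parquet_* function call, plus glob and
// arc_partition_agg, case-insensitively.
var widenedDenylist = regexp.MustCompile(
	`(?i)\b(read_\w+|parquet_\w+|glob|arc_partition_agg)\s*\(`)

// isBlocked reports whether the widened denylist rejects the statement.
func isBlocked(sql string) bool {
	return widenedDenylist.MatchString(sql)
}

func main() {
	queries := []string{
		"SELECT * FROM read_csv_auto('/etc/passwd')",
		"SELECT * FROM parquet_metadata('/data/t1/x.parquet')",
		"SELECT count(*) FROM cpu WHERE host = 'a'",
	}
	for _, q := range queries {
		fmt.Printf("blocked=%v  %s\n", isBlocked(q), q)
	}
}
```

A pattern like this remains a denylist and can miss functions it does not name, so it is best treated as a bridge until the 2026.06.1 sandbox is deployed.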
Credits
Reported by Alex Manson (@NeuroWinter, https://neurowinter.com/) on 2026-05-19.
References