There's a function blocklist implemented in the Go server, but an allowlist is a much safer way to limit usage. IIRC, the biggest hurdle I ran into when looking at this a few months ago was implicit function execution. For example, a call like
read_parquet('gcs://some/file/that/uses/a/gcs/secret.parquet');
doesn't explicitly reference the `httpfs` extension, but the magic `gcs` or `s3` prefix triggers it to run. You might want to allow `read_parquet` for local files, but not for remote files. I was hoping that the JSON from `json_serialize_sql` would expand those cases to show the underlying function, but it didn't seem to.
Other than that, the tree walking functionality that is already built should accommodate this easily.
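
To make the local-vs-remote distinction concrete, here is a rough sketch of what an allowlist check could look like if it also inspected constant string arguments for remote prefixes. Everything in it is hypothetical: `FunctionCall`, `allowedFunctions`, `remotePrefixes`, and `checkCall` are names I made up for illustration, not the server's actual types, and the prefix list is an assumption that is almost certainly incomplete.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedFunctions is a hypothetical allowlist; the entries are illustrative only.
var allowedFunctions = map[string]bool{
	"read_parquet": true,
	"read_csv":     true,
}

// remotePrefixes is an assumed, non-exhaustive list of path prefixes that
// would implicitly route a read through a remote filesystem extension.
var remotePrefixes = []string{"gcs://", "gs://", "s3://", "http://", "https://"}

// FunctionCall is a hypothetical, simplified view of a function node as the
// tree walker might expose it: the function name plus any constant string
// arguments found directly under it.
type FunctionCall struct {
	Name       string
	StringArgs []string
}

// isRemotePath reports whether p starts with one of the assumed remote prefixes.
func isRemotePath(p string) bool {
	lower := strings.ToLower(strings.TrimSpace(p))
	for _, prefix := range remotePrefixes {
		if strings.HasPrefix(lower, prefix) {
			return true
		}
	}
	return false
}

// checkCall rejects functions that are not on the allowlist, and rejects
// allowlisted functions when a constant string argument looks remote.
func checkCall(c FunctionCall) error {
	name := strings.ToLower(c.Name)
	if !allowedFunctions[name] {
		return fmt.Errorf("function %q is not on the allowlist", name)
	}
	for _, arg := range c.StringArgs {
		if isRemotePath(arg) {
			return fmt.Errorf("function %q may not read remote path %q", name, arg)
		}
	}
	return nil
}
```

With this, `checkCall(FunctionCall{Name: "read_parquet", StringArgs: []string{"data/local.parquet"}})` returns nil, while the same call with the `gcs://` path from the example above returns an error. Note that this only catches constant string literals; a path built from an expression (concatenation, a subquery, a variable) would need to be rejected outright or handled some other way.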
cc @danielbodart