PDF upload falsely rejected as "password protected" #9754
Description
When uploading a PDF file via POST /api/user/projects/file/upload, some PDFs are incorrectly rejected with the error "Document is password protected" even though they can be opened without a password.
The root cause is in backend/onyx/file_processing/password_validation.py. The is_pdf_protected() function uses pypdf's reader.is_encrypted property, which returns True for any encrypted PDF, including PDFs that only have an owner password (restricting permissions like printing or copying) but have an empty user password. These files open freely in any PDF viewer without prompting for a password.
Reproduction
- Take any PDF that has permission restrictions (e.g. print/copy disabled) but no open password.
- Upload it via POST /api/user/projects/file/upload.
- The file is rejected: { "file_name": "...", "reason": "Document is password protected" }.
Such PDFs can be identified with tools like qpdf --show-encryption: the output reports that the file is encrypted but shows an empty user password.
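If no such file is at hand, one can probably be generated for testing. The following is a minimal sketch, assuming a recent pypdf release in which PdfWriter.encrypt() accepts user_password and owner_password keyword arguments. The helper names make_owner_only_pdf and inspect_pdf and the password "owner-secret" are made up for illustration and are not part of Onyx.

```python
import io


def make_owner_only_pdf(owner_password: str = "owner-secret") -> bytes:
    """Build a one-page PDF with an owner password and an empty user password."""
    from pypdf import PdfWriter  # lazy import, mirroring the affected module

    writer = PdfWriter()
    writer.add_blank_page(width=72, height=72)
    # Empty user password: viewers should open the file without prompting.
    writer.encrypt(user_password="", owner_password=owner_password)
    buffer = io.BytesIO()
    writer.write(buffer)
    return buffer.getvalue()


def inspect_pdf(data: bytes) -> tuple[bool, bool]:
    """Return (is_encrypted, opens_with_empty_password) for the given PDF bytes."""
    from pypdf import PdfReader

    reader = PdfReader(io.BytesIO(data))
    encrypted = bool(reader.is_encrypted)
    # decrypt() returns 0 when the supplied password does not unlock the file.
    opens = (not encrypted) or reader.decrypt("") != 0
    return encrypted, opens
```

Calling inspect_pdf(make_owner_only_pdf()) is expected, per the behaviour described in this issue, to report the file as encrypted yet openable with an empty password. Writing the bytes to disk gives a file suitable for the upload step above.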
Affected code
backend/onyx/file_processing/password_validation.py:31-37 — is_pdf_protected()
def is_pdf_protected(file: IO[Any]) -> bool:
    from pypdf import PdfReader
    with preserve_position(file):
        reader = PdfReader(file)
        return bool(reader.is_encrypted)  # ← false positive for owner-password-only PDFs
Recommended fix
If the PDF is encrypted, attempt to decrypt it with an empty user password. pypdf's decrypt() returns 0 on failure, 1 if the user password matched, 2 if the owner password matched. Only reject if decryption with an empty string fails (returns 0):
def is_pdf_protected(file: IO[Any]) -> bool:
    from pypdf import PdfReader
    with preserve_position(file):
        reader = PdfReader(file)
        if not reader.is_encrypted:
            return False
        try:
            # 0 means the empty password did not unlock the document
            return reader.decrypt("") == 0
        except Exception:
            return True
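The three outcomes of this check can be exercised without real PDF files. The sketch below rests on stated assumptions: preserve_position is a stand-in for the project's helper (assumed here to simply restore the stream offset), the reader_factory parameter is added only so a fake reader can be injected, and FakeReader is a hypothetical test double. None of these are taken from the Onyx codebase.

```python
import io
from contextlib import contextmanager
from typing import IO, Any, Callable, Iterator, Optional


@contextmanager
def preserve_position(file: IO[Any]) -> Iterator[IO[Any]]:
    """Stand-in for the project's helper: restore the stream offset on exit."""
    position = file.tell()
    try:
        yield file
    finally:
        file.seek(position)


def is_pdf_protected(
    file: IO[Any], reader_factory: Optional[Callable[[IO[Any]], Any]] = None
) -> bool:
    if reader_factory is None:
        from pypdf import PdfReader  # real reader unless a fake is injected

        reader_factory = PdfReader
    with preserve_position(file):
        reader = reader_factory(file)
        if not reader.is_encrypted:
            return False
        try:
            # 0 means the empty password did not unlock the document.
            return reader.decrypt("") == 0
        except Exception:
            return True


class FakeReader:
    """Hypothetical test double exposing the two members the check uses."""

    def __init__(self, is_encrypted: bool, decrypt_result: int) -> None:
        self.is_encrypted = is_encrypted
        self._decrypt_result = decrypt_result

    def decrypt(self, password: str) -> int:
        return self._decrypt_result


stream = io.BytesIO(b"%PDF-placeholder")
unencrypted = is_pdf_protected(stream, lambda f: FakeReader(False, 0))
owner_only = is_pdf_protected(stream, lambda f: FakeReader(True, 1))
user_locked = is_pdf_protected(stream, lambda f: FakeReader(True, 0))
```

With these fakes, only user_locked is True. The owner-password-only case (decrypt("") returning 1) is accepted, which is the behaviour the fix is meant to restore.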