PDF upload falsely rejected as "password protected" #9754

@fejesd

Description

When uploading a PDF file via POST /api/user/projects/file/upload, some PDFs are incorrectly rejected with the error "Document is password protected" even though they can be opened without a password.

The root cause is in backend/onyx/file_processing/password_validation.py. The is_pdf_protected() function uses pypdf's reader.is_encrypted property, which returns True for any encrypted PDF, including PDFs that have only an owner password (restricting permissions such as printing or copying) and an empty user password. These files open freely in any PDF viewer without prompting for a password.

Reproduction

  1. Take any PDF that has permission restrictions (e.g. print/copy disabled) but no open password.
  2. Upload it via POST /api/user/projects/file/upload.
  3. The file is rejected: { "file_name": "...", "reason": "Document is password protected" }.

Such PDFs can be identified with tools like qpdf --show-encryption: the output lists the encryption parameters and permission restrictions, but the user password it reports is empty.

Affected code

backend/onyx/file_processing/password_validation.py:31-37 — is_pdf_protected()

  def is_pdf_protected(file: IO[Any]) -> bool:
      from pypdf import PdfReader
      with preserve_position(file):
          reader = PdfReader(file)
      return bool(reader.is_encrypted)  # ← false positive for owner-password-only PDFs

Recommended fix

If the PDF is encrypted, attempt to decrypt it with an empty user password. pypdf's decrypt() returns 0 on failure, 1 if the user password matched, 2 if the owner password matched. Only reject if decryption with an empty string fails (returns 0):

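  def is_pdf_protected(file: IO[Any]) -> bool:
      from pypdf import PdfReader
      with preserve_position(file):
          reader = PdfReader(file)
      if not reader.is_encrypted:
          return False
      try:
          # decrypt() returns 0 when the password does not match, so a failed
          # empty password means a real open password is required.
          return reader.decrypt("") == 0
      except Exception:
          # If decryption cannot be attempted at all, keep rejecting the file.
          return True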
