feat: google drive error resolution#9842

Open
evan-onyx wants to merge 3 commits into main from feat/resolve-errors-efficiency2
Conversation

@evan-onyx
Contributor

@evan-onyx evan-onyx commented Apr 1, 2026

Description

Add a new interface for connectors to implement, intended to be used for individual indexing error resolution. Implemented the interface for google drive.

How Has This Been Tested?

Added connector tests.

Additional Options

  • [Optional] Please cherry-pick this PR to the latest release version.
  • [Optional] Override Linear Check

Summary by cubic

Adds a new Resolver interface and Google Drive error resolution to re-fetch failed documents by webViewLink using batched Drive API calls. Emits ancestor folders and can sync permissions to reduce API calls and speed up recovery.

  • New Features

    • Added Resolver.resolve_errors(errors, include_permissions=False) to re-process failures and emit Document and HierarchyNode.
    • Implemented GoogleDriveConnector.resolve_errors with batched files().get (100-item chunks), yields per-link failures when fetches fail, emits ancestors before documents, and optionally syncs permissions.
    • Added get_files_by_web_view_links_batch using Drive BatchHttpRequest; skips invalid links and continues; daily tests cover single/multiple files, invalid links, empty inputs, entity-failure skips, and hierarchy validation.
  • Bug Fixes

    • Fixed Drive field selection for files().get by deriving single-file fields from list fields, preventing missing or incorrect metadata.
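A minimal sketch of what the `Resolver` interface summarized above might look like. The `Document`, `ConnectorFailure`, and `HierarchyNode` dataclasses below are simplified stand-ins for the real models in the connector framework, not the actual definitions:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generator, Union

# Stand-in types; the real Document, ConnectorFailure, and HierarchyNode
# live in the connector framework's model modules.
@dataclass
class Document:
    id: str

@dataclass
class ConnectorFailure:
    failed_document_id: str
    failure_message: str

@dataclass
class HierarchyNode:
    id: str

class Resolver(ABC):
    """Connectors implement this to re-process previously failed documents."""

    @abstractmethod
    def resolve_errors(
        self,
        errors: list[ConnectorFailure],
        include_permissions: bool = False,
    ) -> Generator[Union[Document, ConnectorFailure, HierarchyNode], None, None]:
        """Yield ancestor HierarchyNodes first, then re-fetched Documents,
        or replacement ConnectorFailures for items that still fail."""
        raise NotImplementedError
```

The generator return type matches the behavior described in the PR: ancestors are emitted before documents, and unresolvable items come back as new failures.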

Written for commit 043df22. Summary will update on new commits.

@evan-onyx evan-onyx requested a review from a team as a code owner April 1, 2026 23:20
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Preview Deployment

| Status | Preview | Commit | Updated |
| --- | --- | --- | --- |
|  | https://onyx-preview-8qlptlx5r-danswer.vercel.app | dc25903 | 2026-04-01 23:21:56 UTC |

@greptile-apps
Contributor

greptile-apps bot commented Apr 1, 2026

Greptile Summary

This PR introduces a new Resolver interface in interfaces.py and implements GoogleDriveConnector.resolve_errors() to quickly re-fetch and re-index documents that previously failed, using the Drive batch API. A helper function get_files_by_web_view_links_batch (with correct single-file field extraction via _extract_single_file_fields) is added in file_retrieval.py, and a new daily integration-test suite covers the main scenarios.

Key changes:

  • New Resolver ABC with resolve_errors(errors, include_permissions=False) returning a generator of Document | ConnectorFailure | HierarchyNode.
  • GoogleDriveConnector now implements Resolver; the implementation batches files via the Drive BatchHttpRequest API, walks ancestors for hierarchy, and converts files in parallel.
  • get_files_by_web_view_links_batch splits requests into ≤100-item chunks and propagates per-item errors as BatchRetrievalResult.errors.
  • Integration tests cover single/multi-file, invalid links, empty input, entity-failure skipping, and hierarchy-node validation.

Issue found:

  • resolve_errors fetches files with DriveFileFieldType.WITH_PERMISSIONS when exclude_domain_link_only is True, correctly acquiring the data needed for filtering — but the actual has_link_only_permission guard that exists in both _convert_retrieved_files_to_documents and _extract_slim_docs_from_google_drive is absent here. Files with domain-link-only access would be re-indexed through this path despite being excluded in all other code paths.

Confidence Score: 4/5

  • Mostly safe to merge but one P1 correctness issue should be addressed first: files with domain-link-only access are not filtered in resolve_errors, causing them to be re-indexed when exclude_domain_link_only=True.
  • There is one P1 logic bug: the exclude_domain_link_only filter is applied in all other indexing paths but is missing from resolve_errors, leading to incorrect document re-indexing for connectors configured with that option. All other aspects of the implementation are well-structured, the batch error propagation is sound, and the test suite is thorough.
  • backend/onyx/connectors/google_drive/connector.py — missing exclude_domain_link_only guard in resolve_errors.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/onyx/connectors/interfaces.py | Adds new Resolver abstract base class with resolve_errors(); clean interface definition, no issues found. |
| backend/onyx/connectors/google_drive/file_retrieval.py | Adds BatchRetrievalResult, get_files_by_web_view_links_batch, and field-extraction helpers; individual batch-item errors are correctly propagated, but the public wrapper has a redundant early-exit branch (flagged previously). |
| backend/onyx/connectors/google_drive/connector.py | Implements Resolver.resolve_errors() for GoogleDriveConnector; missing exclude_domain_link_only guard means link-only-access files are incorrectly re-indexed when that option is enabled. |
| backend/tests/daily/connectors/google_drive/test_resolver.py | New integration-test suite covering single/multi-file, invalid links, empty input, entity-failure skips, and hierarchy validation; previously flagged dead-code list comprehension has been removed. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant Connector as GoogleDriveConnector
    participant Batch as get_files_by_web_view_links_batch
    participant Drive as Drive BatchHttpRequest
    participant Ancestors as _get_new_ancestors_for_files
    participant Convert as _convert_retrieved_file_to_document

    Caller->>Connector: resolve_errors(errors, include_permissions)
    Connector->>Connector: extract doc_ids from errors[].failed_document
    Connector->>Batch: (service, doc_ids, field_type)
    loop chunks of 100
        Batch->>Drive: batch.add(files().get) per link
        Drive-->>Batch: callback(request_id, response, exception)
    end
    Batch-->>Connector: BatchRetrievalResult{files, errors}
    Connector-->>Caller: yield ConnectorFailure for each batch error
    Connector->>Ancestors: retrieved_files, permission_sync_context
    Ancestors-->>Connector: ancestor HierarchyNodes
    Connector-->>Caller: yield HierarchyNode (ancestors)
    Connector->>Convert: parallel (max_workers=8)
    Convert-->>Connector: Document | ConnectorFailure | None
    Connector-->>Caller: yield Document or ConnectorFailure
```
Path: backend/onyx/connectors/google_drive/connector.py
Line: 1719-1726

Comment:
**Missing `exclude_domain_link_only` filter before building `retrieved_files`**

Both other code paths in this connector that build a document list apply an explicit filter for `exclude_domain_link_only` before handing files off for conversion:

- `_convert_retrieved_files_to_documents` (line 1517):
  ```python
  if self.exclude_domain_link_only and has_link_only_permission(retrieved_file.drive_file):
      continue
  ```
- `_extract_slim_docs_from_google_drive` (line 1806): identical guard.

`resolve_errors` fetches files with `DriveFileFieldType.WITH_PERMISSIONS` when `self.exclude_domain_link_only` is `True` (line 1695), which means the permission data needed for the check *is* available in the response — but the check itself never runs.  As a result, when `exclude_domain_link_only=True`, files whose only access is a domain link will be re-indexed through the error-resolution path even though they are supposed to be excluded by configuration.

```python
retrieved_files = [
    RetrievedDriveFile(
        drive_file=file,
        user_email=self.primary_admin_email,
        completion_stage=DriveRetrievalStage.DONE,
    )
    for file in batch_result.files.values()
    if not (
        self.exclude_domain_link_only
        and has_link_only_permission(file)
    )
]
```


Reviews (3): Last reviewed commit: "pr comments" | Re-trigger Greptile

Comment on lines +579 to +582 of backend/onyx/connectors/google_drive/file_retrieval.py:

```python
if exception:
    logger.warning(f"Error retrieving file {request_id}: {exception}")
else:
    results[request_id] = response
```

P1 Silent batch failure swallows transient errors

When an individual batch request fails (e.g., due to a transient network/auth error, not just a permanent 404), the failure is silently dropped — only a logger.warning is emitted and no ConnectorFailure is produced. This makes transient retrieval errors indistinguishable from permanent ones (file deleted, permission revoked).

Per the interface contract, "Caller's responsibility is to delete the old ConnectorFailures and replace with the new ones." A caller who deletes all input failures matching the submitted errors list and only keeps the yielded outputs will permanently lose failure records for documents that failed due to transient batch errors. The document would disappear from both the search index and the failure tracker, with no way to retry.

Consider yielding a new ConnectorFailure (with the exception message) for any batch item that fails, rather than silently dropping it:

```python
def callback(
    request_id: str,
    response: GoogleDriveFileType,
    exception: Exception | None,
) -> None:
    if exception:
        logger.warning(f"Error retrieving file {request_id}: {exception}")
        errors[request_id] = exception  # collect errors for the caller
    else:
        results[request_id] = response
```

Then the public resolve_errors can yield ConnectorFailure objects for entries present in errors but absent in files.


Context (backend/tests/daily/connectors/google_drive/test_resolver.py, line 183):

```python
    google_drive_service_acct_connector_factory: Callable[..., GoogleDriveConnector],
) -> None:
    """Resolving an empty error list should yield nothing."""
    connector = google_drive_service_acct_connector_factory(
```

P2 Unused list comprehension result — dead code

The result of this list comprehension is discarded. The variable is never assigned, so this line has no effect on the test. If the intent was to assert something about the number of ConnectorFailures (e.g., that there are none), the assertion is missing.

Suggested change:

```python
new_failures = [r for r in results if isinstance(r, ConnectorFailure)]
```

Then add an assertion, e.g. `assert len(new_failures) == 0`.


Comment on lines +554 to +561 of backend/onyx/connectors/google_drive/file_retrieval.py:

```python
fields = _get_fields_for_file_type(field_type)
if len(web_view_links) <= MAX_BATCH_SIZE:
    return _get_files_by_web_view_links_batch(service, web_view_links, fields)

result: dict[str, GoogleDriveFileType] = {}
for i in range(0, len(web_view_links), MAX_BATCH_SIZE):
    chunk = web_view_links[i : i + MAX_BATCH_SIZE]
    result.update(_get_files_by_web_view_links_batch(service, chunk, fields))
```

P2 Redundant early-exit branch

The if len(web_view_links) <= MAX_BATCH_SIZE guard is unnecessary. The for loop below handles that case in a single iteration, so the early return only adds code without changing behavior. Removing it simplifies the function:

Suggested change:

```python
result: dict[str, GoogleDriveFileType] = {}
for i in range(0, len(web_view_links), MAX_BATCH_SIZE):
    chunk = web_view_links[i : i + MAX_BATCH_SIZE]
    result.update(_get_files_by_web_view_links_batch(service, chunk, fields))
return result
```


@cubic-dev-ai cubic-dev-ai bot left a comment


5 issues found across 4 files

Confidence score: 2/5

  • There is a high-confidence data-loss/regression risk in backend/onyx/connectors/google_drive/connector.py: resolve_errors can silently drop unresolved error IDs instead of surfacing ConnectorFailure, which can cause failed items to disappear from handling.
  • backend/onyx/connectors/google_drive/file_retrieval.py has two related high-severity concerns (field-mask/projection mismatch and dropped items on batch exceptions) that can break batched error resolution and violate the resolver contract in user-facing failure paths.
  • The test gaps in backend/tests/daily/connectors/google_drive/test_resolver.py reduce safety: current assertions can miss missing hierarchy nodes and do not enforce the invalid-link ConnectorFailure behavior, making regressions easier to ship.
  • Pay close attention to backend/onyx/connectors/google_drive/connector.py, backend/onyx/connectors/google_drive/file_retrieval.py, backend/tests/daily/connectors/google_drive/test_resolver.py - silent failure/drop behavior and insufficient assertions around resolver error handling.
Prompt for AI agents (unresolved issues)

<file name="backend/tests/daily/connectors/google_drive/test_resolver.py">

<violation number="1" location="backend/tests/daily/connectors/google_drive/test_resolver.py:134">
P2: This test can pass even when `resolve_errors()` returns no expected hierarchy nodes, because it only validates nodes opportunistically inside the loop and never asserts that the expected IDs were present.</violation>

<violation number="2" location="backend/tests/daily/connectors/google_drive/test_resolver.py:170">
P2: This test does not verify the invalid-link behavior it describes, because `ConnectorFailure` results are ignored instead of asserted away.</violation>
</file>

<file name="backend/onyx/connectors/google_drive/connector.py">

<violation number="1" location="backend/onyx/connectors/google_drive/connector.py:1697">
P1: Unresolved error IDs are silently dropped in `resolve_errors`; emit a `ConnectorFailure` (or raise) for IDs missing from the batch response so failed items are not lost.

(Based on your team's feedback about logging or warning instead of failing silently.)</violation>
</file>

<file name="backend/onyx/connectors/google_drive/file_retrieval.py">

<violation number="1" location="backend/onyx/connectors/google_drive/file_retrieval.py:554">
P1: Use `files.get` field masks here; the current `files.list` projection (`nextPageToken, files(...)`) can break batched error resolution.</violation>

<violation number="2" location="backend/onyx/connectors/google_drive/file_retrieval.py:580">
P1: When a batch request fails (e.g., transient network/auth error), the exception is logged but the failed item is silently dropped from results. Per the `Resolver` interface contract, the caller deletes old `ConnectorFailure`s and replaces them with yielded outputs. This means documents that fail transiently will vanish from both the index and the failure tracker with no way to retry. Consider collecting batch errors and propagating them so `resolve_errors` can yield replacement `ConnectorFailure` objects for items that couldn't be retrieved.</violation>
</file>


@github-actions
Contributor

github-actions bot commented Apr 1, 2026

🖼️ Visual Regression Report

| Project | Changed | Added | Removed | Unchanged | Report |
| --- | --- | --- | --- | --- | --- |
| admin | 8 | 0 | 0 | 158 | View Report |
| exclusive | 0 | 0 | 0 | 8 | ✅ No changes |

```python
) -> Generator[Document | ConnectorFailure | HierarchyNode, None, None]:
    """Attempts to yield back ALL the documents described by the errors, no checkpointing.

    Caller's responsibility is to delete the old ConnectorFailures and replace with the new ones.
```
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment doesn't make 100% sense to me. it seems to imply you are meant to return ConnectorFailure, but the typing indicates you can return Document and HierarchyNode. i think it just might need some more detail.

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can return connector failures too!
