Skip to content

typing: graduate opencontractserver.annotations.* (refs #1447)#1474

Open
JSv4 wants to merge 7 commits intomainfrom
mypy/graduate-annotations
Open

typing: graduate opencontractserver.annotations.* (refs #1447)#1474
JSv4 wants to merge 7 commits intomainfrom
mypy/graduate-annotations

Conversation

@JSv4
Copy link
Copy Markdown
Collaborator

@JSv4 JSv4 commented May 3, 2026

Continues draining the mypy baseline. Graduates all seven `opencontractserver.annotations.*` modules.

Removed from `mypy.ini`

  • `opencontractserver.annotations.{admin,compact_json,models,query_optimizer,utils}`
  • `opencontractserver.annotations.management.commands.{migrate_structural_annotations,populate_content_modalities}`

Added to `mypy.ini`

```
[mypy-opencontractserver.annotations.models]
disable_error_code = django-manager-missing
```

Mirrors the existing `users.models` exception — django-stubs / django-tree-queries cannot resolve the `Corpus.label_set → LabelSet.used_by_corpuses` reverse relation.

Pruned from `docs/typing/mypy_baseline.txt`

27 lines (7124 → 7097).

Per-file fixes

File Fix
`compact_json.py` Collapsed `isinstance(bounds, dict)` re-binding into a single `bounds_raw → bounds` assignment; renamed `token_indices: list[int]` to drop the redundant re-annotation.
`query_optimizer.py` Narrowed `path_query.first()` into a separate `first_path: DocumentPath | None` binding before the is-None guard so the `DocumentPath | None` value doesn't flow into the previously `DocumentPath`-typed `document_path` variable.
`models.py` Added `author_obj: AbstractBaseUser` annotation in `Note.update_content_with_revision`; scoped `# type: ignore[misc]` on three manager-override `objects = ...Manager()` declarations (`Embedding`, `Annotation`, `Note`); scoped `# type: ignore[assignment]` on the `StructuralAnnotationSet.creator = ForeignKey(..., null=True)` redeclaration (intentional override of `BaseOCModel.creator`); scoped `# type: ignore[misc]` on the lambda inside `transaction.on_commit(...)` and on `author=author_obj` passed to `NoteRevision.objects.create(...)`; dropped the `: str` re-annotation on `embedder_path = CharField(...)` (CharField field declarations don't accept Python type annotations on the LHS).
`utils.py` Typed `all_tokens: list[Any] = []` accumulator.
`populate_content_modalities.py` Typed `all_token_refs: list[Any] = []`.
`migrate_structural_annotations.py` Widened the heterogeneous `stats` dict to `dict[str, Any]` so the mixed counter / error-list values type-check.

Acceptance (per #1447)

  • `mypy.ini` block(s) removed
  • Pre-commit `mypy` runs clean on full project surface
  • No runtime behavior changes intended

```
$ docker compose -f test.yml run --rm django python -m mypy --config-file mypy.ini opencontractserver config
Success: no issues found in 1019 source files

$ pre-commit run --all-files
mypy.....................................................................Passed
```

Test plan

  • CI mypy job passes
  • CI backend tests pass

Continues draining the mypy baseline. Graduates all seven annotations
modules.

Removed from mypy.ini
- opencontractserver.annotations.{admin,compact_json,models,
  query_optimizer,utils}
- opencontractserver.annotations.management.commands.{
  migrate_structural_annotations,populate_content_modalities}

Added to mypy.ini
- [mypy-opencontractserver.annotations.models] disable_error_code =
  django-manager-missing (mirrors the existing users.models exception;
  django-stubs / django-tree-queries cannot resolve the
  Corpus.label_set -> LabelSet.used_by_corpuses reverse relation)

Pruned 27 baseline lines (7124 -> 7097).

Per-file fixes
- compact_json.py: collapsed `isinstance(bounds, dict)` re-binding into
  a single `bounds_raw -> bounds` assignment; renamed `token_indices:
  list[int]` to drop the redundant re-annotation.
- query_optimizer.py: narrowed `path_query.first()` into a separate
  `first_path: DocumentPath | None` binding before the is-None guard so
  the `DocumentPath | None` value doesn't flow into the previously
  `DocumentPath`-typed `document_path` variable.
- models.py: added `author_obj: AbstractBaseUser` annotation in
  Note.update_content_with_revision; scoped # type: ignore[misc] on
  three manager-override `objects = ...Manager()` declarations
  (Embedding, Annotation, Note); scoped # type: ignore[assignment] on
  the `StructuralAnnotationSet.creator = ForeignKey(..., null=True)`
  redeclaration (intentional override of BaseOCModel.creator); scoped
  # type: ignore[misc] on the lambda inside transaction.on_commit(...)
  and on `author=author_obj` passed to NoteRevision.objects.create();
  dropped the `: str` re-annotation on
  `embedder_path = CharField(...)`.
- utils.py: typed `all_tokens: list[Any] = []` accumulator.
- populate_content_modalities.py: typed
  `all_token_refs: list[Any] = []`.
- migrate_structural_annotations.py: widened the heterogeneous `stats`
  dict to `dict[str, Any]`.

Verification
- pre-commit run --all-files -> all hooks pass (mypy: Passed)
- docker compose -f test.yml run --rm django python -m mypy
  --config-file mypy.ini opencontractserver config -> "no issues found
  in 1019 source files"
@claude
Copy link
Copy Markdown

claude Bot commented May 3, 2026

Code Review

Summary: Clean, mechanical typing PR that graduates all seven opencontractserver.annotations.* modules from the broad ignore_errors = True baseline to proper mypy coverage. No functional regressions — one genuine correctness improvement in query_optimizer.py.


Code Quality & Style ✅

  • All seven ignore_errors = True blocks are cleanly replaced by narrow, scoped # type: ignore[error-code] suppressions. This is exactly the right direction — broad suppression → targeted suppression → eventual elimination.
  • The disable_error_code = django-manager-missing addition for annotations.models mirrors the existing users.models exception pattern precisely. Consistent.
  • The stats: dict[str, Any] widening in migrate_structural_annotations.py is accompanied by an explanatory comment, which is the correct approach for Any-widening that would otherwise look like sloppiness.
  • compact_json.py: the bounds_raw → bounds rename is cleaner than the original re-bind pattern and correctly eliminates the Any | None typing issue.

Genuine Behavioral Change (positive) 🔍

query_optimizer.py lines 403–407 — the MultipleObjectsReturned handler now adds a first_path is None guard before assigning to document_path. This is strictly defensive code: in the MultipleObjectsReturned branch, path_query.first() will never return None (there are by definition multiple objects). However, the guard is harmless and the Annotation.objects.none() fallback is safe.

Minor suggestion: a comment noting the logical impossibility would help future readers understand why this guard exists, since it looks like dead code:

except DocumentPath.MultipleObjectsReturned:
    # Shouldn't happen with constraints, but handle safely.
    # first() returns None only on an empty QS; unreachable here, but
    # needed to narrow DocumentPath | None → DocumentPath for mypy.
    first_path = path_query.first()
    if first_path is None:
        return Annotation.objects.none()
    document_path = first_path

Minor Observations

  1. author_obj: AbstractBaseUser forward declaration (models.py line 1499): The bare author_obj: AbstractBaseUser annotation before the if/else is technically redundant — mypy infers the type from the two assignment branches. It's not wrong and could help readers, but it could alternatively be dropped or replaced with a single annotated assignment on the isinstance branch:

    if isinstance(author, int):
        author_obj: AbstractBaseUser = get_user_model().objects.get(pk=author)
    else:
        author_obj = author

    Either approach is fine; the current form is acceptable.

  2. token_indices annotation removal (compact_json.py): Removing the explicit : list[int] is correct — mypy infers list[int] from usage and the explicit annotation was conflicting. No concern.

  3. list[Any] accumulators (utils.py, populate_content_modalities.py): These are appropriate for the heterogeneous structures being collected. list[Any] is preferable to leaving them unannotated given the mypy var-annotated errors.


Security & Performance

No security or performance implications — pure static analysis improvement.


Test Coverage

This PR has no runtime behavior changes (the first_path is None guard being the one edge case, which is unreachable in practice). Existing test suite coverage is sufficient. CI mypy + backend test pass confirmed in the PR description.


Verdict: Approve. The changes are correct, minimal, and follow established project conventions. The one suggestion (comment on the null guard) is non-blocking.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 3, 2026

Codecov Report

❌ Patch coverage is 73.33333% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
opencontractserver/annotations/query_optimizer.py 0.00% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

@claude
Copy link
Copy Markdown

claude Bot commented May 3, 2026

Code Review: PR #1474 — Graduate opencontractserver.annotations.* from mypy baseline

Overview

Clean, focused typing graduation PR that removes seven wholesale ignore_errors = True blocks from mypy.ini and replaces them with targeted suppressions or proper type annotations. No runtime behavior changes. Consistent with the existing users.models pattern.


Positive

  • Correctly scoped suppressions: All # type: ignore[...] comments use specific error codes (e.g., [misc], [assignment]) rather than bare # type: ignore, which is the right approach for incremental adoption.
  • Mirrors established pattern: The new disable_error_code = django-manager-missing block for annotations.models exactly mirrors the pre-existing users.models exception — good consistency.
  • Baseline line count verified: The 27 lines pruned from mypy_baseline.txt exactly correspond to the 27 errors removed from the graduated modules.
  • CHANGELOG updated with per-file detail — meets the project documentation bar.
  • No test changes required: Pure typing fix with no observable behavior difference.

Observations and Suggestions

1. author_obj: AbstractBaseUser forward declaration (models.py:1507)

author_obj: AbstractBaseUser   # declaration only, no value
if isinstance(author, int):
    author_obj = get_user_model().objects.get(pk=author)
else:
    author_obj = author

This works because the if/else is exhaustive, but mypy doesn't prove exhaustiveness here — it trusts the annotation. The # type: ignore[misc] on the downstream author=author_obj call suggests mypy still isn't fully satisfied. Consider whether cast(AbstractBaseUser, author_obj) at the use site (instead of the forward declaration + ignore) would be cleaner, but this is stylistic and the current form is acceptable.

2. token_indices = [] without annotation (compact_json.py:396)

Removing the redundant : list[int] re-annotation is correct (it was triggering [no-redef]), and mypy will infer list[int] from the appended values. No issue — just noting the inference is now implicit.

3. Unreachable guard in query_optimizer.py

first_path = path_query.first()
if first_path is None:          # comment says "unreachable"
    return Annotation.objects.none()
document_path = first_path

The pattern is correct — appeases mypy without changing behavior. The comment explains the reasoning well. Minor nit: "unreachable here" is slightly misleading since the MultipleObjectsReturned branch implies the QS is non-empty, but the intent is clear.

4. stats: dict[str, Any] (migrate_structural_annotations.py)

Using dict[str, Any] for a heterogeneous stats dict is pragmatic for a management command. A TypedDict would be more precise, but the complexity isn't warranted for an internal stats accumulator in a one-off migration command. Fine as-is.

5. Screenshot binary updates

The PR includes 70+ auto-regenerated screenshot files. Per project conventions, these are produced by the screenshots.yml CI workflow and are expected. Worth confirming these are from a routine CI re-run rather than any inadvertent visual regression in annotation-related UI — process note, not a blocking concern.


No Issues Found

  • No security implications (pure typing).
  • No N+1 queries, permission bypasses, or behavior changes introduced.
  • The EmbeddingManager, AnnotationManager, and NoteManager override suppressions ([misc]) are the correct way to handle custom Django manager declarations that mypy cannot validate without full django-stubs support.
  • The StructuralAnnotationSet.creator [assignment] suppression is appropriate for an intentional nullable override of a non-nullable base field.

Summary

Solid incremental typing improvement. The fixes are minimal, targeted, and well-explained. Consistent with the project's existing mypy graduation strategy (refs #1447). Approved with no blocking issues — the observations above are non-blocking suggestions.

github-actions Bot and others added 2 commits May 3, 2026 20:43
…ions

# Conflicts:
#	docs/assets/images/screenshots/auto/admin--pipeline-settings--secrets-modal.png
#	docs/assets/images/screenshots/auto/admin--pipeline-settings--stage-filter.png
#	docs/assets/images/screenshots/auto/annotations--pdf-canvas--with-labels.png
#	docs/assets/images/screenshots/auto/annotations--pdf-feed--with-soft-highlights.png
#	docs/assets/images/screenshots/auto/annotations--txt-document--with-spans.png
#	docs/assets/images/screenshots/auto/annotator--enhanced-label-selector--expanded.png
#	docs/assets/images/screenshots/auto/badges--celebration-modal--auto-award.png
#	docs/assets/images/screenshots/auto/badges--celebration-modal--manual-award.png
#	docs/assets/images/screenshots/auto/badges--criteria-config--with-type-selected.png
#	docs/assets/images/screenshots/auto/badges--management--create-modal.png
#	docs/assets/images/screenshots/auto/caml--editor--full-template.png
#	docs/assets/images/screenshots/auto/caml--editor--new-article.png
#	docs/assets/images/screenshots/auto/chat--approval-modal--pending.png
#	docs/assets/images/screenshots/auto/chat--context-meter--compacted.png
#	docs/assets/images/screenshots/auto/chat--context-meter--compaction-underway.png
#	docs/assets/images/screenshots/auto/chat--context-meter--green.png
#	docs/assets/images/screenshots/auto/chat--context-meter--red.png
#	docs/assets/images/screenshots/auto/chat--context-meter--yellow.png
#	docs/assets/images/screenshots/auto/chat--tool-badge--single.png
#	docs/assets/images/screenshots/auto/chat--tool-popover--multi-tool.png
#	docs/assets/images/screenshots/auto/chat--tool-popover--single-tool.png
#	docs/assets/images/screenshots/auto/community--leaderboard--filters.png
#	docs/assets/images/screenshots/auto/corpus--analytics--dashboard.png
#	docs/assets/images/screenshots/auto/corpus--bulk-import-modal--progress-step.png
#	docs/assets/images/screenshots/auto/corpus--corpus-modal--initial.png
#	docs/assets/images/screenshots/auto/corpus--description-editor--loaded.png
#	docs/assets/images/screenshots/auto/corpus--document-relationship-modal--initial.png
#	docs/assets/images/screenshots/auto/corpus--landing--clean-view.png
#	docs/assets/images/screenshots/auto/corpus--mcp-share-button--private.png
#	docs/assets/images/screenshots/auto/corpus--mcp-share-button--public.png
#	docs/assets/images/screenshots/auto/corpus--power-user--with-sidebar.png
#	docs/assets/images/screenshots/auto/corpus--toc--hybrid-index.png
#	docs/assets/images/screenshots/auto/corpus--upload-modal--initial.png
#	docs/assets/images/screenshots/auto/corpus--worker-tokens--create-modal.png
#	docs/assets/images/screenshots/auto/corpus--worker-tokens--key-display.png
#	docs/assets/images/screenshots/auto/corpus-actions--create-modal--agent-thread-existing.png
#	docs/assets/images/screenshots/auto/corpus-actions--create-modal--agent-thread-quick.png
#	docs/assets/images/screenshots/auto/corpus-actions--create-modal--initial.png
#	docs/assets/images/screenshots/auto/corpus-settings--template-picker-open.png
#	docs/assets/images/screenshots/auto/crud--modal-document--create-empty.png
#	docs/assets/images/screenshots/auto/crud--modal-document--edit-with-changes.png
#	docs/assets/images/screenshots/auto/crud--modal-document--view-readonly.png
#	docs/assets/images/screenshots/auto/crud--modal-labelset--create-empty.png
#	docs/assets/images/screenshots/auto/discussions--global--all-tabs.png
#	docs/assets/images/screenshots/auto/discussions--global--corpus-tab.png
#	docs/assets/images/screenshots/auto/discussions--global--document-tab.png
#	docs/assets/images/screenshots/auto/discussions--global--empty-state.png
#	docs/assets/images/screenshots/auto/export--config-modal--mobile.png
#	docs/assets/images/screenshots/auto/export--config-modal--rjsf-processor-form.png
#	docs/assets/images/screenshots/auto/exports--list--loading.png
#	docs/assets/images/screenshots/auto/extracts--create-column-modal--custom-model.png
#	docs/assets/images/screenshots/auto/extracts--create-column-modal--default.png
#	docs/assets/images/screenshots/auto/extracts--create-column-modal--edit-mode.png
#	docs/assets/images/screenshots/auto/extracts--create-column-modal--filled.png
#	docs/assets/images/screenshots/auto/extracts--detail-content--schema-tab.png
#	docs/assets/images/screenshots/auto/folders--create-folder-modal--initial.png
#	docs/assets/images/screenshots/auto/folders--edit-folder-modal--initial.png
#	docs/assets/images/screenshots/auto/folders--move-folder-modal--dropdown.png
#	docs/assets/images/screenshots/auto/folders--move-folder-modal--initial.png
#	docs/assets/images/screenshots/auto/icons--picker-modal--category-files.png
#	docs/assets/images/screenshots/auto/knowledge-base--docx-document--rendered.png
#	docs/assets/images/screenshots/auto/knowledge-base--index-tab--open.png
#	docs/assets/images/screenshots/auto/knowledge-base--sidebar-control-bar--sort-options.png
#	docs/assets/images/screenshots/auto/knowledge-base--sidebar-tabs--with-index.png
#	docs/assets/images/screenshots/auto/landing--call-to-action--anonymous.png
#	docs/assets/images/screenshots/auto/metadata--column-modal--initial.png
#	docs/assets/images/screenshots/auto/modals--select-documents--open.png
#	docs/assets/images/screenshots/auto/moderation--dashboard--time-filter.png
#	docs/assets/images/screenshots/auto/readme--document-annotator--with-pdf.png
#	docs/assets/images/screenshots/auto/threads--edit-message-modal--initial.png
#	docs/assets/images/screenshots/auto/threads--reply-form--nested-reply.png
#	docs/assets/images/screenshots/auto/versioning--history-panel--with-versions.png
#	docs/assets/images/screenshots/auto/widgets--chat-message--markdown.png
#	docs/assets/images/screenshots/auto/widgets--corpus-dropdown--with-options.png
#	docs/assets/images/screenshots/auto/widgets--fieldset-modal--create-empty.png
#	docs/assets/images/screenshots/auto/widgets--fieldset-modal--edit-populated.png
#	docs/assets/images/screenshots/auto/widgets--radial-button-cloud--collapsed.png
#	docs/assets/images/screenshots/auto/widgets--radial-button-cloud--confirmation-modal.png
#	docs/assets/images/screenshots/auto/widgets--radial-button-cloud--expanded.png
@claude
Copy link
Copy Markdown

claude Bot commented May 3, 2026

Code Review — PR #1474: Graduate opencontractserver.annotations.* from mypy baseline

Overview

A focused, purely mechanical typing PR: removes 7 broad ignore_errors = True blocks from mypy.ini, adds one scoped disable_error_code = django-manager-missing entry, and prunes 27 stale baseline lines. No runtime behaviour changes — confirmed by the author's mypy and pre-commit green runs.


✅ What's done well

  • Scoped suppressions over broad silencing. Every # type: ignore[misc] / # type: ignore[assignment] is narrowed to the specific error code rather than # type: ignore or ignore_errors = True. That's exactly the right pattern as the baseline drains.
  • Comments accompany every suppression. The # type: ignore[assignment] on StructuralAnnotationSet.creator is a deliberate design override; the # type: ignore[misc] on the objects = ...Manager() lines reflects a known django-stubs/CTE limitation. Each is self-documenting.
  • mypy.ini placement. The new [mypy-opencontractserver.annotations.models] block is placed logically at the bottom with the existing users.models analogue and a comment explaining the shared root cause (django-tree-queries + reverse relation). Good for consistency.
  • compact_json.py cleanup. Collapsing bounds_raw / isinstance / fallback into a single expression is cleaner and removes the temporary Any | None binding that confused mypy. Dropping the redundant : list[int] re-annotation on token_indices is correct — Python type annotations on an already-inferred local are just noise.
  • query_optimizer.py guard. The first_path is None branch is unreachable when entering via MultipleObjectsReturned, and the comment says so explicitly. This is the right trade-off — the alternative would be cast(), which is noisier and hides the intent.

Questions / minor concerns

1. admin.py graduated with no code changes visible in the diff

The baseline prunes nine admin.py error lines and the [mypy-opencontractserver.annotations.admin] block is removed, but there's no corresponding admin.py hunk. The errors there were the classic Callable has no attribute 'short_description' / 'admin_order_field' pattern. Were those fixed in a prior commit that this branch is based on, or is something in the mypy/django-stubs environment silently resolving them? A brief note in the PR description (or here) confirming how admin.py went clean would close that loop.

2. list[Any] annotations in utils.py and populate_content_modalities.py

all_tokens: list[Any] = []
all_token_refs: list[Any] = []

Any works and unblocks mypy, but both lists accumulate token-reference dicts. If the element shape is stable enough to describe (even a TypedDict or dict[str, Any]), a tighter annotation would carry more information to the next reader. If the contents genuinely vary (v1 vs v2 format divergence), list[Any] is fine and no change needed — just worth a look.

3. # type: ignore[misc] on author=author_obj in NoteRevision.objects.create()

The suppression is one-liner with no adjacent comment explaining the root cause. The author_obj: AbstractBaseUser declaration before the if/else is correct for narrowing, but mypy still complains about passing it to .create(). Is this a create() signature issue with the custom manager, or an AbstractBaseUser vs User incompatibility? A one-line comment (e.g. # django-stubs: create() infers AUTH_USER_MODEL exactly, not AbstractBaseUser) would match the standard set by the other suppressions in this file.


Summary

The PR is clean, minimal, and mechanically sound. The two actionable items are:

  1. Clarify how admin.py went clean without a code change.
  2. Add a comment to the author=author_obj # type: ignore[misc] line explaining the suppression reason (for consistency with the others).

Neither is a blocker. Happy to approve once #1 is explained.

@JSv4
Copy link
Copy Markdown
Collaborator Author

JSv4 commented May 4, 2026

Thanks for the review. Addressing the three items:

1. How admin.py went clean without a code change in this PRopencontractserver/annotations/admin.py was already made type-clean in PR #1355 ("Add return-type annotations to core models and import/export pipeline", commit 36921acbc). That PR added from __future__ import annotations, return-type annotations on lookups / queryset / dimension / total_embeddings / modality_display / get_queryset, and # type: ignore[attr-defined] on the short_description / admin_order_field callable attributes — exactly the pattern you described. Per #1355's own commit message, mypy graduation was deliberately deferred to a follow-up. This PR is that follow-up: the file already passes mypy on its own, so removing the [mypy-opencontractserver.annotations.admin] block from mypy.ini and the nine baseline lines is purely a cleanup.

2. list[Any]list[dict[str, int]] — addressed in commit a847bcb43. The accumulator shape is fixed at {"pageIndex": int, "tokenIndex": int}, so the tighter annotation carries real information. compute_content_modalities accepts list[dict[str, Any]], which Any's bidirectional consistency lets dict[str, int] flow into without complaint. Also dropped a now-unused Any import in populate_content_modalities.py.

3. Comment on author=author_obj # type: ignore[misc] — addressed in the same commit. Added a comment explaining that django-stubs' create() infers AUTH_USER_MODEL exactly (not the abstract AbstractBaseUser base), even though author_obj is a runtime-correct subtype.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant