find_organism_by_name docstring says it raises ValueError on multiple matches; it prints and picks first #542

@turbomam

Context

Surfaced in Copilot's 2026-04-16 review of #531.

kg_microbe/query_utils/organism_queries.py::find_organism_by_name has a docstring stating it raises ValueError when multiple organisms match. The implementation actually:

  • prints the candidate list to stdout, and
  • silently returns the first match.

This is a contract mismatch. Programmatic callers can neither rely on the documented exception nor inspect the alternatives. Separately, print() in a library function adds noise for CLI consumers that redirect stdout.

Suggested fix — Option A (align with docstring)

candidates = ", ".join(f"{row[1]} ({row[0]})" for row in result[:5])
if len(result) > 5:
    candidates += ", ..."
raise ValueError(
    f"Multiple organisms found matching '{name}': {candidates}. "
    "Please refine the search term."
)
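For context, here is a minimal sketch of how the snippet above might sit inside a selection step. The helper name select_single_match, the Row alias, and the (identifier, name) row shape are assumptions inferred from the row[0] / row[1] usage; the real query code in find_organism_by_name may differ, and the no-match branch is a guess because this issue does not describe it.

```python
from typing import Optional, Sequence, Tuple

# Assumed row shape: (identifier, name), inferred from row[0] / row[1] above.
Row = Tuple[str, str]


def select_single_match(name: str, result: Sequence[Row]) -> Optional[Row]:
    """Hypothetical helper: return the only match, or raise on ambiguity."""
    if not result:
        # Assumption: no-match handling is not described in this issue.
        return None
    if len(result) == 1:
        return result[0]
    # Show at most five candidates so the message stays readable.
    candidates = ", ".join(f"{row[1]} ({row[0]})" for row in result[:5])
    if len(result) > 5:
        candidates += ", ..."
    raise ValueError(
        f"Multiple organisms found matching '{name}': {candidates}. "
        "Please refine the search term."
    )
```

With this shape, callers get the documented ValueError and nothing is written to stdout.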

Suggested fix — Option B (keep silent-first-match)

If the current behavior is intentional, change the return shape to expose alternatives (e.g., return a named tuple (selected, alternatives)) and replace print() with logging.info / logging.warning.
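A rough sketch of what Option B could look like, under the same assumed (identifier, name) row shape. OrganismMatch and pick_first_match are hypothetical names used only for illustration; they are not taken from the repository.

```python
import logging
from typing import NamedTuple, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)

# Assumed row shape: (identifier, name).
Row = Tuple[str, str]


class OrganismMatch(NamedTuple):
    """Hypothetical return shape exposing the pick and the other candidates."""

    selected: Optional[Row]
    alternatives: Tuple[Row, ...]


def pick_first_match(name: str, result: Sequence[Row]) -> OrganismMatch:
    """Keep first-match behavior, but log instead of printing."""
    if not result:
        return OrganismMatch(None, ())
    selected, *rest = result
    if rest:
        # Logging lets callers route or silence the message; stdout stays clean.
        logger.warning(
            "Multiple organisms found matching %r; using %s (%s), %d alternative(s)",
            name, selected[1], selected[0], len(rest),
        )
    return OrganismMatch(selected, tuple(rest))
```

Callers that only want the first match can read .selected, while callers that need to disambiguate can inspect .alternatives. Changing the return type would still be a breaking change for existing callers.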

File involved

  • kg_microbe/query_utils/organism_queries.py (find_organism_by_name)

References

  • PR #531
  • Copilot review at commit 1de973d, 2026-04-16T23:15Z
