[Feature]: Standardize indexing content to include comments for both Issues and PRs #47

@Kavirubc

Problem Statement

There is a discrepancy between the CLI backfill logic and the live bot pipeline:

  • CLI Backfill: Indexes Title + Body + Comments.
  • Live Bot: Indexes Title + Body ONLY.

This leads to inconsistent search results depending on how an item was indexed: an issue backfilled via the CLI carries richer context (its comments) than one processed by the bot in real time.

Proposed Solution

  1. Create a shared function (e.g., in internal/utils/text) to generate the content string for embedding.
  2. Refactor internal/steps/indexer.go (Bot) to use this function.
  3. Refactor cmd/simili/commands/index.go (CLI) to use this function.
  4. The shared function MUST include comments for both Issues and PRs to ensure rich context for similarity search.
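
A minimal sketch of what such a shared function could look like is given here for discussion. The `Item` type, the `BuildEmbeddingText` name, and the blank-line separator are all assumptions made for illustration; the actual types used by `internal/steps/indexer.go` and `cmd/simili/commands/index.go` may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Item is a hypothetical, simplified view of an issue or pull request.
// The real project types are not shown in this issue.
type Item struct {
	Title    string
	Body     string
	Comments []string
}

// BuildEmbeddingText (hypothetical name) joins title, body, and comments
// into one string, skipping parts that are empty or whitespace-only.
// Issues and PRs go through the same path, so both include comments.
func BuildEmbeddingText(it Item) string {
	parts := make([]string, 0, 2+len(it.Comments))
	for _, s := range append([]string{it.Title, it.Body}, it.Comments...) {
		if s = strings.TrimSpace(s); s != "" {
			parts = append(parts, s)
		}
	}
	// Separator is an assumption; any stable delimiter would work.
	return strings.Join(parts, "\n\n")
}

func main() {
	it := Item{
		Title:    "Crash on start",
		Body:     "Panics when config is missing.",
		Comments: []string{"Same on my machine.", "  "},
	}
	fmt.Println(BuildEmbeddingText(it))
}
```

If both the bot step and the CLI command call one function like this, the indexed content no longer depends on which path processed the item.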

Feature Scope

  • Pipeline Steps
  • CLI

Metadata

Labels

  • enhancement: New feature or request
  • urgent: Critical issues that need immediate attention
  • v0.2.0: Target for v0.2.0

Projects

Status

Done
