
Benchmark #5

Merged
ezimuel merged 16 commits into ezimuel:main from danielebarbaro:feat/update-delete-benchmark-refactor
Apr 9, 2026

Conversation

@danielebarbaro
Collaborator

@danielebarbaro danielebarbaro commented Mar 28, 2026

@ezimuel I’m just trying to understand how to run a more effective performance benchmark.

- this is a "simple" draft
- this branch comes from feat/update-delete

@danielebarbaro danielebarbaro self-assigned this Mar 28, 2026
@danielebarbaro danielebarbaro force-pushed the feat/update-delete-benchmark-refactor branch from 2aec122 to 2841d18 Compare March 28, 2026 15:59
@danielebarbaro danielebarbaro changed the title Feat/update delete benchmark refactor [WIP] - Benchmark Mar 28, 2026
@danielebarbaro danielebarbaro force-pushed the feat/update-delete-benchmark-refactor branch 2 times, most recently from 292cea0 to 5685341 Compare March 28, 2026 16:03
@danielebarbaro danielebarbaro marked this pull request as ready for review March 31, 2026 21:50
@danielebarbaro danielebarbaro changed the title [WIP] - Benchmark Benchmark Mar 31, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds document delete/update support to the core VectorDatabase (including persistence behavior), and introduces a new unified benchmark runner plus a GitHub Actions workflow to run/compare benchmarks.

Changes:

  • Add deleteDocument() and updateDocument() to VectorDatabase, including persistence + tests and documentation.
  • Implement soft-delete tracking in the HNSW index and a BM25 removal API to keep text search consistent after deletes.
  • Replace the legacy benchmark scripts/report generator with a new benchmark runner (benchmark/run.php) and add CI benchmarking workflow.
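The soft-delete idea in the second bullet can be illustrated with a minimal sketch. This is not the PR's PHP code: it is a Python stand-in that assumes a brute-force scan instead of an HNSW graph, and every name in it (`SoftDeleteIndex`, `insert`, `delete`, `update`, `count`, `search`) is hypothetical. It only shows the tombstone pattern: deleted IDs go into a set, search skips them, and the count reports active documents only.

```python
import math


class SoftDeleteIndex:
    """Hypothetical brute-force index; only the tombstone logic matters here."""

    def __init__(self):
        self._vectors = {}     # node id -> vector (never physically removed)
        self._deleted = set()  # tombstoned node ids
        self._next_id = 0

    def insert(self, vector):
        node_id = self._next_id
        self._next_id += 1
        self._vectors[node_id] = list(vector)
        return node_id

    def delete(self, node_id):
        # Mark the node as deleted instead of removing it from storage.
        if node_id not in self._vectors or node_id in self._deleted:
            return False
        self._deleted.add(node_id)
        return True

    def update(self, node_id, vector):
        # Assumption: an update is a soft delete followed by a fresh insert.
        if not self.delete(node_id):
            raise KeyError(node_id)
        return self.insert(vector)

    def count(self):
        # Active documents only, so tombstones are not counted.
        return len(self._vectors) - len(self._deleted)

    def search(self, query, k):
        # Rank by Euclidean distance, skipping tombstoned nodes.
        scored = [
            (math.dist(query, vec), node_id)
            for node_id, vec in self._vectors.items()
            if node_id not in self._deleted
        ]
        scored.sort()
        return [node_id for _, node_id in scored[:k]]
```

In a real graph index the tombstoned nodes would typically still be traversed as routing hops and only filtered from the returned results; the sketch skips that detail.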

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| src/VectorDatabase.php | Adds delete/update APIs, persists deleted-node state, and changes count() to active-doc count. |
| src/HNSW/Index.php | Adds soft-delete set, filters deleted nodes from search results, and persists deleted IDs via export/import state. |
| src/BM25/Index.php | Adds removeDocument() for delete/update support. |
| tests/VectorDatabaseTest.php | Adds unit tests for delete/update behavior in memory. |
| tests/PersistenceTest.php | Adds persistence tests for delete/update and doc-file removal expectations. |
| README.md | Documents the new delete/update APIs and persistence expectations. |
| composer.json | Moves PHPVector\\Benchmark\\ autoloading into production autoload (needed for --no-dev installs). |
| benchmark/run.php | New benchmark CLI entrypoint (profiles/scenarios/output formats). |
| benchmark/Benchmark.php | New benchmark implementation (insert/search/update/delete/recall/persistence). |
| benchmark/ResultFormatter.php | New Markdown + GitHub-benchmark JSON formatting. |
| benchmark/Report.php | Removes legacy report generator. |
| benchmark/benchmark.php | Removes legacy benchmark entrypoint. |
| .github/workflows/benchmark.yml | Adds benchmark workflow for PR comparison + main history storage. |
| .github/workflows/test.yml | Bumps cache/codecov GitHub Action versions. |
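
For the text-search side, removing a document from a BM25-style index means undoing everything the insert contributed: its postings, its share of the document count, and its share of the total length that feeds the average document length. The following is a rough Python sketch of that bookkeeping, not the PR's removeDocument() implementation; the class and method names (`TinyTextIndex`, `add_document`, `remove_document`, `doc_freq`) and the whitespace tokenizer are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class TinyTextIndex:
    """Hypothetical inverted index holding the statistics BM25 scoring needs."""

    def __init__(self):
        self._postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self._doc_terms = {}                # doc_id -> Counter of its terms
        self._total_len = 0                 # sum of all document lengths

    def add_document(self, doc_id, text):
        # Re-adding an existing id behaves like an update: remove, then insert.
        if doc_id in self._doc_terms:
            self.remove_document(doc_id)
        terms = Counter(text.lower().split())  # naive whitespace tokenizer
        self._doc_terms[doc_id] = terms
        self._total_len += sum(terms.values())
        for term, tf in terms.items():
            self._postings[term][doc_id] = tf

    def remove_document(self, doc_id):
        terms = self._doc_terms.pop(doc_id, None)
        if terms is None:
            return False
        self._total_len -= sum(terms.values())
        for term in terms:
            del self._postings[term][doc_id]
            if not self._postings[term]:
                del self._postings[term]  # drop empty posting lists
        return True

    def doc_count(self):
        return len(self._doc_terms)

    def avg_doc_len(self):
        return self._total_len / len(self._doc_terms) if self._doc_terms else 0.0

    def doc_freq(self, term):
        return len(self._postings.get(term, {}))
```

If any of these statistics were left stale after a delete, the IDF and length-normalization terms would keep reflecting the removed document, which is the inconsistency the review summary refers to.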


Comment thread src/VectorDatabase.php Outdated
Comment thread src/HNSW/Index.php Outdated
Comment thread benchmark/Benchmark.php Outdated
Comment thread benchmark/Benchmark.php Outdated
Comment thread tests/PersistenceTest.php Outdated
Comment thread .github/workflows/benchmark.yml Outdated
Comment thread .github/workflows/benchmark.yml Outdated
Comment thread src/BM25/Index.php Outdated
@danielebarbaro danielebarbaro force-pushed the feat/update-delete-benchmark-refactor branch from e8eac5a to 962b046 Compare April 1, 2026 15:25
Owner

@ezimuel ezimuel left a comment

I left a minor comment on a CI setting for deployments.
Moreover, I would like to understand why we are using Markdown as the output format for the benchmark. I checked the output on Actions and it doesn't seem to render Markdown, see here. Maybe we can create a comment on the PR with the benchmark output, or do you have other proposals? Thanks.

Comment thread .github/workflows/benchmark.yml Outdated
@danielebarbaro danielebarbaro force-pushed the feat/update-delete-benchmark-refactor branch from 962b046 to c0dc4e5 Compare April 2, 2026 14:07
@danielebarbaro
Collaborator Author

@ezimuel you’re right, Markdown isn’t rendered in the Actions output.
I was just using it to read the results more comfortably 😇.

I’m now working on posting the benchmark results as a comment on the PR, with an "automatic comparison" against the base branch.
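
As a rough illustration of what such a comparison could look like, here is a small Python sketch. It is not the workflow from this PR: the metric names, the 5% threshold, and the assumption that every metric is a timing in milliseconds where lower is better are all hypothetical. It compares the metrics present in both runs and renders a Markdown table that could be posted as a PR comment, where Markdown does render.

```python
def compare(base, head, threshold_pct=5.0):
    """Compare two {metric: milliseconds} dicts; assumes lower is better."""
    rows = []
    # Only metrics present in both runs can be compared.
    for name in sorted(base.keys() & head.keys()):
        b, h = base[name], head[name]
        change = (h - b) / b * 100.0 if b else 0.0
        if change > threshold_pct:
            status = "slower"
        elif change < -threshold_pct:
            status = "faster"
        else:
            status = "same"
        rows.append((name, b, h, change, status))
    return rows


def to_markdown(rows):
    """Render comparison rows as a Markdown table for a PR comment body."""
    lines = [
        "| Metric | Base (ms) | PR (ms) | Change | Status |",
        "|---|---:|---:|---:|---|",
    ]
    for name, b, h, change, status in rows:
        lines.append(f"| {name} | {b:.2f} | {h:.2f} | {change:+.1f}% | {status} |")
    return "\n".join(lines)
```

Metrics where higher is better (recall or queries per second, for example) would need the sign of the comparison flipped; the sketch leaves that out.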

@danielebarbaro danielebarbaro force-pushed the feat/update-delete-benchmark-refactor branch from c0dc4e5 to 01b82e4 Compare April 2, 2026 17:26
@danielebarbaro danielebarbaro force-pushed the feat/update-delete-benchmark-refactor branch from b636406 to 39f2883 Compare April 3, 2026 09:38
@ezimuel ezimuel merged commit 61f994f into ezimuel:main Apr 9, 2026
8 checks passed
3 participants