Pull request overview
This PR adds document delete/update support to the core VectorDatabase (including persistence behavior), and introduces a new unified benchmark runner plus a GitHub Actions workflow to run/compare benchmarks.
Changes:
- Add `deleteDocument()` and `updateDocument()` to `VectorDatabase`, including persistence, tests, and documentation.
- Implement soft-delete tracking in the HNSW index and a BM25 removal API to keep text search consistent after deletes.
- Replace the legacy benchmark scripts/report generator with a new benchmark runner (`benchmark/run.php`) and add a CI benchmarking workflow.
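To make the soft-delete idea concrete, here is a minimal sketch. It is not the PR's actual code: the class `SoftDeleteIndex` and all of its method names are hypothetical, and a brute-force scan stands in for HNSW graph traversal. It only illustrates the three behaviors described above: marking IDs as deleted instead of removing nodes, filtering deleted IDs from search results, and carrying the deleted set through export/import.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration only: not the PR's HNSW index implementation.
final class SoftDeleteIndex
{
    /** @var array<int, float[]> node ID => vector */
    private array $vectors = [];

    /** @var array<int, true> set of soft-deleted node IDs */
    private array $deleted = [];

    public function insert(int $id, array $vector): void
    {
        $this->vectors[$id] = $vector;
        unset($this->deleted[$id]); // re-inserting an ID makes it active again
    }

    public function delete(int $id): bool
    {
        if (!isset($this->vectors[$id]) || isset($this->deleted[$id])) {
            return false; // unknown or already deleted
        }
        $this->deleted[$id] = true; // mark only; the node itself stays in place
        return true;
    }

    /** Number of active (non-deleted) nodes. */
    public function count(): int
    {
        return count($this->vectors) - count($this->deleted);
    }

    /** @return int[] IDs of the $k nearest active nodes (squared Euclidean distance) */
    public function search(array $query, int $k): array
    {
        $distances = [];
        foreach ($this->vectors as $id => $vector) {
            if (isset($this->deleted[$id])) {
                continue; // filter soft-deleted nodes out of the results
            }
            $sum = 0.0;
            foreach ($vector as $i => $value) {
                $sum += ($value - $query[$i]) ** 2;
            }
            $distances[$id] = $sum;
        }
        asort($distances); // ascending by distance, keys (IDs) preserved
        return array_slice(array_keys($distances), 0, $k);
    }

    /** Export vectors plus the deleted-ID list so deletes survive a reload. */
    public function exportState(): array
    {
        return ['vectors' => $this->vectors, 'deleted' => array_keys($this->deleted)];
    }

    public static function importState(array $state): self
    {
        $index = new self();
        $index->vectors = $state['vectors'];
        $index->deleted = array_fill_keys($state['deleted'], true);
        return $index;
    }
}
```

In this sketch a delete is a constant-time set insertion, at the cost of deleted vectors still occupying memory until the index is rebuilt.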
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `src/VectorDatabase.php` | Adds delete/update APIs, persists deleted-node state, and changes `count()` to active-doc count. |
| `src/HNSW/Index.php` | Adds soft-delete set, filters deleted nodes from search results, and persists deleted IDs via export/import state. |
| `src/BM25/Index.php` | Adds `removeDocument()` for delete/update support. |
| `tests/VectorDatabaseTest.php` | Adds unit tests for delete/update behavior in memory. |
| `tests/PersistenceTest.php` | Adds persistence tests for delete/update and doc-file removal expectations. |
| `README.md` | Documents the new delete/update APIs and persistence expectations. |
| `composer.json` | Moves `PHPVector\\Benchmark\\` autoloading into production autoload (needed for `--no-dev` installs). |
| `benchmark/run.php` | New benchmark CLI entrypoint (profiles/scenarios/output formats). |
| `benchmark/Benchmark.php` | New benchmark implementation (insert/search/update/delete/recall/persistence). |
| `benchmark/ResultFormatter.php` | New Markdown + GitHub-benchmark JSON formatting. |
| `benchmark/Report.php` | Removes legacy report generator. |
| `benchmark/benchmark.php` | Removes legacy benchmark entrypoint. |
| `.github/workflows/benchmark.yml` | Adds benchmark workflow for PR comparison + main history storage. |
| `.github/workflows/test.yml` | Bumps cache/codecov GitHub Action versions. |
ezimuel
left a comment
I left a minor comment on a CI setting for deployments.
Also, I would like to understand why we are using Markdown as the output format for the benchmark. I checked the output on Actions and it doesn't seem to render Markdown, see here. Maybe we could post the benchmark output as a comment on the PR, or do you have other proposals? Thanks.
@ezimuel you’re right, Markdown isn’t rendered in the Actions output. I’m now working on posting the benchmark results as a comment on the PR, with an "automatic comparison" against the base branch.
@ezimuel I’m just trying to understand how to run a more effective performance benchmark.
- this is a "simple" draft: feat/update-delete