This repository provides a Dockerized setup for running ColBERT search indexing and querying using Flask. The service allows indexing and searching over text collections stored in TSV files.
Ensure you have Docker and Docker Compose installed. Then, build and start the service with:
docker-compose up --build
Before initializing searchers, you must index the data.
Indexes documents from a TSV file into ColBERT.
<idx_name> corresponds to a file named <idx_name>.tsv located in the /data directory.
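As a rough illustration of preparing such a file, the sketch below writes a list of passages to `<idx_name>.tsv`. It assumes the two-column `id<TAB>text` layout that ColBERT collections commonly use; this README does not spell out the expected columns, so check them against your setup. The helper name `write_collection` and the local `data` directory (assumed to be the folder mounted as /data in the container) are hypothetical.

```python
from pathlib import Path


def write_collection(idx_name, passages, data_dir="data"):
    """Write passages to <data_dir>/<idx_name>.tsv as `id<TAB>text` rows.

    Assumption: the service expects one passage per line, with an integer
    id in the first column and the passage text in the second.
    """
    path = Path(data_dir) / f"{idx_name}.tsv"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8", newline="\n") as f:
        for pid, text in enumerate(passages):
            # Collapse tabs and newlines inside a passage so that each
            # passage stays on one line with exactly two columns.
            clean = " ".join(text.split())
            f.write(f"{pid}\t{clean}\n")
    return path
```

Calling `write_collection("my_docs", ["first passage", "second passage"])` would create `data/my_docs.tsv`, which could then be indexed under the name `my_docs`.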
To index your files, run:
POST http://localhost:9881/api/index/<idx_name>
Initializes searchers for all indexed datasets.
Must be called after /api/index/<idx_name> has been executed for the relevant dataset.
To initialize searchers, run:
POST http://localhost:9881/init_searchers
Performs a search query on the indexed dataset specified by <idx_name>.
Returns the top k results.
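The required order of calls (index first, then initialize searchers) can be scripted. The following is a minimal sketch using only the Python standard library. It assumes that both endpoints accept a POST with an empty body and that the indexing request returns only once indexing has finished; neither is confirmed by this README. The function names are hypothetical. The search request is left out because its URL is not given here.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9881"


def index_url(idx_name, base_url=BASE_URL):
    """Build the indexing URL for the dataset <idx_name>."""
    return f"{base_url}/api/index/{urllib.parse.quote(idx_name, safe='')}"


def init_searchers_url(base_url=BASE_URL):
    """Build the URL that initializes searchers for all indexed datasets."""
    return f"{base_url}/init_searchers"


def post(url):
    """Send a POST with an empty body (assumption) and return (status, body)."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def index_then_init(idx_name, base_url=BASE_URL):
    """Index one dataset, then initialize searchers.

    urlopen raises urllib.error.HTTPError on a 4xx/5xx response, so a
    failed indexing request stops the sequence before initialization.
    """
    results = [post(index_url(idx_name, base_url))]
    results.append(post(init_searchers_url(base_url)))
    return results
```

With the service running, `index_then_init("my_docs")` would index `/data/my_docs.tsv` and then initialize the searchers, returning the status code and response body of each call.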
Dockerfile: Defines the containerized environment
docker-compose.yaml: Manages the service dependencies
app.py: The Flask API implementation
/data: Directory where TSV files (<idx_name>.tsv) are stored for indexing
/experiments: Stores indexed data
/checkpoint: Stores model checkpoints
To run the ColBERT service inside Docker:
docker-compose up --build
Ensure that your data files (<idx_name>.tsv) are placed inside the /data directory before starting the indexing process.