langchain-pixeltable

LangChain VectorStore integration backed by Pixeltable -- multimodal data infrastructure with built-in embedding indexes, metadata filtering, computed column lineage, and incremental computation.

Installation

pip install langchain-pixeltable

Quick Start

Works with any LangChain Embeddings model -- cloud or local:

from langchain_pixeltable import PixeltableVectorStore
from langchain_huggingface import HuggingFaceEmbeddings  # runs locally, no API key needed (pip install langchain-huggingface)

vs = PixeltableVectorStore.from_texts(
    texts=[
        "Pixeltable handles multimodal data",
        "LangChain builds LLM applications",
        "Vector databases store embeddings",
    ],
    embedding=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2"),
    metadatas=[
        {"category": "infra"},
        {"category": "framework"},
        {"category": "infra"},
    ],
    table_name="mydir.docs",
)

# Similarity search
results = vs.similarity_search("multimodal data management", k=2)
for doc in results:
    print(doc.page_content)

Filtered Similarity Search

The filter parameter maps to Pixeltable's .where() clause -- predicates are evaluated before ranking, so only matching rows participate in the similarity sort:

# Only search within "infra" documents
results = vs.similarity_search(
    "data storage", k=5, filter={"category": "infra"},
)

# With scores
results = vs.similarity_search_with_score(
    "embeddings", k=3, filter={"category": "infra"},
)
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")
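
To see why the order of filtering and ranking matters, here is a minimal, library-free sketch. It does not use Pixeltable; the rows, scores, and the helper names pre_filter_top_k and post_filter_top_k are hypothetical and exist only for illustration. Filtering first guarantees that every returned row matches the predicate, whereas ranking first can lose matches that fall outside the global top k.

```python
# Toy rows: (text, category, similarity score to some query). Values are made up.
ROWS = [
    ("doc A", "framework", 0.95),
    ("doc B", "framework", 0.90),
    ("doc C", "infra", 0.80),
    ("doc D", "infra", 0.70),
    ("doc E", "infra", 0.60),
]

def pre_filter_top_k(rows, category, k):
    """Filter first, then rank: every returned row matches the predicate."""
    matching = [r for r in rows if r[1] == category]
    return sorted(matching, key=lambda r: r[2], reverse=True)[:k]

def post_filter_top_k(rows, category, k):
    """Rank first, then filter: matches outside the global top k are lost."""
    top_k = sorted(rows, key=lambda r: r[2], reverse=True)[:k]
    return [r for r in top_k if r[1] == category]

pre = pre_filter_top_k(ROWS, "infra", k=2)
post = post_filter_top_k(ROWS, "infra", k=2)
print([r[0] for r in pre])   # ['doc C', 'doc D']
print([r[0] for r in post])  # []
```

With k=2, the post-filter variant returns nothing because both of the globally best rows belong to another category, while the pre-filter variant still returns two matching rows.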

Access the Underlying Pixeltable Table

The .table property gives direct access to the Pixeltable table for operations beyond the VectorStore interface -- computed columns, lineage, version history, and arbitrary predicates:

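import numpy as np
import pixeltable as pxt
from langchain_huggingface import HuggingFaceEmbeddings

t = vs.table

# Inspect all data
t.select(t.text, t.metadata, t.embedding).collect()

# Example UDF (illustrative; substitute your own logic)
@pxt.udf
def my_word_counter(text: str) -> int:
    return len(text.split())

# Add a computed column -- auto-backfills all existing rows
t.add_computed_column(word_count=my_word_counter(t.text))

# New inserts via the wrapper also populate computed columns automatically
vs.add_texts(["New document"], metadatas=[{"category": "infra"}])

# Embed the query with the same model used to build the store
# (assumed here to be the Quick Start model)
query_vec = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2").embed_query("data storage")

# WHERE on computed columns + similarity
sim = t.embedding.similarity(vector=np.array(query_vec, dtype=np.float32))
results = (
    t.where(t.word_count > 5)
    .order_by(sim, asc=False)
    .limit(3)
    .select(t.text, t.word_count, sim=sim)
    .collect()
)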
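
Conceptually, a computed column is a function attached to the table: it is evaluated once for every existing row when the column is added (the backfill) and again for each row inserted later. The following is a toy, in-memory sketch of that behavior, not Pixeltable's implementation; ToyTable and its methods are hypothetical names used only for illustration.

```python
class ToyTable:
    """In-memory stand-in for a table with computed columns (illustration only)."""

    def __init__(self):
        self.rows = []       # each row is a dict of column name -> value
        self.computed = {}   # computed column name -> function(row) -> value

    def insert(self, **values):
        row = dict(values)
        # Evaluate every registered computed column for the new row.
        for name, fn in self.computed.items():
            row[name] = fn(row)
        self.rows.append(row)

    def add_computed_column(self, name, fn):
        self.computed[name] = fn
        # Backfill: evaluate the new column for all rows already stored.
        for row in self.rows:
            row[name] = fn(row)


table = ToyTable()
table.insert(text="Pixeltable handles multimodal data")
table.add_computed_column("word_count", lambda row: len(row["text"].split()))
table.insert(text="New document")

print([row["word_count"] for row in table.rows])  # [4, 2]
```

The first row receives its word count through the backfill, and the second receives it at insert time, which mirrors the two comments in the example above.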

Connect to an Existing Pixeltable Table

Connect to any existing Pixeltable table -- including tables with multimodal columns like images or video:

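from langchain_openai import OpenAIEmbeddings  # requires the langchain-openai package and an OpenAI API key

vs = PixeltableVectorStore.from_existing_table(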
    table_name="mydir.existing_docs",
    embedding=OpenAIEmbeddings(),
    text_column="content",
    embedding_column="content_embedding",
)
results = vs.similarity_search("search query", filter={"source": "arxiv"})

Use as a LangChain Retriever

retriever = vs.as_retriever(search_kwargs={"k": 5})
docs = retriever.invoke("What is Pixeltable?")
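
Retrieved documents are typically joined into a single context string before being passed to a prompt. A minimal sketch of such a helper follows. The name format_docs is hypothetical, and FakeDoc is a stand-in that mimics only the page_content attribute of a LangChain Document so that the snippet runs without any dependencies.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Stand-in exposing the one attribute the helper relies on."""
    page_content: str


def format_docs(docs):
    """Join the text of each document, separated by a blank line."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    FakeDoc("Pixeltable handles multimodal data"),
    FakeDoc("Vector databases store embeddings"),
]
context = format_docs(docs)
print(context)
```

In practice, the list returned by retriever.invoke(...) could be passed to the same helper in place of the FakeDoc list.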

Why Pixeltable as a Vector Backend?

Metadata filtering via .where(): Filter on metadata fields before ranking, not post-hoc
Computed column lineage: Add derived columns that auto-backfill and auto-compute on new inserts
Persistent and versioned: Data survives restarts; every change is tracked
Incremental: Only new/changed rows get re-embedded
Multimodal native: Images, video, audio, and documents alongside text
Any embedding model: Works with OpenAI, Hugging Face, or any local model
No external services: Embedded PostgreSQL, no Docker required
