pip install -r requirements.txt
This installs:
- chromadb - Local vector database
- sentence-transformers - For generating embeddings
Place your Claude-processed markdown files in a directory structure like:
your-textbook/
├── 2-Processed-Chapters/
│   ├── Chapter-01-Introduction-Readable.md
│   ├── Chapter-01-Introduction-AI-Tagged.md
│   ├── Chapter-01-Introduction-Quick-Reference.md
│   ├── Chapter-02-System-Design-Readable.md
│   └── ...
└── 3-Topic-Guides/
    ├── Complete-Nutrients-Guide-Readable.md
    └── ...
Edit akbs_ingest_markdown.py and update the main() function:
def main():
    # Initialize ingester
    ingester = AKBSIngester(db_path="./data/knowledge_db")

    # Ingest your processed chapters
    ingester.ingest_directory(
        Path("./your-textbook/2-Processed-Chapters"),
        source_name="RAS Textbook"
    )

    # Ingest your topic guides
    ingester.ingest_directory(
        Path("./your-textbook/3-Topic-Guides"),
        source_name="RAS Textbook - Topic Guides"
    )

Then run:
python akbs_ingest_markdown.py
Interactive mode:
python akbs_query.py
Single query from command line:
python akbs_query.py "What is the optimal pH for lettuce?"
Markdown File
↓
Extract metadata (chapter #, type, source)
↓
Extract XML tags (if AI-tagged version)
↓
Chunk by headers and paragraphs (~1000 chars each)
↓
Generate embeddings automatically
↓
Store in ChromaDB with metadata
↓
Ready for querying!
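To make the chunking step more concrete, here is a minimal sketch of splitting markdown at headers and then packing paragraphs up to roughly 1000 characters. The function name chunk_markdown_sketch is hypothetical, and the real chunk_markdown in akbs_ingest_markdown.py may behave differently.

```python
import re


def chunk_markdown_sketch(text, max_chunk_size=1000):
    """Split markdown at headers, then pack paragraphs up to max_chunk_size.

    Illustrative only: an assumption about how header/paragraph chunking
    could work, not the ingester's actual implementation.
    """
    # Split just before every markdown header line so each section stands alone.
    sections = re.split(r"(?m)^(?=#{1,6}\s)", text)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chunk_size:
            chunks.append(section)
            continue
        # Section is too long: greedily pack whole paragraphs into chunks.
        current = ""
        for para in re.split(r"\n\s*\n", section):
            para = para.strip()
            if not para:
                continue
            candidate = f"{current}\n\n{para}" if current else para
            if len(candidate) <= max_chunk_size or not current:
                # A single oversized paragraph is kept whole, not cut mid-sentence.
                current = candidate
            else:
                chunks.append(current)
                current = para
        if current:
            chunks.append(current)
    return chunks
```

Splitting at headers first keeps each chunk on one topic, which tends to help retrieval more than a fixed-width character split.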
For each chunk:
- Document text - The actual content
- Metadata:
  - filename - Original file name
  - chapter - Chapter number (if detected)
  - type - readable, ai_tagged, or quick_reference
  - source - Name of source document
  - has_tags - XML tags found (for AI-tagged files)
  - chunk_index - Position in original document
  - ingested_at - When it was added
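The metadata fields listed above could be assembled along these lines. This is a sketch under assumptions: build_chunk_metadata is a hypothetical helper, the chapter number is assumed to come from a "Chapter-NN" filename prefix, and has_tags is assumed to be a comma-separated string of tag names so that every value stays a simple scalar.

```python
import re
from datetime import datetime, timezone


def build_chunk_metadata(filename, file_type, source_name, chunk_text, chunk_index):
    """Assemble per-chunk metadata (hypothetical helper, not the ingester's code)."""
    meta = {
        "filename": filename,
        "type": file_type,
        "source": source_name,
        "chunk_index": chunk_index,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    # Assumption: chapter numbers appear as "Chapter-01", "Chapter-02", ...
    match = re.search(r"Chapter-(\d+)", filename, re.IGNORECASE)
    if match:
        meta["chapter"] = int(match.group(1))
    if file_type == "ai_tagged":
        # Collect the names of opening tags such as <parameter> or <optimal>.
        tag_names = sorted(set(re.findall(r"<([A-Za-z_][\w-]*)>", chunk_text)))
        if tag_names:
            meta["has_tags"] = ",".join(tag_names)
    return meta
```

Fields that cannot be detected (chapter, has_tags) are simply left out rather than stored as empty values.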
When you query "What is optimal pH for lettuce?":
- Your question is converted to an embedding
- ChromaDB finds most similar document chunks
- Returns top results with metadata
- You see relevant content from your textbooks!
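The query flow above can be illustrated with a toy, dependency-free sketch. Note the loud assumption: word counts and cosine similarity stand in here for the real sentence-transformers embeddings and ChromaDB search, and the sample chunks and filenames are made up for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_chunks(question, chunks, n_results=2):
    """Rank stored chunks by similarity to the question and keep the best ones."""
    query_vec = toy_embed(question)
    ranked = sorted(
        chunks,
        key=lambda c: cosine(query_vec, toy_embed(c["text"])),
        reverse=True,
    )
    return ranked[:n_results]


# Made-up placeholder chunks with made-up filenames.
sample_chunks = [
    {"text": "Notes on pH ranges for lettuce", "filename": "example-a.md"},
    {"text": "Tilapia feeding schedule notes", "filename": "example-b.md"},
    {"text": "Pump maintenance checklist for the sump", "filename": "example-c.md"},
]

best = top_chunks("What is optimal pH for lettuce?", sample_chunks, n_results=1)
print(best[0]["filename"])
```

A real embedding model matches on meaning rather than shared words, so it can find relevant chunks even when the question and the text use different vocabulary.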
from pathlib import Path
from akbs_ingest_markdown import AKBSIngester

ingester = AKBSIngester()

# Ingest a single file
ingester.ingest_file(
    Path("Chapter-01-Introduction-Readable.md"),
    source_name="My Textbook"
)

# Ingest a whole directory
ingester.ingest_directory(
    Path("./processed-chapters"),
    source_name="Aquaponics Bible"
)

# Query the knowledge base
results = ingester.query("optimal pH for lettuce", n_results=5)

# Access results
for doc, meta in zip(results['documents'], results['metadatas']):
    print(f"From: {meta['filename']}")
    print(f"Content: {doc[:200]}...")
    print()

# In your sensor monitoring code
from akbs_ingest_markdown import AKBSIngester

kb = AKBSIngester()

# When pH reading comes in
current_ph = 6.2
current_crop = "lettuce"

# Query the knowledge base
results = kb.query(
    f"optimal pH range for {current_crop}",
    n_results=3
)

# Get guidance
if results['documents']:
    guidance = results['documents'][0]
    print(f"Knowledge Base says: {guidance}")

The ingester automatically detects file types:
- *-Readable.md → type: "readable"
- *-AI-Tagged.md → type: "ai_tagged" (extracts XML tags)
- *-Quick-Reference.md → type: "quick_reference"
- Other .md files → type: "general"
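The filename rules above amount to a suffix lookup. Here is a minimal sketch, assuming case-insensitive matching; detect_file_type is a hypothetical name and the ingester's own detection logic may differ.

```python
# Suffix-to-type table mirroring the naming rules listed above.
SUFFIX_TO_TYPE = (
    ("-readable.md", "readable"),
    ("-ai-tagged.md", "ai_tagged"),
    ("-quick-reference.md", "quick_reference"),
)


def detect_file_type(filename):
    """Map a markdown filename to its document type (hypothetical helper)."""
    lowered = filename.lower()  # Assumption: matching ignores case.
    for suffix, file_type in SUFFIX_TO_TYPE:
        if lowered.endswith(suffix):
            return file_type
    # Any other markdown file falls back to the generic type.
    return "general"
```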
ingester = AKBSIngester()
chunks = ingester.chunk_markdown(text, max_chunk_size=500)  # Smaller chunks

For AI-tagged files, tags are automatically extracted:
xml_tags = ingester.extract_xml_tags(content)
# Returns: {'parameter': [...], 'value': [...], 'optimal': [...]}

print(f"Total documents: {ingester.collection.count()}")
- Make sure you've run the ingestion script first
- Check that your file paths are correct
- Try broader queries
- Make sure relevant content was ingested
- Check that database path is correct
- Make sure the ./data/knowledge_db directory is writable
- Try deleting and recreating the database
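Several of the checks above can be automated before digging further. The following is a sketch that uses only the standard library; check_db_path is a hypothetical helper and it inspects the directory itself, not the ChromaDB contents.

```python
import os
from pathlib import Path


def check_db_path(db_path="./data/knowledge_db"):
    """Return a list of human-readable problems found with the database directory."""
    path = Path(db_path)
    problems = []
    if not path.exists():
        problems.append(f"{path} does not exist; run the ingestion script first")
    elif not path.is_dir():
        problems.append(f"{path} is not a directory")
    else:
        if not os.access(path, os.W_OK):
            problems.append(f"{path} is not writable")
        if not any(path.iterdir()):
            problems.append(f"{path} is empty; nothing has been ingested yet")
    return problems


if __name__ == "__main__":
    for problem in check_db_path() or ["No obvious problems with the database path"]:
        print(problem)
```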
Copy these files to your Pi:
scp akbs_ingest_markdown.py pi@raspberrypi:/home/pi/aquaponics/
scp requirements.txt pi@raspberrypi:/home/pi/aquaponics/
scp -r data/knowledge_db pi@raspberrypi:/home/pi/aquaponics/data/
Then in your sensor code:
from akbs_ingest_markdown import AKBSIngester

kb = AKBSIngester(db_path="/home/pi/aquaponics/data/knowledge_db")
guidance = kb.query("pH is 6.2, what should I do?")

# Get learning content
results = kb.query("explain nitrogen cycle", n_results=10)

# Extract readable content
lesson_content = "\n\n".join(results['documents'])

# Get parameters for modeling
params = kb.query("tilapia growth rates at 75 degrees")

- ✅ Ingest your Claude-processed files
- ✅ Test queries with akbs_query.py
- Build Python API wrapper (optional)
- Connect to sensor system
- Add more documents over time
By default, the knowledge base is stored at:
./data/knowledge_db/
This directory contains:
- ChromaDB index files
- Embeddings
- Metadata
Important: This directory is PORTABLE! You can:
- Copy it to other machines
- Back it up
- Version control it (though the directory may be large)
- Share it with others
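Because the directory is self-contained, a backup can be as simple as archiving it. Below is a minimal sketch; backup_knowledge_db and the ./backups location are assumptions, and it is safest to run this while no ingestion or query process has the database open.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_knowledge_db(db_path="./data/knowledge_db", backup_dir="./backups"):
    """Archive the knowledge base directory into a timestamped .tar.gz file."""
    src = Path(db_path)
    if not src.is_dir():
        raise FileNotFoundError(f"No knowledge base directory at {src}")
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # make_archive appends the .tar.gz extension and returns the archive path.
    archive = shutil.make_archive(
        str(dest_dir / f"knowledge_db-{stamp}"),
        "gztar",
        root_dir=src.parent,
        base_dir=src.name,
    )
    return Path(archive)
```

To restore, unpack the archive into the data directory (for example with shutil.unpack_archive) and point db_path at the result.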
- First query may be slow (loading models)
- Subsequent queries are fast (<1 second)
- Ingestion speed: ~50-100 documents/minute
- Database size: ~10-50 MB per textbook (depending on size)
You now have a queryable, persistent knowledge base from your Claude-processed textbooks! 🎉