Date: Feb 06, 2026
Current Status: Alpha (Debug Console Implemented)
- Transition to Django: Migrated from Chainlit/Streamlit to a robust Django backend.
- UI Redesign: Implemented a "Medical/FinTech" style 3-column workstation layout using Tailwind CSS.
- Ajax Integration: Converted the main form execution to use `fetch` for a non-blocking UI experience.
The user requested a real-time Debug Console in the UI.
**Implementation Decisions:**
- Log Capture: We will configure Django's `LOGGING` to write to a file, `system.log`.
- Concurrency: Since the `convert` view is synchronous and blocking, we rely on Django's `runserver` threading to allow a secondary `/api/logs/` endpoint to be polled while the main conversion runs.
- UI: A floating, draggable (or fixed absolute) window toggled by a Terminal icon button.
- Ingestion Strategy: Switched `core/ingestion.py` to prioritize Firecrawl for web URLs (`http*`) and reserve LlamaParse for local files (`.pdf`), as LlamaParse failed to handle direct web URLs without a loader.
- LLM URL Fix (Feb 6): Fixed `generate_structured` in `engine.py`. The code was appending `/v1` to the Ollama base URL, causing LiteLLM to hit a malformed endpoint (`/v1/api/generate`). Removed the `/v1` suffix, as LiteLLM handles Ollama natively.
- Instructor JSON Mode (Feb 6): Switched Instructor to use `mode=instructor.Mode.JSON` for Ollama compatibility, as Tool Calling mode was causing parsing errors.
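The log-capture and polling decisions above can be sketched with the stdlib `logging.config.dictConfig` (the same schema Django's `LOGGING` setting uses). The handler name, log path, and `tail_logs` helper are illustrative assumptions, not the project's actual code:

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical log path; the project writes to system.log instead.
LOG_PATH = os.path.join(tempfile.gettempdir(), "convertit_debug.log")

# Django's LOGGING setting accepts this same dictConfig schema.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "verbose": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
    },
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "filename": LOG_PATH,
            "formatter": "verbose",
        },
    },
    "root": {"handlers": ["file"], "level": "INFO"},
}

logging.config.dictConfig(LOGGING)
logging.getLogger("converter").info("conversion started")


def tail_logs(n: int = 50) -> list[str]:
    """Return the last n log lines, as a polled /api/logs/ endpoint might."""
    with open(LOG_PATH, encoding="utf-8") as fh:
        return fh.readlines()[-n:]
```

Because the file handler flushes each record, the polling endpoint sees log lines while the blocking `convert` view is still running in another `runserver` thread.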
Enhanced the configuration panel with user-requested improvements:
- Added drag-and-drop area for PDF/TXT file uploads
- Backend now handles file uploads with priority: File > URL > Test mode
- Temp file handling with cleanup
- Changed from static mock to functional toggle with 4 options:
- AI Gen: Generate new images with AI
- Hybrid: Use original images + AI enhancements
- Original: Keep original images only
- Text Only: No images, text content only
- Value passed to workflow state for future vision pipeline integration
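The four toggle values can be modeled as an enum before being placed in the workflow state; the member names and `init_state` helper are assumptions, not the project's actual identifiers:

```python
from enum import Enum


class ImageMode(str, Enum):
    # Hypothetical values mirroring the four UI toggle options
    AI_GEN = "ai_gen"
    HYBRID = "hybrid"
    ORIGINAL = "original"
    TEXT_ONLY = "text_only"


def init_state(image_mode: str) -> dict:
    """Validate the toggle value and stash it in the workflow state."""
    return {"image_mode": ImageMode(image_mode)}  # raises ValueError on bad input
```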
- Added 5 persona options with appropriate prompts:
- 🎒 5th Grader (Simple & Fun)
- 📚 High School Student
- 🎓 Undergraduate
- 💼 Professional/Expert
- 📊 Executive Summary
- Each has a tailored prompt in `agents/prompts.py`
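The persona selection can be sketched as a simple prompt lookup; the wording of each prompt here is invented for illustration and does not reproduce `agents/prompts.py`:

```python
# Hypothetical persona prompts; the real ones live in agents/prompts.py.
PERSONA_PROMPTS = {
    "5th_grader": "Rewrite this for a 10-year-old: short sentences, fun examples.",
    "high_school": "Rewrite this for a high-school student with clear definitions.",
    "undergraduate": "Rewrite this for an undergraduate, keeping technical terms.",
    "professional": "Rewrite this for a domain expert; be precise and concise.",
    "executive": "Summarize this for an executive: key points and implications only.",
}


def get_persona_prompt(persona: str) -> str:
    """Return the tailored prompt, defaulting to the professional register."""
    return PERSONA_PROMPTS.get(persona, PERSONA_PROMPTS["professional"])
```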
- Settings button added to Column 1 footer (gear icon)
- Opens modal with configuration for:
- LLM Provider: Local (Ollama), OpenAI, Anthropic
- Ollama Model: Text input (default: llama3.1:8b)
- API Key: For cloud providers
- Image Provider: ComfyUI, DALL-E, Disabled
- ComfyUI URL: For local image generation
- RAG Folder: Default `./document/`
- Output Folder: Default `./document/convertit_output/`
- Settings persisted to localStorage and .env file
- Backend API endpoint `/api/settings/` for persistence
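A hedged sketch of how the settings endpoint might persist values to the `.env` file; the key names and merge behavior are assumptions, not the endpoint's actual implementation:

```python
import pathlib


def save_settings(settings: dict, env_path: str = ".env") -> None:
    """Merge KEY=value pairs into the .env file, overwriting existing keys."""
    path = pathlib.Path(env_path)
    current: dict[str, str] = {}
    if path.exists():
        for raw in path.read_text().splitlines():
            if "=" in raw and not raw.startswith("#"):
                key, value = raw.split("=", 1)
                current[key] = value
    for key, value in settings.items():
        current[key.upper()] = str(value)  # .env convention: uppercase keys
    path.write_text("".join(f"{k}={v}\n" for k, v in current.items()))
```

Rewriting the whole file after merging keeps the file free of duplicate keys when a setting is saved twice.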
- Terminal icon button restored in Column 2 header
- Floating draggable console with live log polling
Files Modified:
- `web_ui/templates/converter/index.html` - Settings modal, debug console, UI components
- `static/js/app.js` - Event handlers for dropzone, toggle, settings modal
- `converter/views.py` - File upload handling, settings API endpoint
- `converter/urls.py` - Added settings route, fixed duplicate patterns
- `agents/prompts.py` - New persona prompts
- `agents/workflow.py` - Prompt selection logic
Implemented knowledge base indexing to improve converter output quality.
- Place PDF, TXT, or MD files in the RAG database folder
- Click "Index Now" in Settings to scan and index documents
- Indexed content is chunked and stored in ChromaDB vector store
- During conversion, relevant chunks are retrieved and injected into prompts
- `core/indexer.py`: Document indexer service with:
- File scanning and hash-based change detection
- Text chunking with overlap for better retrieval
- PDF/TXT/MD support
- Singleton pattern for efficient reuse
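The hash-based change detection and overlapped chunking can be sketched as below; the chunk size and overlap values are illustrative defaults, not the indexer's actual settings:

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Content hash stored per file so unchanged documents are skipped on re-index."""
    return hashlib.sha256(data).hexdigest()


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap, so retrieval keeps
    context that straddles a chunk boundary."""
    chunks: list[str] = []
    start = 0
    step = size - overlap  # each chunk advances by size minus the overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        start += step
    return chunks
```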
- "Index Now" button in Settings modal
- Updated default folders:
- RAG Database: `./document/convertit/database`
- Output: `./document/convertit/output`
- Status feedback showing indexed/skipped/failed counts
`POST /api/index/` - Trigger document indexing
- `node_rewrite` in `workflow.py` now queries the RAG knowledge base
- Up to 3 relevant chunks injected as "Relevant Background Knowledge"
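A minimal sketch of the injection step; the heading text matches the log above, but the function name and prompt layout are assumptions:

```python
def inject_context(prompt: str, chunks: list[str], max_chunks: int = 3) -> str:
    """Prepend up to max_chunks retrieved passages before the task prompt."""
    if not chunks:
        return prompt  # no retrieval hits: leave the prompt untouched
    background = "\n\n".join(chunks[:max_chunks])
    return f"Relevant Background Knowledge:\n{background}\n\n{prompt}"
```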
Optimized the LLM pipeline for better accuracy and reduced remote API costs.
- Lower threshold: 15K → 6K chars for more thorough processing
- Heading-aware splitting: Preserves document structure
- Context carryover: Passes summary between chunks for coherence
- Task-based routing: Simple tasks (clean, glossary) use local LLM
- Quality-critical tasks: (rewrite, critic) use remote LLM when configured
- Auto-detection: Checks if Ollama is available before routing
- `core/engine.py` - Added `get_model_for_task()` method
- `agents/workflow.py` - Improved chunking, added `task_type` to all LLM calls