A production-ready, modular AI companion system with multiple interaction modes, persistent memory, and extensible integrations.
- 🎤 Voice Interaction: Speech-to-text and text-to-speech for natural conversations
- 🧠 Persistent Memory: Semantic search with conversation history and context
- 🔌 Multiple LLM Backends: Support for local (Kobold) and cloud (OpenAI) models
- 🎭 Unity 3D Avatar: Visual representation with synchronized animations
- 🤖 Bot Integrations: Discord and Telegram support
- 📚 WaniKani Integration: Japanese learning context awareness
- 🔒 Security: Input validation, rate limiting, and secret management
- 📊 Performance Monitoring: Track response times and metrics
# Clone the repository
git clone <repository-url>
cd ai-companion
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt

Copy the example environment file and configure your API keys:
copy .env.example .env   # use "cp" on Linux/Mac

Edit .env and add your API keys:
- OPENAI_API_KEY: Your OpenAI API key (if using the OpenAI backend)
- DISCORD_TOKEN: Your Discord bot token (if using Discord)
- TELEGRAM_TOKEN: Your Telegram bot token (if using Telegram)
- WANIKANI_API_KEY: Your WaniKani API key (if using WaniKani)
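For example, a populated .env might look like this (all values below are placeholders):

```
OPENAI_API_KEY=sk-your-key-here
DISCORD_TOKEN=your-discord-bot-token
TELEGRAM_TOKEN=your-telegram-bot-token
WANIKANI_API_KEY=your-api-key
```

Only the keys for the backends and integrations you actually enable are required.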
# Run in voice mode (default)
python main.py
# Run in text mode (for testing without microphone)
python main.py --mode text
# Use custom configuration
python main.py --config my_config.yaml

python main.py --mode voice

Speak naturally to interact with the AI companion. The system will:
- Listen for your speech
- Convert speech to text
- Generate an AI response
- Convert response to speech
- Play the audio
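The steps above can be sketched as a single pipeline function. The function names here are illustrative stand-ins, not the project's actual API:

```python
def run_voice_loop(listen, transcribe, generate, synthesize, play):
    """One pass through the voice pipeline: mic -> STT -> LLM -> TTS -> speaker."""
    audio = listen()            # capture microphone audio
    text = transcribe(audio)    # speech-to-text
    reply = generate(text)      # LLM response
    speech = synthesize(reply)  # text-to-speech
    play(speech)                # play the audio
    return text, reply
```

Injecting the five stages as callables keeps the loop independent of which STT/LLM/TTS backends are configured.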
python main.py --mode text

Type messages to interact with the AI companion. Useful for testing without a microphone.
python main.py --mode discord

Runs the AI companion as a Discord bot. Configure in .env:
DISCORD_TOKEN=your-discord-bot-token

Then enable in config/default_config.yaml:
integrations:
  discord:
    enabled: true

Discord Commands:
- !join - Join your voice channel
- !leave - Leave the voice channel
- !talk <message> - Talk to the AI
- !reset - Reset the conversation
- Mention the bot (@BotName) to chat
python main.py --mode telegram

Runs the AI companion as a Telegram bot. Configure in .env:
TELEGRAM_TOKEN=your-telegram-bot-token

Then enable in config/default_config.yaml:
integrations:
  telegram:
    enabled: true

Telegram Commands:
- /start - Start the bot
- /help - Show help
- /reset - Reset the conversation
- /status - Show bot status
- Or just send any message to chat!
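The command handling above boils down to a small dispatcher. This is a library-agnostic sketch with illustrative replies; the project's actual Telegram handlers are not shown here:

```python
def handle_command(text, state):
    """Dispatch Telegram-style commands; plain text falls through to chat.
    Replies and state layout are illustrative, not the project's actual ones."""
    if text == "/start":
        return "Bot started."
    if text == "/help":
        return "Commands: /start /help /reset /status"
    if text == "/reset":
        state["history"] = []          # drop the conversation context
        return "Conversation reset."
    if text == "/status":
        return f"History length: {len(state.get('history', []))}"
    state.setdefault("history", []).append(text)  # ordinary chat message
    return "chat: " + text
```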
Unity mode runs alongside other modes. Enable in config/default_config.yaml:
integrations:
  unity:
    enabled: true
    address: 127.0.0.1
    port: 5005
    audio_dir: Audio/

Then run with voice or text mode:
python main.py --mode voice   # With Unity avatar

The Unity client should listen on UDP port 5005 for animation triggers.
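Sending a trigger to the Unity client is a plain UDP datagram. The JSON payload shape below is an assumption for illustration, not the project's actual wire format:

```python
import json
import socket

def send_animation_trigger(name, address="127.0.0.1", port=5005):
    """Fire a one-shot animation trigger at the Unity client over UDP.
    The {"animation": ...} payload is a hypothetical message shape."""
    payload = json.dumps({"animation": name}).encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (address, port))
    finally:
        sock.close()
```

UDP is a good fit here: triggers are fire-and-forget, and a dropped packet only costs one animation cue.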
For Japanese learning context, enable WaniKani:
- Get API key from https://www.wanikani.com/settings/personal_access_tokens
- Add to .env: WANIKANI_API_KEY=your-api-key
- Enable in config:
  integrations:
    wanikani:
      enabled: true
The AI will know your learned vocabulary and kanji!
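WaniKani's v2 REST API authenticates with a bearer token and a pinned API revision. A minimal header builder (the project's wanikani integration module is not shown here):

```python
WANIKANI_BASE_URL = "https://api.wanikani.com/v2"

def wanikani_headers(api_key):
    """Auth headers for the WaniKani v2 API: bearer token plus the
    Wanikani-Revision date the client was written against."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Wanikani-Revision": "20170710",
    }
```

Requests against endpoints such as {WANIKANI_BASE_URL}/assignments then return the user's learned items for the companion's context.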
The application uses a YAML configuration file (config/default_config.yaml) with the following sections:
app:
  mode: voice          # voice, text, discord, telegram, unity
  log_level: INFO
  log_file: logs/companion.log
  environment: development

llm:
  backend: kobold      # kobold or openai
  kobold_url: http://localhost:5001/api/v1/generate
  temperature: 0.7
  max_tokens: 300

tts:
  engine: piper        # piper, pyttsx3, or gpt_sovits
  piper_model_path: en_US-hfc_female-medium.onnx
  output_dir: Audio/

memory:
  storage_path: data/memory.db
  max_history: 20
  embedding_model: sentence-transformers/all-MiniLM-L6-v2
  similarity_threshold: 0.7

See config/default_config.yaml for all available options.
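A config manager for a layout like this typically overlays a user-supplied file onto the defaults. A minimal deep-merge sketch, independent of the project's actual ConfigManager API:

```python
def merge_config(defaults, overrides):
    """Recursively overlay user overrides onto default config values.
    Nested dicts are merged key by key; everything else is replaced."""
    result = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_config(result[key], value)
        else:
            result[key] = value
    return result
```

This lets a custom config file (python main.py --config my_config.yaml) change only the keys it mentions while inheriting the rest.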
ai-companion/
├── main.py # Unified entry point
├── config/ # Configuration management
│ ├── config_manager.py
│ ├── settings.py
│ └── default_config.yaml
├── core/ # Core components
│ ├── conversation_manager.py
│ └── memory_system.py
├── services/ # Service layer
│ ├── llm/ # LLM backends
│ ├── tts/ # TTS engines
│ ├── stt/ # Speech recognition
│ ├── audio/ # Audio processing
│ └── animation/ # Animation control
├── integrations/ # Bot integrations
│ ├── discord/
│ ├── telegram/
│ ├── wanikani/
│ └── unity/
└── utils/ # Utilities
├── validators.py
├── rate_limiter.py
└── logging_config.py
- Download KoboldCpp from releases
- Download a GGUF model (e.g., from Hugging Face)
- Start KoboldCpp:
koboldcpp.exe --model your-model.gguf --port 5001
- Configure in config/default_config.yaml:
  llm:
    backend: kobold
    kobold_url: http://localhost:5001/api/v1/generate
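A client for that endpoint is a small JSON POST. The field names follow the KoboldAI API (which calls the generation budget max_length); treat the exact payload as a sketch rather than the project's implementation:

```python
import json
import urllib.request

def build_kobold_payload(prompt, temperature=0.7, max_tokens=300):
    """Request body for KoboldCpp's /api/v1/generate endpoint."""
    return {"prompt": prompt, "temperature": temperature, "max_length": max_tokens}

def kobold_generate(prompt, url="http://localhost:5001/api/v1/generate"):
    """POST the prompt to a running KoboldCpp instance and return the text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_kobold_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]
```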
- Get an API key from OpenAI
- Add to .env: OPENAI_API_KEY=sk-your-key-here
- Configure in config/default_config.yaml:
  llm:
    backend: openai
- Download a Piper ONNX model from Piper releases
- Place the .onnx file in your project directory
- Configure in config/default_config.yaml:
  tts:
    engine: piper
    piper_model_path: en_US-hfc_female-medium.onnx
Uses your system's built-in TTS voices. No additional setup required.
tts:
  engine: pyttsx3
  voice_index: 1   # 0 for male, 1 for female
  rate: 180

The AI companion uses a semantic memory system that:
- Stores all conversations in a SQLite database
- Generates embeddings for semantic search
- Retrieves relevant context based on similarity
- Supports user-specific preferences and facts
Memory is stored in data/memory.db by default.
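The store-and-retrieve idea can be shown with a toy in-memory version. The real system uses sentence-transformers embeddings and SQLite; here a bag-of-words vector stands in for the embedding model, and the class and its methods are illustrative:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' standing in for sentence-transformers."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticMemory:
    """Minimal sketch of similarity-based recall (not the project's schema)."""
    def __init__(self, threshold=0.7):
        self.entries = []          # (text, embedding) pairs
        self.threshold = threshold

    def store(self, text):
        self.entries.append((text, embed(text)))

    def recall(self, query, top_k=3):
        """Return up to top_k stored texts above the similarity threshold."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [t for t, v in ranked[:top_k] if cosine(q, v) >= self.threshold]
```

The similarity_threshold config key plays the same role as the threshold parameter here: context below it is never injected into the prompt.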
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest
# Run with coverage
pytest --cov=. --cov-report=html

# Format code
black .
# Lint code
pylint **/*.py
# Type checking
mypy .

Install Piper TTS:
pip install piper-tts

Install ffmpeg:
- Windows: Download from ffmpeg.org
- Linux: sudo apt-get install ffmpeg
- Mac: brew install ffmpeg
- Check microphone permissions
- Ensure PyAudio is installed: pip install pyaudio
- Test the microphone with: python -m speech_recognition
- Ensure KoboldCpp is running
- Check the URL in configuration matches KoboldCpp's address
- Verify the model is loaded in KoboldCpp
[Your License Here]
Contributions are welcome! Please read our contributing guidelines before submitting pull requests.
For issues and questions, please open an issue on GitHub.