|
| 1 | +# FAQQer Bot - AI Coding Instructions |
| 2 | + |
| 3 | +## Project Overview |
| 4 | +A Telegram bot with three core functions: AI-powered FAQ responses, blockchain statistics monitoring, and customer service analysis for the Tari cryptocurrency project. |
| 5 | + |
| 6 | +## Architecture & Key Components |
| 7 | + |
| 8 | +### 1. Main Bot (`faqqer_bot.py`) |
| 9 | +- **Entry point** - Starts Telegram client, schedules jobs, handles commands |
| 10 | +- **FAQ System** - Multi-source FAQ loading: combines local `.txt` files and remote content from `.url` files in `faqs/` |
| 11 | +- **OpenAI Integration** - Uses GPT-4o with JSON response format, temperature 0.3 |
| 12 | +- **Periodic Refresh** - FAQ content auto-refreshes every hour via `periodic_faq_refresh()` |
| 13 | +- **Commands**: `/faq`, `/ask`, `/faqqer` (FAQ queries), `/refresh_faq`, `/analyze_support [hours] [question]`, `/version` |
| 14 | + |
| 15 | +### 2. Blockchain Stats (`blockchain_job.py`) |
| 16 | +- **Centralized Config** - THIS IS THE SINGLE SOURCE OF TRUTH for group IDs and customer analysis settings |
| 17 | +- **Important constants**: |
| 18 | + - `group_ids = [-1002281038272, -1188782007]` - Where to post blockchain stats and analysis |
| 19 | + - `ANALYSIS_CHANNELS`, `ANALYSIS_HOURS`, `CUSTOMER_SERVICE_GROUP_ID` - Customer analysis config |
| 20 | +- **Scheduled Jobs** - Posts hash rates every 3 hours via APScheduler with cron triggers |
| 21 | +- **Data Source** - Fetches from `https://textexplore.tari.com/?json` for block height and hash rates |
| 22 | + |
| 23 | +### 3. Customer Analysis (`customer_analysis_job.py`) |
| 24 | +- **Purpose** - Analyzes Telegram chat messages for customer service issues using OpenAI |
| 25 | +- **Key Feature** - Custom focused analysis: `/analyze_support [hours] <specific question>` narrows scope |
| 26 | +- **Token Management** - Truncates messages to fit ~25k token limit (keeps most recent messages) |
| 27 | +- **Language Handling** - Translates non-English messages to English for analysis |
| 28 | +- **Categories** - Pre-defined issue categories (Bridge reliability, Node setup, Wallet issues, etc.) |
| 29 | +- **Shared Client** - Requires user client (not bot) - imports `archiver_client` from `faq_archiver.py` |
| 30 | + |
| 31 | +### 4. Archiver (`faq_archiver.py`) |
| 32 | +- **Purpose** - Fetches chat history from Telegram channels for analysis |
| 33 | +- **Auth Type** - Uses **user authentication** (phone number), not bot token |
| 34 | +- **Session Files** - Creates `.session` files for persistent login |
| 35 | +- **Default Channels** - `["tariproject", "OrderOfSoon"]` |
| 36 | + |
| 37 | +## Critical Development Patterns |
| 38 | + |
| 39 | +### Configuration Management |
| 40 | +**ALL group IDs and channel settings MUST be in `blockchain_job.py`** - other modules import from there: |
| 41 | +```python |
| 42 | +from blockchain_job import ANALYSIS_CHANNELS, CUSTOMER_SERVICE_GROUP_ID |
| 43 | +``` |
| 44 | + |
| 45 | +### Telegram Client Patterns |
| 46 | +Two distinct client types exist: |
| 47 | +1. **Bot Client** (`faqqer_bot.py`) - Uses bot token, limited to bot API |
| 48 | +2. **User Client** (`faq_archiver.py`) - Requires phone number, can read full chat history |
| 49 | + |
| 50 | +### FAQ Content Loading |
| 51 | +- **Multi-source**: Scans `faqs/` for `.txt` (local) and `.url` (remote URLs) files |
| 52 | +- **HTML Detection**: Skips URLs returning HTML instead of raw text |
| 53 | +- **Combination**: Remote + local content merged, with source attribution headers |
| 54 | + |
| 55 | +### OpenAI Response Format |
| 56 | +Always use JSON mode for structured responses: |
| 57 | +```python |
| 58 | +response = client.chat.completions.create( |
| 59 | + model="gpt-4o", |
| 60 | + response_format={"type": "json_object"}, |
| 61 | + temperature=0.3, |
| 62 | + timeout=60 |
| 63 | +) |
| 64 | +``` |
| 65 | + |
| 66 | +### Job Scheduling Pattern |
| 67 | +Jobs use APScheduler with BackgroundScheduler + CronTrigger: |
| 68 | +```python |
| 69 | +scheduler.add_job(lambda: loop.create_task(post_hash_power(client)), |
| 70 | + CronTrigger.from_crontab('0 */3 * * *')) |
| 71 | +``` |
| 72 | + |
| 73 | +## Environment Variables Required |
| 74 | +```bash |
| 75 | +TELEGRAM_BOT_TOKEN # Required for bot functionality |
| 76 | +TELEGRAM_API_ID # From https://my.telegram.org/apps |
| 77 | +TELEGRAM_API_HASH # From https://my.telegram.org/apps |
| 78 | +TELEGRAM_PHONE_NUMBER # Optional - only for customer analysis (user client) |
| 79 | +OPENAI_API_KEY # For GPT-4o queries (loaded by OpenAI SDK) |
| 80 | +``` |
| 81 | + |
| 82 | +## Running & Debugging |
| 83 | + |
| 84 | +### Local Development |
| 85 | +```bash |
| 86 | +pip install -r requirements.txt |
| 87 | +python faqqer_bot.py # Starts bot with all jobs |
| 88 | +``` |
| 89 | + |
| 90 | +### Docker Deployment |
| 91 | +```bash |
| 92 | +docker build -t faqqer-bot . |
| 93 | +docker run -d --env-file .env faqqer-bot |
| 94 | +``` |
| 95 | + |
| 96 | +### Testing Individual Components |
| 97 | +```bash |
| 98 | +python blockchain_job.py # Tests hash rate fetching |
| 99 | +python faq_archiver.py # Tests message archiving |
| 100 | +``` |
| 101 | + |
| 102 | +## Common Gotchas |
| 103 | + |
| 104 | +1. **Version Updates** - Update both `FAQQER_VERSION` and `BUILD_DATE` constants in `faqqer_bot.py` |
| 105 | +2. **Group IDs** - Negative IDs indicate channels/supergroups, use `PeerChannel`; positive use `PeerChat` |
| 106 | +3. **FAQ Avoidance** - `avoidance_faq_prompt.txt` contains topics bot should refuse (e.g., "money value of XTM") |
| 107 | +4. **Session Files** - `.session` files persist user login - delete to re-authenticate |
| 108 | +5. **Token Limits** - Customer analysis truncates to ~25k tokens (100k chars) to avoid OpenAI limits |
| 109 | +6. **Hash Rates** - Manual trigger with `/faq hash rates` calls `post_hash_power()` immediately |
| 110 | + |
| 111 | +## File Organization |
| 112 | +- `faqs/` - FAQ content sources (`.txt` local, `.url` remote) |
| 113 | +- `archive/` - Archived chat history output |
| 114 | +- `media_files/` - Downloaded media from chats |
| 115 | +- `temp_analysis/` - Temporary analysis data |
| 116 | +- `*.session*` - Telethon session files (don't commit!) |
| 117 | + |
| 118 | +## Dependencies of Note |
| 119 | +- **Telethon** - Telegram client library (not python-telegram-bot!) |
| 120 | +- **OpenAI** - GPT-4o for FAQ answers and analysis |
| 121 | +- **APScheduler** - Background job scheduling |
| 122 | +- **python-dotenv** - Environment variable management |
0 commit comments