This document covers the technical architecture, implementation details, and development guidelines for the Pokedex Field Log Generator.
The application is built on Next.js 16 with a job-based processing architecture that handles long-running AI operations through a background job runner. All data is stored locally in SQLite with WAL mode for concurrent access.
Frontend (Client-Side)
- React 19 components with TypeScript
- Service layer for API communication
- Real-time job progress via Server-Sent Events (SSE)
- Local state management
Backend (Server-Side)
- Next.js API routes
- Background job runner with cooldown management
- Gemini AI integration (text generation and TTS)
- SQLite database with better-sqlite3
src/
├── app/
│ ├── api/
│ │ ├── audio/ # Audio log CRUD operations
│ │ ├── jobs/ # Job management + maintenance endpoints
│ │ ├── pokemon/ # Pokemon data caching + thumbnails
│ │ ├── prompts/ # Prompt customization
│ │ └── summaries/ # Summary CRUD operations
│ ├── admin/ # Admin page
│ ├── generator/ # Generator page
│ ├── library/ # Library page
│ ├── page.tsx # Main application interface
│ └── layout.tsx # Root layout
├── components/
│ ├── AdminView.tsx # Admin panel component
│ ├── GenerationView.tsx # Pokemon selection + generation UI
│ ├── Header.tsx # Application header
│ ├── HomeView.tsx # Landing page component
│ ├── LibraryView.tsx # Summary/audio library browser
│ ├── PokedexLibraryView.tsx # Pokedex-style library view
│ ├── ProcessingOverlay.tsx # Job progress overlay
│ ├── ResultsView.tsx # Generation results display
│ ├── ThemeProvider.tsx # Theme context provider
│ └── ToastProvider.tsx # Toast notification system
├── hooks/
│ ├── useJobStream.ts # Real-time job progress via SSE (EventSource)
│ ├── usePokemonData.ts # Pokemon data fetching + caching
│ └── useSavedData.ts # Saved summaries/audio state
├── services/
│ ├── jobsService.ts # Job management API client
│ ├── pokeService.ts # Pokemon data API client + variant detection
│ ├── promptService.ts # Prompt API client + defaults
│ ├── storageService.ts # Summary + audio log API client
│ ├── audioSplitter.ts # Client-side audio splitting
│ ├── audioSplitterNode.ts # Node-compatible audio splitting
│ └── audioUtils.ts # Audio playback utilities
├── lib/
│ ├── db/
│ │ ├── adapter.ts # Database adapter interface + types
│ │ ├── sqlite.ts # SQLite database adapter
│ │ └── mysql.ts # MySQL database adapter (placeholder)
│ └── server/
│   ├── jobRunner.ts # Background job processor
│   ├── jobEvents.ts # SSE event emitter (globalThis singleton)
│   ├── gemini.ts # Gemini AI client (text + TTS)
│   ├── pokemon.ts # Server-side Pokemon data fetching
│   ├── audioConverter.ts # PCM to MP3 conversion via ffmpeg
│   ├── config.ts # Server-side configuration constants
│   ├── api.ts # Standardized API response utilities
│   └── prompts.ts # Server-side prompt retrieval
├── utils/
│ └── pokemonUtils.ts # Pokemon display formatting utilities
├── types.ts # TypeScript type definitions
├── constants.ts # Application constants (voices, flavor text)
└── instrumentation.ts # Next.js instrumentation (starts job runner)
The application uses SQLite with five main tables:
Stores fetched Pokemon data from PokeAPI to minimize API calls.
CREATE TABLE pokemon_cache (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
height INTEGER NOT NULL,
weight INTEGER NOT NULL,
types TEXT NOT NULL, -- JSON array
habitat TEXT NOT NULL,
flavor_texts TEXT NOT NULL, -- JSON array
move_names TEXT NOT NULL, -- JSON array
image_png_path TEXT,
image_svg_path TEXT,
generation_id INTEGER NOT NULL,
region TEXT NOT NULL,
display_name TEXT, -- Formatted display name
species_id INTEGER, -- Base species ID
is_default INTEGER, -- 1 if default form
form_name TEXT, -- Variant form name
variant_category TEXT, -- 'default' | 'mega' | 'regional' | 'gmax' | 'other'
region_name TEXT, -- Region name for regional variants
cached_at TEXT NOT NULL
);

Stores generated field log narratives.
CREATE TABLE summaries (
id INTEGER PRIMARY KEY, -- Pokemon ID
name TEXT NOT NULL,
summary TEXT NOT NULL,
region TEXT NOT NULL,
generation_id INTEGER NOT NULL,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);

Stores generated audio narrations.
CREATE TABLE audio_logs (
id INTEGER PRIMARY KEY, -- Pokemon ID
name TEXT NOT NULL,
region TEXT NOT NULL,
generation_id INTEGER NOT NULL,
voice TEXT NOT NULL, -- Voice profile (Kore, Zephyr, etc.)
audio_base64 TEXT NOT NULL, -- Base64-encoded audio data
audio_format TEXT NOT NULL, -- "mp3"
bitrate INTEGER NOT NULL, -- MP3 bitrate in kbps (default: 128)
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);

Stores custom prompt overrides.
CREATE TABLE prompts (
type TEXT PRIMARY KEY, -- 'summary' or 'tts'
content TEXT NOT NULL,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);

Tracks background processing jobs.
CREATE TABLE jobs (
id TEXT PRIMARY KEY, -- UUID
status TEXT NOT NULL, -- 'queued' | 'running' | 'paused' | 'completed' | 'failed' | 'canceled'
stage TEXT NOT NULL, -- 'summary' | 'audio'
mode TEXT NOT NULL, -- 'FULL' | 'SUMMARY_ONLY' | 'AUDIO_ONLY'
generation_id INTEGER NOT NULL,
region TEXT NOT NULL,
voice TEXT NOT NULL,
total INTEGER NOT NULL,
current INTEGER NOT NULL,
message TEXT NOT NULL,
cooldown_until TEXT, -- ISO timestamp for rate-limit cooldown
error TEXT,
retry_count INTEGER DEFAULT 0, -- Number of retry attempts
pokemon_ids TEXT NOT NULL, -- JSON array of Pokemon IDs to process
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);

The job-based architecture handles long-running AI operations without blocking the UI. Real-time progress is delivered to clients via Server-Sent Events (SSE).
- Creation: Client creates job via POST /api/jobs
- Queuing: Job enters queue with status queued
- SSE Connect: Client opens EventSource to GET /api/jobs/{id}/stream
- Processing: Job runner picks up job, sets status to running, emits progress events
- Completion: Job finishes with a terminal SSE event (completed, failed, or canceled)
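The lifecycle above can be modeled on the client as a small reducer over incoming SSE events. This is a minimal sketch, not the project's actual hook: the `JobViewState` shape and `applyJobEvent` name are hypothetical, and treating `resumed` as a return to `running` is an assumption.

```typescript
// Hypothetical client-side model of the job lifecycle; not the project's real types.
type JobStatus = "queued" | "running" | "paused" | "completed" | "failed" | "canceled";

type JobEvent =
  | { type: "progress"; current: number; total: number; message: string }
  | { type: "paused" }
  | { type: "resumed" }
  | { type: "completed" }
  | { type: "failed"; error: string }
  | { type: "canceled" };

interface JobViewState {
  status: JobStatus;
  current: number;
  total: number;
  message: string;
  error?: string;
}

function isTerminal(status: JobStatus): boolean {
  return status === "completed" || status === "failed" || status === "canceled";
}

// Returns the next view state; events that arrive after a terminal state are ignored.
function applyJobEvent(state: JobViewState, event: JobEvent): JobViewState {
  if (isTerminal(state.status)) return state;
  switch (event.type) {
    case "progress":
      return { ...state, status: "running", current: event.current, total: event.total, message: event.message };
    case "paused":
      return { ...state, status: "paused" };
    case "resumed":
      return { ...state, status: "running" }; // assumption: resume goes straight back to running
    case "completed":
      return { ...state, status: "completed" };
    case "failed":
      return { ...state, status: "failed", error: event.error };
    case "canceled":
      return { ...state, status: "canceled" };
  }
}
```

Keeping the transition logic in a pure function makes it easy to unit test separately from the `EventSource` wiring.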
The application uses Server-Sent Events instead of HTTP polling for real-time job progress. This eliminates frequent database reads/writes and provides instant UI updates.
Architecture:
jobRunner → jobEvents.emit() → SSE endpoint → EventSource → useJobStream → React UI
                                     ↑
REST (pause/cancel/resume) ──────────┘
Key components:
- lib/server/jobEvents.ts: globalThis-based JobEventEmitter singleton shared across Next.js module re-evaluations and HMR
- api/jobs/[id]/stream/route.ts: SSE endpoint using ReadableStream. Sends initial state on connect, subscribes to live events, and sends 30-second keepalive comments
- hooks/useJobStream.ts: Client hook using the browser EventSource API with auto-reconnect
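To illustrate the globalThis singleton pattern mentioned for jobEvents.ts, here is a dependency-free sketch. The class body, the `__jobEvents` property name, and the `getJobEvents` accessor are all assumptions for illustration; the real module may be built differently (for example on Node's `EventEmitter`).

```typescript
type Listener = (payload: unknown) => void;

// Illustrative emitter keyed by job ID; not the project's actual implementation.
class JobEventEmitter {
  private listeners = new Map<string, Set<Listener>>();

  // Subscribe to one job's events; returns an unsubscribe function.
  subscribe(jobId: string, listener: Listener): () => void {
    const set = this.listeners.get(jobId) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(jobId, set);
    return () => {
      set.delete(listener);
      if (set.size === 0) this.listeners.delete(jobId);
    };
  }

  emit(jobId: string, payload: unknown): void {
    this.listeners.get(jobId)?.forEach((listener) => listener(payload));
  }
}

// Storing the instance on globalThis lets it survive module re-evaluation
// (for example during HMR), so producers and consumers share one emitter.
const globalStore = globalThis as unknown as { __jobEvents?: JobEventEmitter };

function getJobEvents(): JobEventEmitter {
  if (!globalStore.__jobEvents) {
    globalStore.__jobEvents = new JobEventEmitter();
  }
  return globalStore.__jobEvents;
}
```

Without the globalThis anchor, a re-evaluated module could create a second emitter, and the SSE endpoint would then listen on a different instance than the one the job runner emits to.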
SSE Event Types:
- progress: Stage, current/total counts, message, cooldown timestamp
- completed: Job finished successfully (includes generationId, pokemonIds, mode)
- failed: Job failed with error message and partial result metadata
- canceled: Job was canceled by user
- paused / resumed: Job pause state changed
Design decisions:
- DB writes still happen alongside SSE events for crash recovery persistence
- REST endpoints (pause/cancel/resume) also emit SSE events for instant UI feedback
- No heartbeat mechanism — SSE keepalive comments prevent proxy/browser timeouts
- sleepWithJobControl only reads the DB for pause/cancel status checks (no writes)
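For reference, the wire format behind these events is simple. The sketch below shows how a named SSE event and a keepalive comment could be serialized; the helper names are hypothetical, but the framing (`event:` and `data:` fields, a blank line as terminator, and lines starting with `:` as comments) follows the SSE format.

```typescript
// Serialize one named SSE event. JSON.stringify never emits raw newlines,
// so the payload always fits on a single data: line.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Lines beginning with ":" are comments; EventSource ignores them,
// but they keep proxies and browsers from timing out an idle stream.
function formatSseKeepalive(): string {
  return ": keepalive\n\n";
}
```

A stream endpoint would enqueue these strings (encoded as UTF-8) into its ReadableStream, sending the keepalive on a timer between real events.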
The background job runner (lib/server/jobRunner.ts) polls for queued jobs every second and processes them with stage-aware concurrency control.
Key Features:
- Stage-aware job claiming (only claims jobs matching available capacity)
- Concurrency limits: 3 concurrent summary jobs, 1 concurrent audio job
- Automatic cooldown management with jitter between API calls
- Pause/resume/cancel support with atomic state transitions
- Error handling and retry logic with exponential backoff
- User-friendly error formatting for API errors displayed on the results page
- SSE event emission at every progress point and terminal state
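The stage-aware claiming described above can be sketched as a pure selection function. The concurrency limits come from this document; the `QueuedJob` shape, the helper names, and the oldest-first ordering are assumptions for illustration.

```typescript
type Stage = "summary" | "audio";

// Limits as stated above: 3 concurrent summary jobs, 1 concurrent audio job.
const MAX_CONCURRENT: Record<Stage, number> = { summary: 3, audio: 1 };

interface QueuedJob {
  id: string;
  stage: Stage;
  createdAt: string; // ISO timestamp, so string comparison matches time order
}

function hasCapacity(stage: Stage, running: Record<Stage, number>): boolean {
  return running[stage] < MAX_CONCURRENT[stage];
}

// Pick the oldest queued job whose stage still has a free slot (ordering is an assumption).
function pickNextJob(queue: QueuedJob[], running: Record<Stage, number>): QueuedJob | undefined {
  return [...queue]
    .sort((a, b) => a.createdAt.localeCompare(b.createdAt))
    .find((job) => hasCapacity(job.stage, running));
}
```

Skipping jobs whose stage is saturated, rather than blocking on the head of the queue, is what lets summary jobs proceed while a long audio job occupies the single audio slot.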
Cooldown Periods:
- Summary generation: 15 seconds between Pokemon (±20% jitter)
- TTS generation: 15 seconds between Pokemon (±20% jitter)
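As a worked example of the ±20% jitter, the sketch below computes a cooldown between 12 and 18 seconds for a 15-second base. The function name, the uniform distribution, and the injectable random source are assumptions; the document only states the base and the jitter range.

```typescript
// Returns baseMs scaled by a factor in [1 - jitterRatio, 1 + jitterRatio).
// `random` is injectable so the result can be made deterministic in tests.
function cooldownWithJitter(
  baseMs: number,
  jitterRatio: number = 0.2,
  random: () => number = Math.random,
): number {
  const offset = (random() * 2 - 1) * jitterRatio; // maps [0, 1) to [-ratio, +ratio)
  return Math.round(baseMs * (1 + offset));
}
```

With `baseMs = 15000`, a random draw of 0 gives 12000 ms, 0.5 gives 15000 ms, and values approaching 1 approach 18000 ms. Jitter keeps concurrent jobs from hitting the API in lockstep.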
Jobs can be controlled through REST API endpoints. Each endpoint also emits an SSE event for instant client notification:
- POST /api/jobs/{id}/pause: Pause a running job
- POST /api/jobs/{id}/resume: Resume a paused job
- POST /api/jobs/{id}/cancel: Cancel a job
The application uses Google's Gemini AI for both text generation and text-to-speech.
Model: gemini-3-flash-preview
Configuration:
- Temperature: 0.85
- Structured JSON output via response schema
- Retry with exponential backoff (up to 4 retries)
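To make the retry policy concrete, the following sketch computes an exponential backoff schedule for up to 4 retries. The 1-second base delay and 30-second cap are assumptions chosen for illustration; the document specifies only the retry count.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Full schedule of waits for a given retry budget.
function retrySchedule(maxRetries: number = 4, baseMs: number = 1000): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => backoffDelayMs(attempt, baseMs));
}
```

Under these assumed values, four retries wait 1 s, 2 s, 4 s, and 8 s, so a request is abandoned after roughly 15 seconds of cumulative waiting.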
Prompt Structure:
[System Instructions]
You are a field researcher documenting Pokemon encounters...
[Pokemon Context]
ID: {id}
Name: {name}
Region: {region}
Types: {types}
Physicals: {height}m, {weight}kg
Habitat: {habitat}
Lore Context: {flavor_texts}
Available Moves: {moves}
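The Pokemon context portion of that template could be assembled as below. This is a sketch under assumptions: the `PokemonContext` interface and `buildPokemonContext` name are hypothetical, and the function expects height and weight already converted to metres and kilograms (PokeAPI reports decimetres and hectograms).

```typescript
// Hypothetical input shape; field names are illustrative, not the project's types.
interface PokemonContext {
  id: number;
  name: string;
  region: string;
  types: string[];
  heightM: number;  // metres
  weightKg: number; // kilograms
  habitat: string;
  flavorTexts: string[];
  moves: string[];
}

// Renders the [Pokemon Context] block, one field per line, in the template's order.
function buildPokemonContext(p: PokemonContext): string {
  return [
    `ID: ${p.id}`,
    `Name: ${p.name}`,
    `Region: ${p.region}`,
    `Types: ${p.types.join(", ")}`,
    `Physicals: ${p.heightM}m, ${p.weightKg}kg`,
    `Habitat: ${p.habitat}`,
    `Lore Context: ${p.flavorTexts.join(" ")}`,
    `Available Moves: ${p.moves.join(", ")}`,
  ].join("\n");
}
```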
Primary Model: gemini-2.5-pro-preview-tts (50 RPD) - Requires paid API key
Fallback Model: gemini-2.5-flash-preview-tts (10 RPD) - Only 10 RPD available on free tier
API Key Requirements:
- Free Tier: Only gemini-2.5-flash-preview-tts is available
- Paid Tier: Required for gemini-2.5-pro-preview-tts access
- Note: As of March 2026, the Pro TTS model remains in preview but requires billing setup
Configuration:
- Output: PCM 16-bit signed little-endian at 24000 Hz, converted to MP3 (128 kbps) via ffmpeg
- Voice profiles: Kore, Zephyr, Charon, Puck, Fenrir
- Strategy: Pro-first with Flash fallback. Max 4 API calls per Pokemon (1+1 retry on Pro, 1+1 retry on Flash)
- Daily quota exhaustion triggers immediate fallback (no retries)
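The PCM-to-MP3 step can be expressed as an ffmpeg argument list that reads raw PCM on stdin and writes MP3 to stdout. This sketch only builds the arguments; the helper name is hypothetical, and mono output (`-ac 1`) is an assumption, since the document states the sample format, rate, and bitrate but not the channel count.

```typescript
// Builds ffmpeg arguments for raw PCM (stdin) to MP3 (stdout) conversion.
function buildFfmpegArgs(
  sampleRateHz: number = 24000,
  channels: number = 1, // assumption: mono
  bitrateKbps: number = 128,
): string[] {
  return [
    "-f", "s16le",               // input format: 16-bit signed little-endian PCM
    "-ar", String(sampleRateHz), // input sample rate
    "-ac", String(channels),     // input channel count
    "-i", "pipe:0",              // read input from stdin
    "-codec:a", "libmp3lame",    // encode with the LAME MP3 encoder
    "-b:a", `${bitrateKbps}k`,   // target audio bitrate
    "-f", "mp3",                 // output container/format
    "pipe:1",                    // write output to stdout
  ];
}
```

Raw PCM has no header, so the input format, rate, and channel count must be given explicitly before `-i`; otherwise ffmpeg cannot interpret the byte stream.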
Batch-Level Quota Tracking:
Within a single batch, once a model's daily quota is exhausted it is skipped for all remaining items. This avoids wasting API calls on a model known to be maxed out.
- resetBatchQuotaState() is called at the start of every new audio batch
- If Pro is exhausted mid-batch, all remaining items use Flash directly
- If both Pro and Flash are exhausted, TtsQuotaExhaustedError is thrown
- The job runner catches this error, saves partial progress, and displays a user-friendly message on the results page
- Quotas reset at midnight Pacific Time
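A minimal sketch of the batch-level tracking described above follows. `resetBatchQuotaState` and `TtsQuotaExhaustedError` are named in this document, but their bodies here are stand-ins; `markExhausted` and `nextAvailableModel` are hypothetical helpers added for illustration.

```typescript
type TtsModel = "gemini-2.5-pro-preview-tts" | "gemini-2.5-flash-preview-tts";

// Pro is tried first, Flash is the fallback.
const MODEL_ORDER: TtsModel[] = ["gemini-2.5-pro-preview-tts", "gemini-2.5-flash-preview-tts"];

class TtsQuotaExhaustedError extends Error {
  constructor() {
    super("Daily TTS quota exhausted for all models");
    this.name = "TtsQuotaExhaustedError";
  }
}

// Models whose daily quota was hit during the current batch.
const exhaustedModels = new Set<TtsModel>();

function resetBatchQuotaState(): void {
  exhaustedModels.clear();
}

// Hypothetical helper: record that a model returned a daily-quota error.
function markExhausted(model: TtsModel): void {
  exhaustedModels.add(model);
}

// Hypothetical helper: first model not yet exhausted, or throw if none remain.
function nextAvailableModel(): TtsModel {
  const model = MODEL_ORDER.find((candidate) => !exhaustedModels.has(candidate));
  if (!model) throw new TtsQuotaExhaustedError();
  return model;
}
```

Because the exhausted set persists across items in a batch, the first quota failure on Pro costs one API call, and every later item goes straight to Flash.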
Director's Notes: The TTS prompt includes detailed director's notes for voice styling:
- Style: Nature documentary narration
- Tone: Serene, melodic, intimate
- Delivery: Flat, authoritative cadence
- Pacing: Slow, deliberate, measured
- Client requests Pokemon data via GET /api/pokemon/{id}
- Server checks cache in database
- If not cached:
  - Fetch from PokeAPI (/api/v2/pokemon/{id})
  - Fetch species data (/api/v2/pokemon-species/{id})
  - Download and save sprite images
  - Store in database cache
- Return processed data to client
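The cache-first flow above can be sketched with injected dependencies. Everything named here (`PokeApiClient`, `PokemonCache`, `getPokemonCached`, the reduced `CachedPokemon` shape, and the "unknown" habitat fallback) is hypothetical; sprite download is omitted for brevity.

```typescript
interface PokemonResponse { name: string; }
interface SpeciesResponse { habitat: { name: string } | null; }

// Hypothetical client wrapping the two PokeAPI endpoints listed above.
interface PokeApiClient {
  pokemon(id: number): Promise<PokemonResponse>; // /api/v2/pokemon/{id}
  species(id: number): Promise<SpeciesResponse>; // /api/v2/pokemon-species/{id}
}

interface CachedPokemon { id: number; name: string; habitat: string; cachedAt: string; }

interface PokemonCache {
  get(id: number): CachedPokemon | undefined;
  set(entry: CachedPokemon): void;
}

async function getPokemonCached(
  id: number,
  cache: PokemonCache,
  api: PokeApiClient,
): Promise<CachedPokemon> {
  const hit = cache.get(id);
  if (hit) return hit; // cached data never triggers an external call

  const pokemon = await api.pokemon(id);
  const species = await api.species(id);
  const entry: CachedPokemon = {
    id,
    name: pokemon.name,
    habitat: species.habitat?.name ?? "unknown", // assumed fallback when habitat is null
    cachedAt: new Date().toISOString(),
  };
  cache.set(entry);
  return entry;
}
```

Injecting the cache and API client keeps the function testable without a database or network access.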
- Client creates job with mode FULL or SUMMARY_ONLY
- Job runner fetches Pokemon data
- Constructs prompt with Pokemon context
- Calls Gemini API for text generation
- Saves summary to database
- Updates job progress
- Client creates job with mode FULL or AUDIO_ONLY
- Job runner resets batch quota state and fetches existing summary for each Pokemon
- Constructs TTS prompt with director's notes
- Calls Gemini TTS API (one call per Pokemon, Pro-first with Flash fallback, batch-level quota tracking)
- Converts PCM response to MP3 via ffmpeg
- Saves audio to database as base64-encoded MP3
- Enforces 15-second cooldown between Pokemon
- Emits SSE progress events in real-time
- On completion or failure, emits terminal SSE event with result metadata
When a job fails, the system:
- Converts raw API error messages into user-friendly descriptions via formatUserFriendlyError()
- Stores the friendly error in the database
- Emits a failed SSE event with the error message, generationId, pokemonIds, and mode
- The client builds partial results from whatever was successfully generated
- Redirects to the ResultsView with an error banner and any partial results
Recognized error patterns:
- Daily API quota exhaustion (429 + PerDay indicators)
- Per-minute rate limits (429 + PerMinute indicators)
- Service overload (503)
- Internal server errors (500)
- Missing API key configuration
- TTS quota exhaustion across both models
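The recognized patterns above suggest a simple classifier over the raw error text. The sketch below is not `formatUserFriendlyError()` itself: `classifyApiError`, the `ErrorKind` labels, and the exact regular expressions are assumptions about how such matching might work.

```typescript
type ErrorKind =
  | "tts_quota_exhausted"
  | "daily_quota"
  | "rate_limit"
  | "service_overloaded"
  | "internal_error"
  | "missing_api_key"
  | "unknown";

// Maps a raw error string to a coarse category; the first matching rule wins.
function classifyApiError(raw: string): ErrorKind {
  if (/TtsQuotaExhaustedError/.test(raw)) return "tts_quota_exhausted";
  const is429 = /\b429\b/.test(raw);
  if (is429 && /per[ _-]?day/i.test(raw)) return "daily_quota";
  if (is429 && /per[ _-]?minute/i.test(raw)) return "rate_limit";
  if (/\b503\b/.test(raw)) return "service_overloaded";
  if (/\b500\b/.test(raw)) return "internal_error";
  if (/api[ _-]?key/i.test(raw)) return "missing_api_key";
  return "unknown";
}
```

Checking the daily indicator before the per-minute one matters: a daily quota error should end the batch early, whereas a per-minute limit only calls for a cooldown.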
GEMINI_API_KEY=your_gemini_api_key_here
DB_TYPE=sqlite # Database type (sqlite or mysql)

# Install dependencies
pnpm install
# Create environment file
cp .env.example .env.local
# Add your Gemini API key
echo "GEMINI_API_KEY=your_key_here" >> .env.local
# Run development server
pnpm dev

pnpm dev # Start development server
pnpm build # Build for production
pnpm start # Start production server
pnpm lint # Run ESLint
pnpm lint:fix # Fix ESLint issues
pnpm type-check # Run TypeScript type checking
pnpm format # Format code with Prettier
pnpm format:check # Check code formatting
pnpm check # Run all checks (type-check, lint, format)
pnpm fix # Fix all auto-fixable issues

The project uses:
- ESLint - Code linting with Next.js config
- Prettier - Code formatting with Tailwind CSS plugin
- TypeScript - Strict type checking
- Pokemon data cached indefinitely in database
- Sprite images saved to /public/pokemon/ directory
- No external API calls for cached data
- Summary generation: 15-second cooldown between Pokemon (±20% jitter)
- TTS generation: 15-second cooldown between Pokemon (±20% jitter)
- Concurrency: Up to 3 summary jobs, 1 audio job running simultaneously
- Cooldowns enforced server-side in job runner
- SQLite WAL mode for concurrent reads
- Indexed primary keys for fast lookups
- JSON columns for array data storage
- Gemini API key stored in .env.local (server-side only)
- Never exposed to client-side code
- All AI operations performed server-side
- Pokemon IDs validated against known ranges
- Job parameters validated before processing
- SQL injection protection via parameterized queries
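As one way to enforce the server-side key handling and parameter validation listed above, configuration can be validated once at startup. This is a sketch under assumptions: `loadServerConfig` and `ServerConfig` are hypothetical names, and defaulting `DB_TYPE` to sqlite when unset is an assumed behavior.

```typescript
type DbType = "sqlite" | "mysql";

interface ServerConfig {
  geminiApiKey: string;
  dbType: DbType;
}

// Validates environment variables; throws early with a clear message on bad input.
function loadServerConfig(env: Record<string, string | undefined>): ServerConfig {
  const geminiApiKey = env["GEMINI_API_KEY"]?.trim();
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is not set");
  }
  const dbType = env["DB_TYPE"] ?? "sqlite"; // assumption: sqlite when unset
  if (dbType !== "sqlite" && dbType !== "mysql") {
    throw new Error(`Unsupported DB_TYPE: ${dbType}`);
  }
  return { geminiApiKey, dbType };
}
```

In a Next.js server module this would be called with `process.env`; keeping it out of any file imported by client components ensures the key is never bundled for the browser.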
pnpm build
pnpm start

Ensure .env.local contains your Gemini API key in production.
The SQLite database file (pokemon_data.db) is created automatically on first run. Ensure the application has write permissions to the project directory.
Job stuck in "running" state:
- Check job runner is active
- Verify Gemini API key is valid
- Check for rate limit errors in logs
Missing Pokemon images:
- Ensure /public/pokemon/ directory is writable
- Check PokeAPI availability
- Verify sprite URLs are accessible
Audio playback issues:
- Confirm browser supports MP3 format
- Check audio data is properly base64 encoded
- Verify ffmpeg-static is installed correctly
Potential areas for expansion:
- MySQL database adapter for multi-user deployments
- Custom voice profile training
- Advanced prompt engineering interface
- Batch export to multiple formats
- Integration with additional Pokemon data sources