A production-ready Next.js + Python application deployed to AWS ECS Fargate with Terraform and GitHub Actions. The Next.js frontend provides cite-mode and answer-mode search interfaces, while the Python search service handles BM25 + vector retrieval with query expansion.
graph TB
User([User]) --> ALB
subgraph AWS Cloud
subgraph VPC
subgraph Public Subnets
ALB[Application Load Balancer]
NAT[NAT Gateway]
end
subgraph Private Subnets
subgraph ECSCluster[ECS Fargate Cluster]
NextJS[Next.js Service<br/>cite-mode · answer-mode]
Search[Search Service<br/>BM25 + vector retrieval]
end
end
ALB --> NextJS
NextJS -->|Service Discovery| Search
NextJS --> NAT
Search --> NAT
end
subgraph Supporting Services
ECR[ECR Repositories]
CW[CloudWatch Logs]
S3[(S3<br/>TF State · Documents · Eval)]
RDS[(RDS PostgreSQL<br/>Query Logs · Feedback)]
end
NextJS --> RDS
NextJS --> S3
Search --> S3
NextJS -.-> CW
Search -.-> CW
ECR -.-> NextJS
ECR -.-> Search
end
.
├── .github/workflows/
│   ├── deploy-qa.yml                  # QA deployment workflow
│   ├── deploy-production.yml          # Production deployment workflow
│   ├── pr-check.yml                   # Pull request validation
│   └── destroy.yml                    # Infrastructure teardown
├── docs/plans/                        # Design & implementation docs
├── evaluation/                        # Retrieval & synthesis eval framework
│   ├── run-answer-retrieval-eval.ts
│   ├── run-answer-synthesis-capture.ts
│   ├── run-answer-synthesis-llm-eval.ts
│   ├── run-cite-eval.ts
│   ├── calibrate-answer-thresholds.ts
│   ├── calibrate-cite-thresholds.ts
│   ├── diagnostics/                   # Eval diagnostic utilities
│   └── lib/                           # Shared eval helpers
├── search-service/                    # Python retrieval service
│   └── app/
│       ├── main.py                    # FastAPI entry (BM25 + vector)
│       ├── cache_system.py            # S3-backed caching
│       ├── config.py                  # Service configuration
│       ├── query_expansion.py         # Query expansion logic
│       └── routers/                   # API route handlers
├── src/
│   ├── app/                           # Next.js App Router
│   │   ├── api/                       # API routes
│   │   │   ├── answer/                # Answer-mode endpoints
│   │   │   ├── alignment/             # Alignment endpoints
│   │   │   ├── catalog/               # Catalog endpoints
│   │   │   ├── cite-mode-*/           # Cite-mode feedback & query logs
│   │   │   ├── answer-mode-*/         # Answer-mode feedback & query logs
│   │   │   ├── eval/                  # Evaluation endpoints
│   │   │   ├── health/                # Health check
│   │   │   ├── relates/               # Related questions
│   │   │   └── why/                   # Why endpoints
│   │   ├── components/                # React components
│   │   │   ├── AnswerMode/            # Answer-mode UI
│   │   │   ├── results/               # Results display
│   │   │   └── Footer/
│   │   ├── results/                   # Results page (cite-mode)
│   │   └── utils/                     # Client utilities
│   ├── config/                        # App configuration
│   ├── db/                            # TypeORM database layer
│   │   ├── entities/                  # DB entities (feedback, query logs)
│   │   ├── queries/                   # Query helpers
│   │   └── migrations/                # Database migrations
│   └── lib/                           # Server-side libraries
│       ├── llamacloud.ts              # LlamaCloud integration
│       ├── llamaindex-client.ts       # LlamaIndex client
│       ├── multi-query-strategy.ts    # Multi-query retrieval
│       ├── catalog-cache.ts           # Catalog caching
│       └── eval-storage.ts            # Eval data S3 storage
├── terraform/
│   ├── backend-setup/                 # Terraform state backend
│   ├── infrastructure/                # Main infrastructure (VPC, ECS, ALB, etc.)
│   └── environments/                  # Environment configs (qa, production)
├── Dockerfile                         # Next.js container
├── search-service/Dockerfile          # Search service container
└── package.json
- Node.js 24.x or later
- Python 3.12.x or later
- Docker
- Terraform 1.0+
- AWS CLI configured with appropriate credentials
- GitHub account
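
The version minimums above can be checked with a small script before starting. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes each tool prints a dotted version number somewhere in its `--version` output.

```python
import re

# Minimum (major, minor) versions, taken from the prerequisites list above.
MINIMUMS = {
    "node": (24, 0),
    "python": (3, 12),
    "terraform": (1, 0),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version number from a tool's version output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(tool: str, version_output: str) -> bool:
    """Return True if the reported version is at least the documented minimum."""
    return parse_version(version_output)[:2] >= MINIMUMS[tool]
```

For example, `meets_minimum("node", "v24.1.0")` passes, while `meets_minimum("python", "Python 3.11.9")` does not.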
git clone <repository-url>
cd askwri-app
npm install

Before deploying infrastructure, create the S3 bucket and DynamoDB table for Terraform state:
cd terraform/backend-setup
# Make the script executable
chmod +x setup.sh
# Run setup (uses default values)
./setup.sh
# Or customize with environment variables
AWS_REGION=us-east-2 PROJECT_NAME=askwri-app ./setup.sh

- Create a new GitHub repository
- Push this code to the repository
- Create the following branches:
  - main or production - Production deployments
  - qa - QA deployments
- Add GitHub variables for AWS permissions (Settings → Secrets and variables → Actions → Variables):
  - OIDC_ROLE - ARN from the AWS console for the role GitHubActionsOIDC

The AWS credentials need permissions for:
- ECR (create/push images)
- ECS (manage clusters, services, tasks)
- EC2 (VPC, subnets, security groups, NAT gateways)
- ELB (Application Load Balancers)
- IAM (create roles and policies)
- CloudWatch (logs)
- S3 (Terraform state)
- DynamoDB (Terraform locks)
Push to the appropriate branch to trigger deployment:
# Deploy to QA
git checkout -b qa
git push origin qa
# Deploy to Production
git checkout main
git push origin main

# Install dependencies
npm install
# Run development server
npm run dev
# Run tests
npm test
# Build for production
npm run build
# Start production server
npm start

# Build image
docker build -t askwri-app .
# Run container
docker run -p 3000:3000 askwri-app

- VPC CIDR: 10.0.0.0/16
- Resources (Next.js): 256 CPU units / 512 MB memory
- Resources (Python): 1024 CPU units / 4096 MB memory
- Desired count: 1 task
- Auto-scaling: 1-2 tasks (disabled)

- VPC CIDR: 10.1.0.0/16
- Resources (Next.js): 512 CPU units / 1024 MB memory
- Resources (Python): 1024 CPU units / 4096 MB memory
- Desired count: 1 task
- Auto-scaling: 1-10 tasks (disabled)
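
Fargate only accepts certain CPU/memory pairings, so it can help to sanity-check a sizing change before editing the tfvars. The sketch below is illustrative only: the table reflects the Linux task sizes up to 4 vCPU as listed in the AWS documentation at the time of writing (verify against the current docs), and the function names are hypothetical.

```python
# Valid Fargate task sizes: CPU units -> allowed memory values in MiB.
# 1024 CPU units correspond to 1 vCPU.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def vcpus(cpu_units: int) -> float:
    """Convert ECS CPU units to vCPUs."""
    return cpu_units / 1024


def is_valid_task_size(cpu_units: int, memory_mib: int) -> bool:
    """Return True if the CPU/memory pair is in the table above."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, [])
```

All three sizes listed above (256/512, 512/1024, and 1024/4096) pass this check; a pairing such as 256 CPU units with 4096 MB would not.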
| Workflow | Trigger | Description |
|---|---|---|
| deploy-qa.yml | Push to qa branch | Deploy to QA environment |
| deploy-production.yml | Push to main/production | Deploy to Production |
| pr-check.yml | Pull requests | Run tests and validate |
| destroy.yml | Manual | Tear down infrastructure |

- Go to Actions → Destroy Infrastructure
- Select the environment (qa or production)
- Type DESTROY to confirm
- Run workflow

cd terraform/backend-setup
chmod +x teardown.sh
./teardown.sh

Notes:
- This assumes that documents.csv has already been generated and a list of documents has also been compiled.
- The KPs are stored in AWS S3 and shared by both QA and production environments, so both environments will be affected (some parts may require service restarts).

- Update the /tmp/askWRI_docs directory with the new documents.csv and the new documents (this may also require removing old ones)
- Clear the local cache: rm -rf /tmp/askWRI_cache/*
- Ensure the local search-service/.env file has the same contents as the AWS Param Store entry for search-service. It is also worth verifying that the root-level .env has the same contents as ASKWRI_APP_ENV.
- In the search-service directory:
  - pip install -r requirements.txt
  - python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
  - This should rebuild the cache directory (/tmp/askWRI_cache)
  - Indexing time depends on the number and size of documents; see search-service/README.md for up-to-date details.
  - When finished, the Python code will output app.main - INFO - Background indexing complete
- Test changes by running npm run dev from the root directory
- Run the following aws s3 sync commands.
  - Note: this requires a properly configured AWS_PROFILE and a recent aws sso login
  - Note: the sync commands below do not remove files, so any file removals should be done separately, or delete everything with aws s3 rm --recursive s3://askwri-data/documents/
  - aws s3 sync /tmp/askWRI_docs s3://askwri-data/documents/
  - aws s3 rm --recursive s3://askwri-data/cache/
  - aws s3 sync /tmp/askWRI_cache s3://askwri-data/cache/
- Restart services (both search service and app) to pick up new files from AWS S3 (either by deploying or via AWS Console ECS service)
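
Because the sync should only run after the local cache has been rebuilt, it can be useful to wait for the completion log line programmatically. The following is a minimal sketch under that assumption; `wait_for_indexing` is a hypothetical helper, and the marker string is taken from the log line quoted in the steps above.

```python
from typing import Iterable

# Marker text from the log line the search service prints when indexing ends.
INDEXING_DONE_MARKER = "Background indexing complete"


def wait_for_indexing(lines: Iterable[str]) -> bool:
    """Consume log lines until the completion marker appears.

    Returns True as soon as the marker is seen, or False if the stream
    ends without it.
    """
    for line in lines:
        if INDEXING_DONE_MARKER in line:
            return True
    return False
```

One way to use it would be to start uvicorn with `subprocess.Popen(..., stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)` and pass `proc.stdout` as `lines`, then run the sync commands only if the function returns True.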
- CloudWatch Logs (Next.js): /ecs/askwri-app-{environment}
- CloudWatch Logs (Search Service): /ecs/askwri-app-{environment}-search-service
- Container Insights: Enabled on ECS cluster
- Health Check (Next.js): GET /api/health
- Service Discovery: Internal DNS via {service}.askwri-app-{environment}.local

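
The naming patterns above can be expanded for a given environment with a couple of helpers. This is a sketch only: the function names are hypothetical, and the service name passed to `discovery_hostname` (for example `search-service`) is an assumption, since the list only gives the `{service}` placeholder.

```python
PROJECT = "askwri-app"


def log_group(environment: str, search_service: bool = False) -> str:
    """Build the CloudWatch log group name for the app or the search service."""
    suffix = "-search-service" if search_service else ""
    return f"/ecs/{PROJECT}-{environment}{suffix}"


def discovery_hostname(service: str, environment: str) -> str:
    """Build the internal service-discovery DNS name for a service."""
    return f"{service}.{PROJECT}-{environment}.local"
```

For the qa environment, `log_group("qa")` gives `/ecs/askwri-app-qa`, and `log_group("qa", search_service=True)` gives `/ecs/askwri-app-qa-search-service`.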
- VPC with public/private subnet isolation
- NAT Gateways for private subnet internet access
- Security groups limiting traffic
- ECS managed tags propagated to ENIs and runtime resources
- S3 bucket versioning and encryption for Terraform state
- ECR image scanning on push
- Non-root container user
- HTTPS headers configured in Next.js
- Use FARGATE_SPOT for non-production workloads
- Auto-scaling based on CPU/Memory utilization
- ECR lifecycle policies to clean old images
- Consider reducing NAT Gateway count for non-production
- Update terraform/environments/{env}.tfvars:

app_environment_variables = {
  "MY_VAR" = "my-value"
}

- Redeploy

Environment secrets for the search service are stored in AWS Param Store and copied to GitHub secrets, so be sure to update both. The Param Store key is SEARCH_SERVICE_ENV, and its value is in JSON format. The GitHub secret uses the same key, and its value is expected to be copied and pasted from AWS Param Store.
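
Since the same values live in more than one place, a quick comparison can catch drift between the Param Store JSON and a local .env file. The sketch below assumes the JSON is a flat object of key/value pairs and that the .env file uses simple KEY=VALUE lines; both are assumptions, and the function names are hypothetical.

```python
import json


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def env_mismatches(param_json: str, dotenv_text: str) -> list[str]:
    """Return the sorted keys whose values differ or exist on only one side."""
    remote = {key: str(value) for key, value in json.loads(param_json).items()}
    local = parse_dotenv(dotenv_text)
    return sorted(
        key for key in remote.keys() | local.keys()
        if remote.get(key) != local.get(key)
    )
```

An empty result means the two sources agree. Only key names are returned, so secret values are not printed.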
Edit terraform/environments/{env}.tfvars:
container_cpu = 512 # 0.5 vCPU
container_memory = 1024 # 1 GB
desired_count = 3

- Deployment fails at ECS service stability
  - Check CloudWatch logs
  - Verify health check endpoint returns 200
  - Check security group rules
- Terraform state lock error
  - Wait for other deployments to complete
  - If stuck, manually release lock in DynamoDB
- Docker build fails
  - Ensure all dependencies are in package.json
  - Check that required files are not excluded by .dockerignore

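
For the ECS stability case, the health check can be reproduced by hand. The following is a minimal sketch using only the Python standard library; `is_healthy` is a hypothetical helper, and the base URL is whatever address the Next.js container or load balancer is reachable at.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/api/health responds with HTTP 200."""
    url = base_url.rstrip("/") + "/api/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection failures, timeouts, and non-2xx responses.
        return False
```

Against a locally running container, `is_healthy("http://localhost:3000")` should return True once the app has started.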
MIT