ivaylo-andonov/Blockchain-Indexer

Blockchain Backend Service

Production-grade backend that covers authentication, GitHub OAuth, and Ethereum smart-contract event indexing. Built with TypeScript, Express, PostgreSQL, and Drizzle ORM; packaged for containerized deployments and tested end to end.

At a Glance

  • Authentication: Email/password with JWT access + refresh tokens, password complexity rules, secure token rotation.
  • OAuth: GitHub OAuth 2.0 flow with resilient retry logic, account linking, and login counters.
  • Blockchain Indexer: Ethers-powered event ingestion with batching, rate-limit backoff, and persistence via Drizzle.
  • Testing: 80%+ coverage target, fully automated Docker-backed integration suites, DI-driven testability.
  • Ops: Health/readiness endpoints, graceful shutdown, structured logging, Docker-ready.

Feature Overview

Authentication & User Management

  • Registration, login, logout, refresh-token rotation
  • Password hashing via bcrypt with configurable cost
  • Profile updates and password change with validation
  • Comprehensive error reporting using typed error classes
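
Refresh-token rotation means each refresh token can be exchanged only once: the presented token is invalidated and a new one is issued in its place. The sketch below illustrates that idea with an in-memory store. `RefreshTokenStore` and its methods are hypothetical names chosen for illustration, not the repository's actual API, and the token generator is injected so that a real implementation could supply signed JWTs or values from a cryptographically secure source.

```typescript
// Hypothetical in-memory store; a real service would persist tokens
// (or their hashes) in PostgreSQL rather than in process memory.
class RefreshTokenStore {
  // Maps each currently valid refresh token to the user it was issued for.
  private readonly active = new Map<string, number>();
  private readonly generateToken: () => string;

  constructor(generateToken: () => string) {
    this.generateToken = generateToken;
  }

  /** Issues a fresh refresh token for the given user. */
  issue(userId: number): string {
    const token = this.generateToken();
    this.active.set(token, userId);
    return token;
  }

  /** Exchanges a valid token for a new one and invalidates the old token. */
  rotate(presentedToken: string): string {
    const userId = this.active.get(presentedToken);
    if (userId === undefined) {
      throw new Error("refresh token is unknown or has already been used");
    }
    this.active.delete(presentedToken);
    return this.issue(userId);
  }

  /** Invalidates a token, for example on logout. */
  revoke(token: string): void {
    this.active.delete(token);
  }
}
```

Because the old token is deleted before the new one is issued, replaying a previously rotated token fails, which is the property rotation is meant to provide.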

GitHub OAuth

  • Full authorization-code flow with axios
  • Robust handling of missing emails, rate limiting, and timeouts
  • Automatic linking to existing local users (email match + DI-based updates)
  • Avatar and metadata synchronization with GitHub profile

Ethereum Event Indexer

  • Indexes contract events via ethers.js
  • Persists contract_events and contract_indexer_state through Drizzle
  • Supports pause/resume, block-range filtering, and pagination
  • Built-in retry/backoff logic for RPC rate limiting
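
As a rough illustration of retry with exponential backoff for rate-limited RPC calls, the sketch below doubles the wait after each failed attempt up to a cap. The helper names (`backoffDelayMs`, `isRateLimitError`, `withBackoff`) and the way rate-limit errors are recognised (HTTP 429 or a "rate limit" message) are assumptions made for this example; the repository's own logic may differ.

```typescript
// Hypothetical option names; not taken from the repository.
interface BackoffOptions {
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound on any single delay
  maxAttempts: number; // total attempts, including the first call
}

/** Delay before retry number `retry` (1-based): doubles each time, capped at maxDelayMs. */
function backoffDelayMs(retry: number, opts: BackoffOptions): number {
  const exponential = opts.baseDelayMs * Math.pow(2, retry - 1);
  return Math.min(exponential, opts.maxDelayMs);
}

/** Assumption: rate limiting shows up as status 429 or a "rate limit" message. */
function isRateLimitError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: number; message?: string };
  return e.status === 429 || /rate limit/i.test(e.message ?? "");
}

/** Retries `fn` on rate-limit errors only; any other error is rethrown immediately. */
async function withBackoff<T>(
  fn: () => Promise<T>,
  opts: BackoffOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRateLimitError(err) || attempt >= opts.maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

Injecting `sleep` keeps the helper testable without real timers, in line with the dependency-injection approach used elsewhere in the project.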

Operational Excellence

  • Graceful shutdown that drains HTTP traffic, stops indexers, and releases DB pools
  • Detailed health/readiness endpoints with DB connectivity checks
  • Structured logging with Winston and environment-aware transports
  • Swagger/OpenAPI documentation auto-generated at /api-docs

Tech Stack

| Concern       | Technology                                                                  |
| ------------- | --------------------------------------------------------------------------- |
| Runtime       | Node.js 18, TypeScript 5                                                    |
| Web Framework | Express 4                                                                   |
| Database      | PostgreSQL 16 + Drizzle ORM                                                 |
| Blockchain    | ethers.js 6                                                                 |
| Auth          | JWT (jsonwebtoken), bcrypt                                                  |
| Validation    | Zod                                                                         |
| Testing       | Jest, ts-jest, Supertest (controller/service composition), Docker for DB    |
| Tooling       | pnpm, eslint, prettier, husky                                               |
| Packaging     | Docker + docker compose (dev/prod profiles)                                 |

Local Setup

  1. Install prerequisites (Node 18, pnpm 8, Docker).
  2. cp .env.example .env and populate secrets (JWT, refresh token, GitHub OAuth, Ethereum config).
  3. pnpm install
  4. Start the stack:
    • pnpm docker:dev:build && pnpm docker:dev (recommended)
    • or run migrations locally with pnpm db:migrate, then start the server via pnpm dev
  5. Confirm readiness: curl http://localhost:3000/api/v1/health
  6. Swagger UI: http://localhost:3000/api-docs

See QUICKSTART.md for the five-minute walkthrough.


Running the Application

Development (Docker-first)

pnpm docker:dev:build   # first run
pnpm docker:dev         # start backend + Postgres
pnpm docker:dev:logs    # follow backend logs

Development (local Node)

pnpm db:migrate
pnpm dev

Production Build

pnpm build
pnpm start

Docker images (dev/prod) live under docker/. See docs/DEPLOYMENT.md for cloud deployment options and hardening tips.


Testing Strategy

  • Unit tests (tests/unit) – pure logic with dependency injection and mocks.
  • Integration tests (tests/integration) – real models/services/controllers against a disposable Docker Postgres instance (blockchain-backend-test-db).
  • Global setup/teardown (tests/setupDb.ts) spins up the containers, applies Drizzle migrations, truncates tables between cases, and tears down resources after the suite.

Commands:

pnpm test             # full suite, coverage, Docker-backed DB
pnpm test:unit        # fast unit suites
pnpm test:integration # integration suites only
pnpm test:coverage    # coverage report (lcov)

Coverage artifacts: coverage/lcov-report/index.html.

Refer to docs/TESTING_STRATEGY.md for a deep dive into the workflow, tooling, and troubleshooting.


Project Structure

src/
├── config/             # Environment + runtime configuration
├── controllers/        # Express controllers (Auth, User, Eth)
├── database/           # Drizzle schema + client management
├── models/             # Data access abstractions
├── routes/             # Express routers + Swagger docs
├── services/           # Business logic (auth, GitHub, event indexer, etc.)
├── utils/              # Logger, graceful shutdown, JWT, errors, etc.
├── validators/         # Zod schemas
└── server.ts           # Entry point (registers middlewares + routes)

tests/
├── integration/        # Controller/service flows with real DB
├── unit/               # Mocked fast tests
└── setupDb.ts          # Docker-based test DB lifecycle

docs/                   # Deployment, logging, testing, health checks, etc.
docker/                 # Dev/prod Compose files and Dockerfiles

Dependency injection is the norm: services receive models and loggers explicitly, which enables deterministic tests and future swap-outs.
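
To make the dependency-injection idea concrete, here is a minimal sketch of a service that receives its model and logger through the constructor, along with an in-memory fake that a unit test could pass in. All names (`UserModel`, `AppLogger`, `LoginService`, `InMemoryUserModel`) are hypothetical and the methods are synchronous for brevity; the real models are presumably asynchronous and backed by Drizzle.

```typescript
interface UserRecord {
  id: number;
  email: string;
  loginCount: number;
}

// The service depends on these interfaces, never on concrete classes.
interface UserModel {
  findByEmail(email: string): UserRecord | undefined;
  save(user: UserRecord): void;
}

interface AppLogger {
  info(message: string): void;
}

class LoginService {
  private readonly users: UserModel;
  private readonly logger: AppLogger;

  constructor(users: UserModel, logger: AppLogger) {
    this.users = users;
    this.logger = logger;
  }

  /** Increments and returns the login counter for the given user. */
  recordLogin(email: string): number {
    const user = this.users.findByEmail(email);
    if (!user) throw new Error(`no user with email ${email}`);
    const updated = { ...user, loginCount: user.loginCount + 1 };
    this.users.save(updated);
    this.logger.info(`login #${updated.loginCount} for ${email}`);
    return updated.loginCount;
  }
}

// Test double: satisfies UserModel without touching a database.
class InMemoryUserModel implements UserModel {
  private readonly byEmail = new Map<string, UserRecord>();

  findByEmail(email: string): UserRecord | undefined {
    return this.byEmail.get(email);
  }

  save(user: UserRecord): void {
    this.byEmail.set(user.email, user);
  }
}
```

A unit test can construct `LoginService` with `InMemoryUserModel` and a logger that pushes messages into an array, then assert on both the returned counter and the captured log lines.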


Architecture Notes

  • Layered approach: Routes → Controllers → Services → Models → Drizzle.
  • Resilience: The event indexer includes exponential backoff, circuit-breaker-style rate-limit detection, and a dependency-injected ethers provider.
  • Graceful shutdown: GracefulShutdownHandler coordinates HTTP server close, indexer stop, and pool cleanup on SIGTERM, SIGINT, and uncaught errors.
  • Logging: Winston logger with leveled transports, enriched metadata, and environment-aware formatting.
  • Validation: Zod schemas enforce contract at the routing edge; errors propagate through central error middleware.

Additional details in docs/LOGGING.md, docs/HEALTH_CHECKS.md, and docs/DEPLOYMENT.md.


Security Posture

  • Strong password policy + bcrypt hashing
  • JWT access tokens with short TTL + refresh tokens with rotation
  • OAuth account-linking safeguards (email + GitHub ID)
  • Helmet, scoped CORS, and rate limiting
  • Structured error responses without leaking internals
  • DB access strictly via parameterised Drizzle queries

See docs/DEPLOYMENT.md for production hardening guidelines.


Environment Variables

| Variable                                                       | Required | Default               | Description                   |
| -------------------------------------------------------------- | -------- | --------------------- | ----------------------------- |
| NODE_ENV                                                       |          | development           | Runtime environment           |
| PORT                                                           |          | 3000                  | HTTP port                     |
| DB_HOST / DB_PORT / DB_NAME / DB_USER / DB_PASSWORD            |          |                       | PostgreSQL configuration      |
| JWT_SECRET                                                     |          |                       | 32+ char access token secret  |
| JWT_EXPIRES_IN                                                 |          | 1h                    | Access token TTL              |
| REFRESH_TOKEN_SECRET                                           |          |                       | 32+ char refresh secret       |
| REFRESH_TOKEN_EXPIRES_IN                                       |          | 7d                    | Refresh token TTL             |
| GITHUB_CLIENT_ID / GITHUB_CLIENT_SECRET / GITHUB_CALLBACK_URL  |          |                       | GitHub OAuth config           |
| ETH_RPC_URL, ETH_INDEXER_*                                     |          |                       | Event indexer tuning          |
| LOG_LEVEL                                                      |          | info                  | Logger verbosity              |
| CORS_ORIGIN                                                    |          | http://localhost:3000 | Allowed origin                |

docs/DEPLOYMENT.md elaborates on environment-specific overrides and best practices.
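
As an illustration of how the secret-length rule and the defaults listed above could be checked at startup, here is a small sketch in plain TypeScript. The function `loadAuthConfig` and the `AuthConfig` shape are hypothetical; the project lists Zod for validation, so its real configuration loader likely looks different. The environment is passed in as a plain object so the function can be tested without touching the process environment.

```typescript
// Hypothetical config shape covering only the auth-related variables and PORT.
interface AuthConfig {
  port: number;
  jwtSecret: string;
  jwtExpiresIn: string;
  refreshTokenSecret: string;
  refreshTokenExpiresIn: string;
}

type EnvSource = Record<string, string | undefined>;

function loadAuthConfig(env: EnvSource): AuthConfig {
  // Secrets have no default and must be at least 32 characters long.
  const requireSecret = (key: string): string => {
    const value = env[key];
    if (value === undefined || value.length < 32) {
      throw new Error(`${key} must be set and at least 32 characters long`);
    }
    return value;
  };

  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PORT must be a positive integer");
  }

  return {
    port,
    jwtSecret: requireSecret("JWT_SECRET"),
    jwtExpiresIn: env["JWT_EXPIRES_IN"] ?? "1h",
    refreshTokenSecret: requireSecret("REFRESH_TOKEN_SECRET"),
    refreshTokenExpiresIn: env["REFRESH_TOKEN_EXPIRES_IN"] ?? "7d",
  };
}
```

Failing fast on a missing or short secret surfaces misconfiguration at boot rather than on the first login request.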


Deployment

  • Docker Compose manifests for dev (docker/docker-compose.dev.yml) and prod (docker/docker-compose.yml).
  • App builds TypeScript (pnpm build) and starts via pnpm start.
  • Production container runs migrations (pnpm db:migrate) before launching the server.
  • CI pipeline (GitHub Actions) runs lint, tests, and coverage.
  • Detailed cloud deployment examples and hardening checklists: docs/DEPLOYMENT.md.

Future Improvements & Optimisations

  • Indexer throughput: Introduce adaptive batch sizing driven by observed RPC latency and block gaps.
  • Caching layer: Add Redis-backed caching for frequently queried contract events and rate-limit tokens.
  • Observability: Ship OpenTelemetry tracing + metrics exporters, integrate with Grafana dashboards by default.
  • Secrets management: Replace .env handling with first-class integration to Vault/AWS Secrets Manager.
  • Horizontal scaling: Externalise the event indexer into a dedicated worker service to decouple from the HTTP process.
  • Developer ergonomics: Provide ready-to-use fixture seeding and a pnpm setup command for one-touch onboarding.
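
One possible shape for the adaptive batch sizing mentioned above is an additive-increase, multiplicative-decrease rule: grow the block range modestly while the RPC stays fast, and halve it when a request exceeds a latency target. This is only a sketch of one approach; `nextBatchSize`, `BatchTuning`, and the 25% growth step are assumptions for illustration and nothing here is implemented in the repository.

```typescript
// Hypothetical tuning knobs for the number of blocks requested per RPC call.
interface BatchTuning {
  minSize: number;         // never request fewer blocks than this
  maxSize: number;         // never request more blocks than this
  targetLatencyMs: number; // latency above this is treated as "too slow"
}

/** Returns the batch size to use for the next request. */
function nextBatchSize(
  current: number,
  observedLatencyMs: number,
  tuning: BatchTuning,
): number {
  const proposed =
    observedLatencyMs > tuning.targetLatencyMs
      ? Math.floor(current / 2)             // back off quickly when slow
      : current + Math.ceil(current * 0.25); // grow gradually when fast
  return Math.min(tuning.maxSize, Math.max(tuning.minSize, proposed));
}
```

Halving on slow responses reacts quickly to provider pressure, while the smaller growth step avoids oscillating between extremes.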

Contributions toward these initiatives are welcome; see CONTRIBUTING.md.


Contributing

We follow Conventional Commits, enforce lint/test/format gates, and expect documentation updates alongside feature work. Review CONTRIBUTING.md for the full workflow. Pull requests that keep quality high and maintainability front and centre are always appreciated.


Frequently Asked Questions

How would you scale the event indexer to handle multiple contracts?

  • Run dedicated worker instances per contract or contract group, coordinated via a queue (e.g., BullMQ/RabbitMQ) so indexing tasks can be scheduled and retried independently.
  • Partition state storage (contract_indexer_state, contract_events) by contract address and event signature to keep lookups targeted.
  • Use horizontal scaling: each worker process reads configuration (list of contracts) from a central store and only subscribes to its assigned subset.
  • Introduce rate-limit-aware batching with circuit breakers. The current exponential backoff can be extended with central throttling if many workers share the same RPC provider.
  • Add observability (metrics, traces) per contract to identify hot spots and adjust capacity dynamically.
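
The "assigned subset" idea can be sketched as a deterministic mapping from contract address to worker index, so that every worker derives its own share from the same central list without further coordination. The hash function and the names `workerIndexFor` and `contractsForWorker` are illustrative assumptions; a queue-based scheduler, as described above, would be an alternative to this static partitioning.

```typescript
/** Small deterministic string hash (unsigned 32-bit); not cryptographic. */
function stableHash(value: string): number {
  let hash = 0;
  for (let i = 0; i < value.length; i++) {
    hash = (hash * 31 + value.charCodeAt(i)) >>> 0;
  }
  return hash;
}

/** Maps a contract address to a worker index in [0, workerCount). */
function workerIndexFor(contractAddress: string, workerCount: number): number {
  if (!Number.isInteger(workerCount) || workerCount <= 0) {
    throw new Error("workerCount must be a positive integer");
  }
  // Lower-casing makes checksummed and plain hex addresses map identically.
  return stableHash(contractAddress.toLowerCase()) % workerCount;
}

/** Returns the contracts that the given worker is responsible for. */
function contractsForWorker(
  allContracts: string[],
  workerIndex: number,
  workerCount: number,
): string[] {
  return allContracts.filter(
    (address) => workerIndexFor(address, workerCount) === workerIndex,
  );
}
```

With this scheme each contract belongs to exactly one worker, but changing `workerCount` reshuffles assignments, so a production setup might prefer consistent hashing or explicit assignment in the central store.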

What happens if the Ethereum RPC provider goes down?

  • Detect provider availability using health checks (e.g., periodic eth_chainId calls) and surface degraded status via /health.
  • Implement failover by maintaining a list of RPC endpoints; on repeated failures, rotate to the next provider (Infura, Alchemy, self-hosted node).
  • Maintain durable queues of pending block ranges so when connectivity returns the indexer resumes from the last known state.
  • Alert engineering teams through monitoring when downtime exceeds a threshold, enabling manual intervention or provider escalation.
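
The failover step could look roughly like the following: keep an ordered list of endpoints and move to the next one after a configurable number of consecutive failures. `RpcEndpointRotator` and its threshold are hypothetical, and the URLs in any usage would be placeholders; wiring this into an ethers provider is left out on purpose.

```typescript
// Hypothetical helper that tracks which RPC endpoint should be used next.
class RpcEndpointRotator {
  private readonly urls: string[];
  private readonly failureThreshold: number;
  private index = 0;
  private consecutiveFailures = 0;

  constructor(urls: string[], failureThreshold: number = 3) {
    if (urls.length === 0) throw new Error("at least one RPC URL is required");
    this.urls = urls.slice();
    this.failureThreshold = failureThreshold;
  }

  /** The endpoint that should currently be used. */
  get current(): string {
    return this.urls[this.index];
  }

  /** A successful call resets the failure streak. */
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  /** Records a failure; rotates once the threshold is hit. Returns the endpoint to use next. */
  recordFailure(): string {
    this.consecutiveFailures++;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.index = (this.index + 1) % this.urls.length;
      this.consecutiveFailures = 0;
    }
    return this.current;
  }
}
```

Rotation wraps around, so once the last endpoint fails the rotator returns to the first, which gives a recovered primary provider another chance.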

How do you prevent duplicate event storage?

  • Enforce uniqueness in the database with a composite unique index on (contract_address, event_signature, block_number, transaction_hash, log_index).
  • Before writing, check the indexer state to avoid reprocessing blocks already confirmed.
  • When ingesting logs, rely on the unique key: an insert rejected as a duplicate at the database level should be logged but not treated as fatal.
  • Maintain idempotent processing logic (e.g., upserts or ignore-on-conflict strategies) to handle reruns gracefully.
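
The sketch below shows the idempotency idea in memory: build a key from the same five columns as the composite index and skip any log whose key is already stored. In PostgreSQL the equivalent is a unique index combined with INSERT ... ON CONFLICT DO NOTHING. The `IndexedLog` shape and `InMemoryEventStore` class are hypothetical stand-ins for the real contract_events table and model.

```typescript
// Hypothetical row shape mirroring the columns of the composite unique index.
interface IndexedLog {
  contractAddress: string;
  eventSignature: string;
  blockNumber: number;
  transactionHash: string;
  logIndex: number;
}

/** Builds the uniqueness key; hex fields are lower-cased so casing cannot cause duplicates. */
function dedupeKey(log: IndexedLog): string {
  return [
    log.contractAddress.toLowerCase(),
    log.eventSignature,
    log.blockNumber,
    log.transactionHash.toLowerCase(),
    log.logIndex,
  ].join(":");
}

class InMemoryEventStore {
  private readonly rows = new Map<string, IndexedLog>();

  /** Inserts new logs, silently skipping duplicates. Returns how many were inserted. */
  insertIgnoringDuplicates(logs: IndexedLog[]): number {
    let inserted = 0;
    for (const log of logs) {
      const key = dedupeKey(log);
      if (this.rows.has(key)) continue; // same effect as ignore-on-conflict
      this.rows.set(key, log);
      inserted++;
    }
    return inserted;
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Re-running the same block range is then harmless: the second pass inserts nothing and leaves the stored rows unchanged.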

What's your strategy for handling blockchain reorgs in production?

  • Track confirmations: delay persistence until a block reaches a configurable confirmation depth (e.g., 6 blocks) to minimise reorg fallout.
  • Store block hashes alongside events; if a hash mismatch is detected during subsequent indexing, roll back affected events and replay.
  • Keep a reorg queue that reprocesses recent blocks when an inconsistency is detected.
  • Communicate reorg-induced changes to downstream systems (e.g., emit compensating events or maintain an audit log).
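
A minimal sketch of the hash-comparison step follows: given the block hashes stored alongside indexed events and the hashes currently reported by the chain, find the lowest block where they disagree, which is where rollback and replay would start. The function names and the use of plain maps are assumptions for illustration; fetching canonical hashes from a provider is omitted.

```typescript
// Block number -> block hash.
type BlockHashes = Map<number, string>;

/**
 * Returns the lowest block number whose stored hash differs from the
 * canonical chain, or null when no divergence is found. Blocks missing
 * from `canonical` are skipped rather than treated as reorged.
 */
function findReorgStart(stored: BlockHashes, canonical: BlockHashes): number | null {
  const blockNumbers = Array.from(stored.keys()).sort((a, b) => a - b);
  for (const blockNumber of blockNumbers) {
    const canonicalHash = canonical.get(blockNumber);
    if (canonicalHash !== undefined && canonicalHash !== stored.get(blockNumber)) {
      return blockNumber;
    }
  }
  return null;
}

/** True when at least `depth` blocks have been mined on top of `blockNumber`. */
function isDeepEnough(blockNumber: number, chainHead: number, depth: number): boolean {
  return chainHead - blockNumber >= depth;
}
```

Events from the returned block onward would be deleted and re-indexed, while `isDeepEnough` expresses the confirmation-depth rule used to delay persistence in the first place.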

How would you monitor the health of the indexer?

  • Export metrics for processed blocks, lag (latest chain head vs indexed block), retry counts, and RPC error rates.
  • Leverage the existing /health endpoints and extend them with indexer-specific status (e.g., isIndexing, lastIndexedBlock).
  • Integrate with observability tooling (Prometheus, Grafana, OpenTelemetry) to chart performance and alert on anomalies.
  • Correlate logs with structured fields (contractAddress, eventSignature, attempt) so issues can be traced quickly.
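
As a sketch of what indexer-specific health status might contain, the snippet below derives lag from the chain head and the last indexed block and classifies the indexer accordingly. The field names isIndexing and lastIndexedBlock come from the answer above; the `IndexerSnapshot` type, the status labels, and the lag threshold parameter are assumptions.

```typescript
interface IndexerSnapshot {
  isIndexing: boolean;
  lastIndexedBlock: number;
  chainHead: number; // latest block number reported by the RPC provider
}

type IndexerHealth = "healthy" | "lagging" | "stopped";

/** Number of blocks the indexer is behind the chain head (never negative). */
function indexerLag(snapshot: IndexerSnapshot): number {
  return Math.max(0, snapshot.chainHead - snapshot.lastIndexedBlock);
}

/** Classifies the indexer given a maximum tolerated lag in blocks. */
function classifyIndexer(snapshot: IndexerSnapshot, maxLagBlocks: number): IndexerHealth {
  if (!snapshot.isIndexing) return "stopped";
  return indexerLag(snapshot) > maxLagBlocks ? "lagging" : "healthy";
}

/** Payload that a health endpoint could merge into its response. */
function toHealthPayload(snapshot: IndexerSnapshot, maxLagBlocks: number) {
  return {
    indexerStatus: classifyIndexer(snapshot, maxLagBlocks),
    isIndexing: snapshot.isIndexing,
    lastIndexedBlock: snapshot.lastIndexedBlock,
    lagBlocks: indexerLag(snapshot),
  };
}
```

The same lag value can be exported as a gauge metric, so dashboards and alerts use the number that the health endpoint reports.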

Built with ❤️ by engineers who care about correctness, resilience, and developer experience.
