Pranjal Verma pvcodes

Pranjal Verma

Data Engineer · GCP Specialist · Open to Opportunities

Data Engineer with 2+ years of production experience designing and operating cloud-native data infrastructure on GCP. Currently at Accenture, building real-time event-driven pipelines and a BigQuery warehouse serving analytical workloads at scale. GCP Professional Data Engineer certified.

Experience

Data Engineer — Accenture · Sept 2024 – Present · Pune, India

Redesigned BigQuery data warehouse partitioning and clustering strategy; reduced average query latency by 60% and monthly compute costs by 35%
Built real-time event-driven ingestion pipelines using Cloud Run, EventArc, and Apache Kafka
Orchestrated multi-sink ETL workflows with Cloud Composer (Airflow), delivering to Elasticsearch, MySQL, PostgreSQL, and Kafka topics

Data Engineer — Walkover · Jan 2024 – Sept 2024

Designed backend data infrastructure for a workflow automation platform serving 10,000+ concurrent users
Architected fault-tolerant RabbitMQ pipelines, achieving 50% throughput improvement over the prior architecture
Optimized database access patterns with batched reads and writes, making retrieval 30% faster under peak load

Selected Projects

Project	Description
VLR Analytics	Cloud-native data lakehouse on GCP using Medallion Architecture (Bronze → Silver → Gold). Processes 500K–1M gaming records via PySpark on Dataproc Serverless. Infrastructure managed with Terraform, orchestration via Cloud Composer.
Medical Risk Prediction	Ensemble ML model (Random Forest, XGBoost, Logistic Regression) on the NHANES dataset (8,000+ records). Achieved 90.4% accuracy and 0.88 AUC-ROC with feature engineering across 145 clinical variables.
ERDiagram-To-Schema	Fine-tuned Qwen2.5-VL to generate database schemas from ER diagram images. Achieved 89.2% table accuracy and 90% relationship accuracy on a held-out evaluation set.
ERP-CRM Data Warehouse	Enterprise data warehouse for ERP/CRM source systems with dimensional modeling and dbt-style transformation layers.
Realtime Retail Analytics	End-to-end streaming pipeline with Kafka producers, Spark Structured Streaming consumers, and a live analytics dashboard.
LLMify	Multi-model LLM chatbot platform with a unified interface across OpenAI, Anthropic, and Google providers. Includes per-IP rate limiting, abort signal propagation, and file-based chat persistence.

Stack

Languages — Python · SQL · PySpark · TypeScript · Bash

GCP — BigQuery · Cloud Run · Dataflow · Dataproc · Pub/Sub · Cloud Composer

Data Engineering — Apache Kafka · Apache Airflow · Apache Spark · Terraform · dbt

Databases — PostgreSQL · MySQL · Elasticsearch · Redis

Certifications

GCP Professional Data Engineer

Education

MCA — Devi Ahilya Vishwavidyalaya, Indore · 2024

BCA — Integral University, Lucknow · 2022

Open to Data Engineering, Analytics Engineering, and Platform roles · hi@pvcodes.in
