RakeshUtekar/README.md


AI Engineer working across the stack on RAG, LLM fine-tuning, and real-time speech systems. My work has moved from computer-vision and speech models toward applied GenAI: fine-tuning open LLMs, building retrieval pipelines on vector DBs, and shipping low-latency audio workflows to the cloud. Based in San Francisco, CA.

Experience

  • AI Fund: Technical Builder / AI Engineer (Dec 2025–present, Mountain View, CA)
  • Ditto AI: AI Engineer (May–Dec 2025, Berkeley, CA)
  • JetskiAI: Founding AI/ML Engineer (Mar–Dec 2025, SF Bay Area)
  • SuperIntro: AI Software Engineer (Dec 2024–Apr 2025, SF). Fine-tuned Qwen 2.5 LLM and Stable Diffusion pipelines on Vertex AI with LightRAG; deployed fine-tuned models on GCP with cloud logging and monitoring; integrated via Azure AI Foundry.
  • Sizzle: AI Engineer (Jan–Mar 2025, SF). Designed a low-latency Whisper + Qwen 2.5 + BERT workflow with Librosa/TorchAudio feature extraction, improving metadata-tagging precision by 15% and acoustic-linguistic alignment by 20%.
  • Melp App, Inc.: Software Developer (AI/ML) (May–Jun 2025, SF Bay Area)
  • Seattle University: Research Assistant (Aug 2024–Dec 2025, Seattle, WA). Under Prof. Pejman Khadivi, fine-tuned Transformers and CNNs for NLP and predictive analytics, with emphasis on automation and model deployment.
  • Seattle University: Teaching Assistant, Visual Analytics (Mar–Jun 2024, Seattle, WA)
  • SlashRTC: Machine Learning Engineer (Sep 2021–Aug 2022) / ML Intern (Jun–Aug 2021), Mumbai, India. Built speech-to-text models with Python and TensorFlow.

Flagship Projects

Real-Time Sign Language → Speech

Problem: Bridge communication for the deaf and hard-of-hearing by translating American Sign Language signs into spoken audio.

Approach: Fine-tune an I3D (Inflated 3D ConvNet) on the WLASL dataset for word-level sign recognition, piping predictions into a TTS stage. I3D captures spatiotemporal features across stacked video frames rather than treating frames independently, which suits the motion-heavy nature of signing.
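The "stacked video frames" idea can be illustrated with a small sketch of how a live frame stream might be grouped into overlapping clips before being handed to a 3D ConvNet. This is not the repository's code: the function name `sliding_clips` and the `clip_len` and `stride` defaults are assumptions chosen for illustration, and the frames here can be any Python objects (in practice they would be image arrays read with OpenCV).

```python
from collections import deque
from typing import Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")


def sliding_clips(
    frames: Iterable[Frame], clip_len: int = 16, stride: int = 8
) -> Iterator[List[Frame]]:
    """Yield overlapping clips of `clip_len` consecutive frames.

    Hypothetical helper: a spatiotemporal model such as I3D consumes a
    whole clip at once, so frames are buffered instead of being
    classified one by one. `clip_len` and `stride` are illustrative
    values, not settings taken from the project.
    """
    if clip_len < 1 or stride < 1:
        raise ValueError("clip_len and stride must be positive")

    # The deque keeps only the most recent `clip_len` frames.
    window: deque = deque(maxlen=clip_len)
    for index, frame in enumerate(frames):
        window.append(frame)
        start = index + 1 - clip_len  # index of the first frame in the window
        # Emit a clip once the window is full, then every `stride` frames.
        if start >= 0 and start % stride == 0:
            yield list(window)
```

For example, `list(sliding_clips(range(10), clip_len=4, stride=2))` produces four clips, each sharing two frames with its neighbor. A smaller stride gives more frequent predictions at the cost of more model calls.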

Stack: PyTorch · I3D · WLASL · OpenCV

View repo →

Real-Time Speech → Speech Translation

Problem: Enable live cross-language conversation without the stop-and-wait of batch translation.

Approach: A streaming pipeline chains Whisper (ASR) → translation → OpenAI TTS, with audio streamed in and out continuously. It prioritizes low end-to-end latency by keeping the stages pipelined rather than processing each utterance as a discrete block.
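To make "pipelined rather than processing each utterance as a discrete block" concrete, here is a minimal sketch of stages connected by queues, so that one stage can work on chunk N+1 while the next stage is still handling chunk N. It is an illustration under assumptions, not the project's implementation: `fake_asr`, `fake_translate`, and `fake_tts` are hypothetical stand-ins for the Whisper, translation, and TTS calls, and `run_pipeline` is an invented helper name.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel that marks the end of the stream


def _stage(fn: Callable[[Any], Any], inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Apply `fn` to each item from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(chunks: Iterable[Any], stages: List[Callable[[Any], Any]]) -> List[Any]:
    """Run each stage in its own thread; output order matches input order."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=_stage, args=(fn, queues[i], queues[i + 1]), daemon=True)
        for i, fn in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    for chunk in chunks:
        queues[0].put(chunk)
    queues[0].put(_STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


# Hypothetical placeholder stages; the real ones would call ASR, MT, and TTS services.
def fake_asr(audio_chunk: str) -> str:
    return f"text({audio_chunk})"


def fake_translate(text: str) -> str:
    return f"translated({text})"


def fake_tts(text: str) -> str:
    return f"audio({text})"
```

Calling `run_pipeline(["a0", "a1"], [fake_asr, fake_translate, fake_tts])` returns one synthesized item per input chunk, in order. The sketch omits error handling: a stage function that raises would stall the pipeline, so a real system would need timeouts or explicit failure propagation.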

Stack: Whisper · OpenAI TTS · Python · streaming audio I/O

View repo →

Tech Stack

Languages

ML / Deep Learning

GenAI / LLM

Speech / Audio · Vision

Cloud / Infra

🤝 Open to Collaborate

I enjoy giving back to the AI/ML community and am always happy to:

  • 🏆 Judge hackathons, demo days, and AI/ML competitions
  • 🎤 Give interviews, talks & guest sessions on applied GenAI, RAG, and real-time speech
  • 🧭 Mentor & guide engineers and students breaking into AI/ML
  • 💡 Consult & advise on AI/ML product direction and architecture

📫 Reach me at rakeshutekar60@gmail.com or on LinkedIn.

Education

  • MS, Computer Science (Data Science Specialization) — Seattle University (Sep 2022–Aug 2024)
  • BTech, Computer Engineering — University of Mumbai (2016–2021)

Pinned

  1. Speech-To-Speech-Translation-real-time- Public

    This is a speech-to-speech translation application that translates from any language to any language in real time. It is built using Python and OpenAI APIs.

    Python 23 9

  2. claude-sprint-orchestrator Public

    Multi-agent sprint orchestration for Claude Code — /tickets plans, /sprint executes

    Shell 12 1

  3. Sign-Language-to-Speech-Translation-Real-time- Public

    This project converts sign language gestures into spoken words in real-time using the I3D model fine-tuned on the WLASL dataset, bridging the communication gap between individuals with hearing impa…

    Python 7 1

  4. RAG-based-PDF-Query-System Public

    This project implements a Retrieval-Augmented Generation (RAG) system that allows users to upload multiple PDF files, extract and preprocess the text, and then query the contents of those PDFs usin…

    Python 4 5

  5. nichlosho/Ecolens Public

    Ecolens e-commerce website

    TypeScript 1

  6. Image-Segmentation-with-U-Net Public

    This project demonstrates image segmentation using the U-Net architecture for medical image analysis. The dataset used is the Data Science Bowl 2018 dataset, which consists of images of cell nuclei.

    Python 5