I'm an ML Engineer who ended up running GPU infrastructure at a scale most ML people only read about. That path wasn't planned — it happened because the problems at the systems layer turned out to be more interesting than I expected.
Right now I work at the edge of HPC and MLOps: I've kept 1,500+ GPU nodes running in production (H100, H200, AMD MI210, DGX SuperPod), handled distributed training verification for DeepSpeed ZeRO and FSDP workloads, and built the tooling teams actually use — not the polished kind, the kind that fixes a broken GPU reporting pipeline at 2am before a client review.
Currently pushing toward full LLMOps: RAG pipelines, model registries, and making inference actually deployable at scale.
| Domain | Tools |
|---|---|
| HPC Scheduling | SLURM, GRES, sacct/sacctmgr, NVIDIA BCM, AWX/Ansible |
| Distributed Training | DeepSpeed ZeRO (1/2/3), FSDP, multi-node GPU setups |
| LLM Inference | Ollama, model serving, API workflows, GPU-aware environment setup |
| Cluster Storage | DDN Lustre, parallel I/O, Singularity, containerized workloads |
| Benchmarking | HPL, RCCL, STREAM, CUDA benchmarks, Intel MPI |
| ML Modeling | Prophet, LSTM, XGBoost, CNNs, SageMaker pipelines |
- RAG pipeline — local LLM inference with retrieval, no cloud API dependency (first sketch below)
- GPU utilization reporter — accurate per-user consumption from SLURM GRES records, generalizing the fix I built for production (second sketch below)
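
The core of the RAG pipeline is a small loop: embed the query, rank local documents by cosine similarity, and feed the top hits as context to a locally served model. A minimal sketch, assuming an Ollama server on its default port (11434) with `nomic-embed-text` and `llama3` pulled locally — the model names, helper functions, and toy corpus here are illustrative, not the project's actual code:

```python
# Minimal local RAG loop against an Ollama server (default port 11434).
# Assumes `ollama pull nomic-embed-text` and `ollama pull llama3` were run;
# swap in whichever models are available locally.
import requests
import numpy as np

OLLAMA = "http://localhost:11434"

def embed(text: str) -> np.ndarray:
    # Ollama's embeddings endpoint returns {"embedding": [floats]}.
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return np.array(r.json()["embedding"])

# Toy corpus standing in for a real document store.
docs = [
    "ZeRO stage 3 shards parameters, gradients, and optimizer state.",
    "GRES accounting in SLURM tracks per-job GPU allocations.",
]
doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank every stored document by cosine similarity to the query.
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

def answer(query: str) -> str:
    # Stuff the retrieved context into the prompt; no cloud API involved.
    context = "\n".join(retrieve(query))
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": "llama3",
                            "prompt": f"Context:\n{context}\n\nQuestion: {query}",
                            "stream": False})
    r.raise_for_status()
    return r.json()["response"]

print(answer("What does ZeRO stage 3 shard?"))
```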
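
The GPU utilization reporter reduces to one aggregation: join each job's allocated GPU count (from `AllocTRES`) with its elapsed time, then sum per user. A simplified sketch of that idea, assuming standard `sacct` accounting fields — the production tool does more than this:

```python
# Per-user GPU-hours from SLURM accounting. Shells out to sacct and parses
# AllocTRES, where GPU allocations appear as "gres/gpu=N" (typed entries like
# "gres/gpu:a100=N" may also appear; the first match is the untyped total).
import re
import subprocess
from collections import defaultdict

def gpu_hours_by_user(start: str, end: str) -> dict[str, float]:
    out = subprocess.run(
        ["sacct", "-a", "-X", "--noheader", "--parsable2",
         "-S", start, "-E", end,
         "--format=User,ElapsedRaw,AllocTRES"],
        capture_output=True, text=True, check=True,
    ).stdout
    totals: dict[str, float] = defaultdict(float)
    for line in out.splitlines():
        if not line:
            continue
        user, elapsed, tres = line.split("|")
        if not user:            # skip records with no owning user
            continue
        m = re.search(r"gres/gpu[^=]*=(\d+)", tres)
        if m:
            # GPU-hours = allocated GPUs * elapsed seconds / 3600
            totals[user] += int(m.group(1)) * int(elapsed) / 3600.0
    return dict(totals)

if __name__ == "__main__":
    for user, hours in sorted(gpu_hours_by_user("2024-01-01", "2024-02-01").items()):
        print(f"{user:<12} {hours:10.1f} GPU-h")
```

Using `-X` keeps only allocation records (no job steps), which is what prevents the double counting that breaks naive reporting pipelines.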
- Finishing my MS in Electrical Engineering (AI track) at Ain Shams University
- BrightSkies / Core42 (G42) — Senior HPC Systems Engineer, Azure H100/H200/MI210 clusters (Abu Dhabi)
- BrightSkies / SDAIA — Sole technical owner, 60-node DGX H100 SuperPod (Riyadh)
- BrightSkies / KAUST — HPC support + LLM inference tooling for research clusters
- elmenus — Data Scientist, demand forecasting and operational ML
- Omdena — ML Engineer, applied projects in computer vision and time-series forecasting
- BSc Electrical Engineering, Alexandria University — GPA 3.4, Very Good with Honours
Alexandria, Egypt — open to remote and hybrid roles