taherb22/README.md

Mohamed Taher Boudrigua

Applied AI Engineer · MLOps · NLP Research
Final-year Engineering Student @ SUPCOM


Research Interests

  • Applied AI & NLP Systems
  • MLOps & Model Deployment
  • LLM Fine-Tuning & Pretraining
  • Cloud Infrastructure & DevOps
  • AI for Software Engineering

Current Work

🔬 End-of-Studies Research Project @ PCP Consulting
Pretraining a 100M-parameter, domain-specific small language model (SLM) from scratch for DevOps automation: LLaMA architecture, custom BPE tokenizer informed by the GPT-2, StarCoder, and SantaCoder literature, split AdamW + Muon optimizer, BOS-aligned sequence packing, and a compute-optimal tokens-to-parameters ratio of 10.5.
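
As a rough illustration of two items above, the sketch below computes the token budget implied by the stated ratio and packs tokenized documents into fixed-length sequences. "BOS-aligned" is read here as "every training sequence starts with a BOS token at a document boundary"; that reading, the function names, the truncation policy, and the toy token IDs are all assumptions, not the project's actual code.

```python
from typing import Iterable, Iterator, List


def token_budget(n_params: int, tokens_per_param: float) -> int:
    """Training tokens implied by a tokens-to-parameters ratio."""
    return int(n_params * tokens_per_param)


def pack_bos_aligned(
    docs: Iterable[List[int]], seq_len: int, bos_id: int
) -> Iterator[List[int]]:
    """Yield sequences of exactly seq_len tokens, each starting with BOS.

    Assumed policy: documents are BOS-prefixed and concatenated; when a
    document overflows the current sequence, its tail is dropped so the
    next sequence starts at a fresh document boundary. A trailing partial
    sequence is discarded.
    """
    buf: List[int] = []
    for doc in docs:
        piece = [bos_id, *doc]
        # Take only as many tokens as still fit in the current sequence.
        buf.extend(piece[: seq_len - len(buf)])
        if len(buf) == seq_len:
            yield buf
            buf = []


if __name__ == "__main__":
    print(token_budget(100_000_000, 10.5))  # 100M params at ratio 10.5
    toy_docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11, 12]]
    for seq in pack_bos_aligned(toy_docs, seq_len=6, bos_id=0):
        print(seq)
```

Dropping the overflowing tail wastes some tokens but keeps every sequence aligned to a document start; carrying the tail over to the next sequence would be the usual alternative if alignment were not required.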


Featured Projects

🧠 Efficient Fine-Tuning of LLaMA 3B

Systematic investigation of LoRA and QLoRA for domain adaptation under constrained compute, using 4-bit quantization (bitsandbytes), PEFT, and Unsloth. The full analysis is documented.
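
To make the LoRA idea concrete, here is a minimal pure-Python sketch of the low-rank update, y = Wx + (alpha / r) · B(Ax), together with the trainable-parameter count it implies. It does not use PEFT, bitsandbytes, or Unsloth; the layer dimensions, rank, and helper names are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x: Vector, w: Matrix, a: Matrix, b: Matrix,
                 alpha: float, r: int) -> Vector:
    """Frozen weight W plus a scaled low-rank update B @ A.

    Shapes: W is d_out x d_in, A is r x d_in, B is d_out x r.
    Only A and B would be trained; W stays frozen.
    """
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / r
    return [y + scale * u for y, u in zip(base, update)]


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical square projection of width 3072 with rank 16.
    full = 3072 * 3072
    lora = lora_trainable_params(3072, 3072, 16)
    print(full, lora, full // lora)
```

With B initialised to zeros, the adapted layer reproduces the frozen layer exactly, which is why LoRA training starts from the base model's behaviour.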
Kaggle Notebook

🤖 AI-Powered Jenkins Pipeline Auditor

Multi-agent CI/CD security analysis system built with LangChain + LangGraph. FastAPI backend integrated directly into Jenkins pipelines.
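
The multi-agent pattern can be pictured as a set of nodes that each read and extend a shared state. The sketch below is a plain-Python stand-in for that pattern, not LangChain or LangGraph code and not the repository's implementation; the agent names, the regex rules, and the sample Jenkinsfile are hypothetical.

```python
import re
from typing import Callable, Dict, List

State = Dict[str, object]
Agent = Callable[[State], State]


def _scan(state: State, pattern: str, message: str) -> State:
    """Append a finding for every Jenkinsfile line matching the pattern."""
    findings = list(state["findings"])  # copy so each agent returns new state
    for n, line in enumerate(str(state["jenkinsfile"]).splitlines(), 1):
        if re.search(pattern, line, re.IGNORECASE):
            findings.append(f"line {n}: {message}")
    return {**state, "findings": findings}


def secrets_agent(state: State) -> State:
    # Hypothetical rule: a quoted literal assigned to a secret-like name.
    return _scan(state, r"(password|token|secret)\s*=\s*['\"]",
                 "possible hard-coded credential")


def remote_script_agent(state: State) -> State:
    # Hypothetical rule: downloading a script and piping it into a shell.
    return _scan(state, r"curl[^|]*\|\s*(ba)?sh",
                 "remote script piped into a shell")


def run_audit(jenkinsfile: str, agents: List[Agent]) -> List[str]:
    """Run agents in sequence over a shared state and collect findings."""
    state: State = {"jenkinsfile": jenkinsfile, "findings": []}
    for agent in agents:
        state = agent(state)
    return list(state["findings"])


if __name__ == "__main__":
    sample = (
        "pipeline {\n"
        "  environment { API_TOKEN = 'abc123' }\n"
        "  stages { stage('x') { steps { sh 'curl https://example.com/i.sh | sh' } } }\n"
        "}"
    )
    for finding in run_audit(sample, [secrets_agent, remote_script_agent]):
        print(finding)
```

In a graph framework the same functions would become nodes and the loop would be replaced by declared edges, which also allows branching and parallel agents.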
GitHub Repo

☁️ Automated Kubernetes Cluster on Azure

End-to-end automated Kubernetes cluster deployment on Azure using infrastructure as code (IaC): Terraform for provisioning, Ansible for configuration, and Jenkins pipelines for orchestration.
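
The ordering of the stages (provision first, then configure) can be sketched as a small driver script. This is a Python stand-in for what the Jenkins pipeline orchestrates, not the repository's pipeline; the directory, inventory, and playbook names are hypothetical, and the script only prints the commands unless `dry_run` is disabled.

```python
import subprocess
from typing import List


def build_commands(tf_dir: str, inventory: str, playbook: str) -> List[List[str]]:
    """Return the provisioning and configuration commands in execution order."""
    return [
        # Provision infrastructure with Terraform.
        ["terraform", f"-chdir={tf_dir}", "init", "-input=false"],
        ["terraform", f"-chdir={tf_dir}", "apply", "-auto-approve", "-input=false"],
        # Configure the provisioned hosts with Ansible.
        ["ansible-playbook", "-i", inventory, playbook],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it and stop on the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; nothing is executed in dry-run mode.
    run(build_commands("infra", "inventory.ini", "site.yml"), dry_run=True)
```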
GitHub Repo

📦 Domain-Specific Pretraining Corpus (~3.5B tokens)

End-to-end data pipeline: scraping, parsing, and normalising technical documentation (AWS, Azure, GCP, Kubernetes, Terraform), followed by multi-layered quality validation and synthetic data generation via Qwen 2.5 Coder on GCP Vertex AI.
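
A minimal sketch of what the normalisation and quality-validation layers might look like is shown below. The specific checks (minimum length, alphanumeric ratio, exact-duplicate removal) and their thresholds are assumptions chosen for illustration; the actual pipeline's filters are not described here.

```python
import hashlib
import re
import unicodedata
from typing import Iterable, List


def normalise(text: str) -> str:
    """Unicode-normalise, collapse spaces, and squeeze blank-line runs."""
    text = unicodedata.normalize("NFKC", text)
    lines = [re.sub(r"[ \t]+", " ", line).rstrip() for line in text.splitlines()]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


def passes_quality(text: str, min_chars: int = 40, min_alnum: float = 0.6) -> bool:
    """Hypothetical checks: long enough and mostly alphanumeric content."""
    if len(text) < min_chars:
        return False
    alnum = sum(ch.isalnum() for ch in text)
    return alnum / len(text) >= min_alnum


def clean_corpus(docs: Iterable[str]) -> List[str]:
    """Normalise, filter, and drop exact duplicates while preserving order."""
    seen = set()
    kept: List[str] = []
    for raw in docs:
        text = normalise(raw)
        if not passes_quality(text):
            continue
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(text)
    return kept


if __name__ == "__main__":
    page = ("Deploy  a pod\u00a0with kubectl apply.\n\n\n\n"
            "Then check status with kubectl get pods.")
    print(clean_corpus([page, page, "#" * 50, "short"]))
```

Exact-hash deduplication only catches identical documents after normalisation; near-duplicate detection would need a separate layer.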


Tech Stack

Languages     Python · C/C++ · SQL · Bash
ML / NLP      PyTorch · Hugging Face · LLM Pretraining · BPE Tokenization · PEFT · RAG
AI Frameworks LangChain · LangGraph
Cloud/DevOps  GCP Vertex AI · Docker · Jenkins · FastAPI · Git · Kubernetes · Terraform · Ansible

Competitive Programming

  • 🥇 Ranked 27th — Tunisian Collegiate Programming Contest (TCPC) 2024
  • 🏅 Ranked 30th — Tunisian Collegiate Programming Contest (TCPC) 2025

Codeforces Profile


Links

LinkedIn · GitHub · Kaggle · Email

Pinned

  1. AI-powered-jenkins-pipeline-auditor (Public)

    Python 1

  2. E-commerce_website-backend (Public)

    Java 1

  3. E-commerce_website-front-end (Public)

    JavaScript

  4. devops_project (Public)

    Java

  5. kubernetes-azure-automation (Public)

    Forked from yassineamri722/k8s-automation-project