Data Scientist | Machine Learning | PhD in Sciences
With a PhD in Molecular Biology, I specialize in transforming high-dimensional, noisy data into actionable clinical and technological insights.
- Languages: Python, R
- Data Science: Seurat, Scikit-learn, Pandas, NumPy, PyTorch, SciPy.
- Specialties: Predictive Modeling (Supervised/Unsupervised), Dimensionality Reduction (PCA, t-SNE, UMAP), and Advanced Statistics.
- Engineering & MLOps: Docker, FastAPI.
Clinical Triage API: Pre-eclampsia Risk
Predict pre-eclampsia risk using physiological markers (Copeptin).
Built a Random Forest model, wrapped it in a FastAPI RESTful service, and containerized the entire environment with Docker. The result is a scalable, environment-independent tool for real-time clinical decision support.
Python | FastAPI | Docker | Scikit-learn
Large-Scale scRNA-seq Data Integration
Identify hidden cellular patterns in massive, noisy single-cell RNA sequencing datasets.
Leveraged unsupervised learning and advanced dimensionality reduction to process high-dimensional sparse matrices.
Developed a robust pipeline for biological discovery in Big Data environments.
Big Data | Clustering | Dimensionality Reduction | Bioinformatics
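The general shape of such a workflow (normalize a sparse count matrix, reduce its dimensionality, then cluster) can be sketched as follows. This is an illustrative scikit-learn and SciPy example under stated assumptions, not the project's pipeline: the counts are synthetic with two simulated cell populations, `TruncatedSVD` stands in as a sparse-friendly PCA-like reduction, and `KMeans` stands in for whatever clustering method was actually used.

```python
import numpy as np
from scipy import sparse
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(0)
n_cells, n_genes = 300, 500

# Synthetic counts: two hypothetical cell populations, each with its own
# block of 50 highly expressed genes on a low-expression background.
rates = np.full((n_cells, n_genes), 0.1)
rates[:150, :50] = 3.0
rates[150:, 50:100] = 3.0
counts = sparse.csr_matrix(rng.poisson(rates).astype(np.float64))

# Library-size normalization to 10,000 counts per cell.
lib_size = np.asarray(counts.sum(axis=1)).ravel()
lib_size[lib_size == 0] = 1.0  # guard against empty cells
normalized = (sparse.diags(1e4 / lib_size) @ counts).tocsr()

# log1p on the stored non-zero entries only, so the matrix stays sparse.
normalized.data = np.log1p(normalized.data)

# Linear dimensionality reduction that accepts sparse input directly.
embedding = TruncatedSVD(n_components=20, random_state=0).fit_transform(normalized)

# Cluster cells in the reduced space.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
```

Keeping the matrix in CSR form throughout avoids densifying the data, which matters when the real dataset has far more cells and genes than this toy example.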