A production-grade machine learning system for predicting cancer metastasis from gene mutation data, built with microservices architecture and automated deployment on Google Cloud Platform.
This project demonstrates a complete MLOps pipeline that separates concerns into three independent microservices:
- Training Service: Automated model training with experiment tracking
- MLflow Server: Centralized experiment tracking and model registry
- Inference API: Production-ready REST API for predictions
The system processes gene mutation data to predict metastasis probability using custom scikit-learn pipelines with advanced preprocessing and automated hyperparameter optimization.
Note: This is a demonstration project showcasing production MLOps architecture and best practices. The GCP infrastructure is not currently running to avoid ongoing costs.
```
┌──────────────────────┐
│   Training Service   │
│  (Docker Container)  │
│                      │
│  • Hydra Config      │
│  • Custom Pipelines  │
│  • Auto HP Tuning    │
└──────────┬───────────┘
           │
           │  Logs experiments
           ▼
┌──────────────────────┐
│    MLflow Server     │
│  (Docker Container)  │
│                      │
│  • Experiment Track  │
│  • Model Registry    │
│  • Version Control   │
└──────────┬───────────┘
           │
           │  Loads best model
           ▼
┌──────────────────────┐
│    Inference API     │
│  (Docker Container)  │
│                      │
│  • FastAPI Server    │
│  • Model Serving     │
│  • REST Endpoints    │
└──────────────────────┘
```
All services designed for deployment on GCP Compute Engine VMs with internal VPC networking for secure service-to-service communication.
- Separation of Concerns: Training, tracking, and inference as independent services
- Containerization: Each service runs in its own Docker container
- Cloud-Native: Designed for GCP deployment with infrastructure automation
- Internal Networking: Secure VPC communication between services
- Modular Configuration: Hydra dataclass-based config store for composable pipeline components
- Custom Transformers: Gene mutation preprocessing using sklearn's BaseEstimator and TransformerMixin
- Automated Hyperparameter Tuning: Hydra sweep across multiple YAML configurations for different model architectures
- Experiment Tracking: All runs logged to MLflow with metrics, parameters, and artifacts
- Model Versioning: Best models automatically tagged based on F1 score optimization
- Gene mutation text preprocessing
- TF-IDF vectorization for mutation patterns
- Dimensionality reduction (PCA, NMF)
- SMOTE for handling class imbalance
- Multiple classifier architectures (Logistic Regression, Random Forest, SVM)
- Centralized experiment tracking across all training runs
- Model registry with versioning (best_v1, best_v2, ..., best_v10)
- Artifact storage for trained models and preprocessing pipelines
- Metric comparison across runs to guide model selection
- FastAPI REST endpoints with automatic OpenAPI documentation
- Pydantic validation for type-safe request/response handling
- Automatic model loading from MLflow registry
- Returns probability predictions for metastasis classification
- Makefile-driven workflow: Single command builds, deployments, and teardowns
- Docker containerization: Consistent environments across development and production
- GCP VM automation: Automated provisioning and configuration via startup scripts
- One-command deployment: From code to running infrastructure with minimal manual intervention
- scikit-learn: Core ML framework and pipeline architecture
- imbalanced-learn: SMOTE for class imbalance handling
- pandas/numpy: Data manipulation and numerical computing
- MLflow: Experiment tracking and model registry
- Hydra: Configuration management and hyperparameter optimization
- FastAPI: High-performance REST API framework
- Pydantic: Data validation and settings management
- Docker: Application containerization
- GCP Compute Engine: Virtual machine hosting
- GCP Artifact Registry: Docker image storage and distribution
- Make: Build automation and deployment orchestration
- Poetry: Dependency management with pyproject.toml
- Git: Version control
This repository contains the inference API component of the MLOps platform:
```
metastasis-prediction-api/
├── src/
│   └── server.py          # FastAPI application
├── Dockerfile             # Container definition
├── Makefile               # Deployment automation
├── pyproject.toml         # Poetry dependencies
├── startup-script.sh      # GCP VM startup configuration
└── README.md              # This file
```
Related Repositories:
- Training Service: [https://github.com/amir2520/liver_metas]
Note: This is part of a multi-repository MLOps system. Each microservice is maintained in its own repository for independent deployment and versioning.
The system was designed with full automation for GCP deployment:
- Build: Docker images built locally or in CI/CD
- Push: Images pushed to GCP Artifact Registry
- Deploy: VMs provisioned with startup scripts that pull and run containers
- Network: Internal VPC networking configured for service communication
- Access: SSH tunnels for secure access to services
All steps automated via Makefiles for reproducible infrastructure.
- Independent Scaling: Training runs are resource-intensive but intermittent; API needs constant availability
- Technology Flexibility: Each service can use its own dependencies, tooling, and resource profile
- Fault Isolation: API remains available even if training service fails
- Deployment Independence: Update one service without redeploying others
- Composability: Mix and match preprocessing, models, and hyperparameters
- Reproducibility: Each experiment configuration saved as YAML
- Sweep Capability: Built-in grid/random search across configurations
- Type Safety: Dataclass-based configs catch errors early
- Experiment Tracking: Compare dozens of model runs easily
- Model Registry: Version control for trained models
- Artifact Storage: Keep models and preprocessing pipelines together
- Production Integration: Load models directly into serving API
The pipeline was used to train and compare multiple model architectures:
- Multiple model configurations tested with different preprocessing and hyperparameters
- F1 score optimization for balanced precision/recall on imbalanced medical data
- Automated model selection based on validation metrics
- Version tracking for reproducibility and rollback capability
Completed:
- ✅ Microservices architecture design and implementation
- ✅ Training pipeline with Hydra configuration system
- ✅ Custom sklearn transformers for gene mutation processing
- ✅ MLflow integration for experiment tracking
- ✅ FastAPI inference service
- ✅ Docker containerization for all services
- ✅ GCP deployment automation with Makefiles
- ✅ Internal VPC networking configuration
Amir Fatemi
- Email: hossein.fatemi85@gmail.com
- LinkedIn: linkedin.com/in/amirfatemi2520
- GitHub: github.com/amir2520
This project demonstrates production-grade MLOps practices including microservices architecture, experiment tracking, automated deployment, and infrastructure as code.