Game Player Churn Prediction
Project Overview
This project builds a machine learning model to predict player churn using simulated game telemetry data. The goal is to identify behavioral signals that indicate retention risk and generate actionable product insights.
The dataset simulates 5,000 players with engagement, progression, monetization, and recency features.
Problem Statement
Player churn significantly impacts revenue and long-term growth in games. The objective is to:
Predict which players are likely to churn
Identify key drivers of churn
Propose retention strategies based on data insights
Dataset Features
Engagement
days_active_last_30
total_sessions
avg_session_minutes
Progression
max_level_reached
total_upgrades
Monetization
total_spend
Recency
last_login_days_ago
Categorical
country
device_type
Target variable:
churned (0 = active, 1 = churned)
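To make the schema above concrete, here is a minimal sketch of how a table with these columns could be simulated. The distributions, category values, coefficients, and the `simulate_players` helper are illustrative assumptions, not the project's actual simulation code.

```python
import numpy as np
import pandas as pd


def simulate_players(n_players: int = 5000, seed: int = 42) -> pd.DataFrame:
    """Build a toy telemetry table with the columns listed above.

    Every distribution and the churn rule below are assumptions
    chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        # Engagement
        "days_active_last_30": rng.integers(0, 31, n_players),
        "total_sessions": rng.poisson(40, n_players),
        "avg_session_minutes": rng.gamma(2.0, 8.0, n_players).round(1),
        # Progression
        "max_level_reached": rng.integers(1, 101, n_players),
        "total_upgrades": rng.poisson(10, n_players),
        # Monetization
        "total_spend": rng.exponential(15.0, n_players).round(2),
        # Recency
        "last_login_days_ago": rng.integers(0, 31, n_players),
        # Categorical (hypothetical category values)
        "country": rng.choice(["US", "DE", "BR", "JP"], n_players),
        "device_type": rng.choice(["ios", "android", "pc"], n_players),
    })
    # Assumed churn rule: risk rises with inactivity and falls with
    # engagement and spend; the coefficients are arbitrary.
    logit = (
        0.15 * df["last_login_days_ago"]
        - 0.10 * df["days_active_last_30"]
        - 0.02 * df["total_sessions"]
        - 0.01 * df["total_spend"]
        + 0.5
    )
    prob = 1.0 / (1.0 + np.exp(-logit))
    df["churned"] = (rng.random(n_players) < prob).astype(int)
    return df


players = simulate_players()
print(players.shape)  # (5000, 10): nine features plus the target
print(players["churned"].mean())  # share of churned players
```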
Modeling Approach
Label Encoding for categorical variables
Train/Test split (80/20)
Random Forest Classifier
Evaluation using:
Accuracy
Precision / Recall / F1
ROC AUC
Confusion Matrix
Cross-validation for stability
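The steps above can be sketched end to end with scikit-learn. This is a minimal illustration on toy stand-in data, not the project's actual training script: the data generator, the stratified split, `n_estimators=100`, the 5-fold setting, and the random seeds are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy stand-in data (assumption); swap in the real telemetry table.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "total_sessions": rng.poisson(40, n),
    "last_login_days_ago": rng.integers(0, 31, n),
    "total_spend": rng.exponential(15.0, n),
    "country": rng.choice(["US", "DE", "BR"], n),
    "device_type": rng.choice(["ios", "android"], n),
})
df["churned"] = (
    df["last_login_days_ago"] + rng.normal(0, 5, n) > 15
).astype(int)

# Label-encode each categorical column into integer codes.
for col in ["country", "device_type"]:
    df[col] = LabelEncoder().fit_transform(df[col])

X = df.drop(columns="churned")
y = df["churned"]

# 80/20 train/test split; stratifying on the target is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of churned = 1

print("Accuracy:", accuracy_score(y_test, y_pred))
print("ROC AUC:", roc_auc_score(y_test, y_prob))
print(classification_report(y_test, y_pred))  # precision / recall / F1
print(confusion_matrix(y_test, y_pred))

# Cross-validated ROC AUC on the full table to check stability.
cv_auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("CV ROC AUC per fold:", cv_auc)
```

ROC AUC is computed from predicted probabilities rather than hard labels, which is why `predict_proba` is used alongside `predict`.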
Model Performance
ROC AUC: 0.91
Accuracy: 0.82
Cross-validated ROC AUC: ~0.89–0.91
The hold-out and cross-validated ROC AUC scores agree closely, indicating strong and stable predictive capability. No feature leaks the target.
Key Findings
Top churn drivers:
Total sessions
Last login recency
Total spend
Days active last 30
Players with low engagement frequency, high inactivity, and low monetization activity show the highest churn probability.
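One common way to rank churn drivers with a Random Forest is its impurity-based feature importance. Whether the ranking above was produced this way is an assumption. The sketch below uses toy data with an invented churn rule, so its ranking will not match the list above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy data (assumption): churn depends mainly on recency and activity.
rng = np.random.default_rng(1)
n = 1000
X = pd.DataFrame({
    "total_sessions": rng.poisson(40, n),
    "last_login_days_ago": rng.integers(0, 31, n),
    "total_spend": rng.exponential(15.0, n),
    "days_active_last_30": rng.integers(0, 31, n),
})
y = (
    X["last_login_days_ago"]
    - 0.5 * X["days_active_last_30"]
    + rng.normal(0, 4, n)
    > 5
).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Impurity-based importances sum to 1; sort to get a driver ranking.
importances = (
    pd.Series(model.feature_importances_, index=X.columns)
    .sort_values(ascending=False)
)
print(importances)
```

Impurity-based importances can favor high-cardinality numeric features, so a ranking like this is best treated as a starting point for further analysis.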
Product Recommendations
Trigger re-engagement campaigns after 5–7 days of inactivity
Provide progression boosts to low-session players
Offer retention-focused promotions to mid-spend cohorts
Monitor engagement drop-offs early in lifecycle
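The first two recommendations can be turned into simple rules over the telemetry table. The sketch below is one possible reading: the `assign_actions` helper, the `player_id` column, and the bottom-quartile definition of "low-session" are hypothetical choices, not part of the project.

```python
import pandas as pd


def assign_actions(players: pd.DataFrame,
                   inactivity_window: tuple = (5, 7),
                   low_session_quantile: float = 0.25) -> pd.DataFrame:
    """Flag players for the retention actions described above.

    The thresholds are assumptions: the window follows the 5 to 7 day
    recommendation, and "low-session" is taken as the bottom quartile.
    """
    out = players.copy()
    lo, hi = inactivity_window
    # Re-engagement: last login falls inside the window (inclusive).
    out["send_reengagement"] = out["last_login_days_ago"].between(lo, hi)
    # Progression boost: total sessions at or below the chosen quantile.
    cutoff = out["total_sessions"].quantile(low_session_quantile)
    out["offer_progression_boost"] = out["total_sessions"] <= cutoff
    return out


players = pd.DataFrame({
    "player_id": [1, 2, 3, 4],  # hypothetical identifier column
    "last_login_days_ago": [1, 5, 7, 12],
    "total_sessions": [80, 12, 40, 3],
})
flagged = assign_actions(players)
print(flagged[["player_id", "send_reengagement", "offer_progression_boost"]])
```

In this example players 2 and 3 fall inside the inactivity window, and only player 4 is at or below the session cutoff.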
Future Improvements
Compare with XGBoost / Gradient Boosting
Add SHAP explainability
Deploy as a Streamlit dashboard
Extend to Lifetime Value (LTV) prediction
Tech Stack
Python
Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn