Welcome to the Credit Card Default Prediction repository! 🎉
This project is a collaborative initiative brought to you by SuperDataScience, a thriving community dedicated to advancing the fields of data science, machine learning, and AI. We are excited to have you join us on this journey of learning, experimentation, and growth.
In this repository, you’ll find the work carried out by our community members in completing an end-to-end machine learning project. Please note that the resources available here are not to be used or copied without crediting the appropriate authors.
Build an ML classification model that accurately predicts which customers will default on their credit card payments.
Link to Dataset: https://archive.ics.uci.edu/dataset/350/default+of+credit+card+clients
Data Cleaning & Analysis (Week 1)
- Handling null values, fixing data types, data inconsistencies
- EDA, understanding distributions, outliers, relationships of features with target variable
- Feature Selection using correlation analysis, ANOVA tests, F-test, etc.
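To make the ANOVA / F-test step concrete, here is a minimal sketch of the one-way ANOVA F-statistic for a single numeric feature against the binary default label. It uses only the standard library, and the function name `anova_f_statistic` and the toy values are hypothetical, not taken from the dataset. In practice a library routine such as scikit-learn's `f_classif` would normally be used instead; this version only shows what the statistic measures.

```python
from statistics import fmean


def anova_f_statistic(values, labels):
    """One-way ANOVA F-statistic of a numeric feature across target classes.

    A larger value means the class means differ more relative to the
    spread inside each class. Assumes at least two classes and some
    within-class variation (otherwise the denominator is zero).
    """
    # Group the feature values by target class.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n = len(values)   # total number of observations
    k = len(groups)   # number of classes
    grand_mean = fmean(values)

    # Variation of the class means around the overall mean.
    ss_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the observations around their own class mean.
    ss_within = sum(
        (v - fmean(g)) ** 2 for g in groups.values() for v in g
    )

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data (hypothetical): 1 = default, 0 = no default.
default = [0, 0, 0, 1, 1, 1]
feature_a = [1, 2, 3, 7, 8, 9]  # class means differ clearly
feature_b = [1, 5, 9, 2, 5, 8]  # class means are identical

print(anova_f_statistic(feature_a, default))
print(anova_f_statistic(feature_b, default))
```

A feature with a high F-statistic, like `feature_a` here, separates the two classes better than one with a value near zero, which makes it a stronger candidate to keep.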
Feature Engineering & Model Selection (Week 2 & 3)
- Building new features, one hot encoding, feature scaling
- Handling outliers through statistical and heuristic methods
- Normalisation or standardisation of input features if the ML algorithm in the pipeline requires it
- Model training, comparison and selection
- Model evaluation and optimisation using K-fold cross-validation and hyperparameter tuning
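As an illustration of the K-fold cross-validation step, the sketch below splits row indices into K folds while keeping the proportion of defaulters roughly equal in each fold, which matters when the target classes are imbalanced. It is a standard-library sketch under stated assumptions: the helper name `stratified_kfold_indices` and the toy labels are hypothetical, and a real pipeline would more likely rely on a library splitter such as scikit-learn's `StratifiedKFold`.

```python
import random
from collections import defaultdict


def stratified_kfold_indices(labels, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds.

    Rows of each class are shuffled and dealt round-robin into the folds,
    so every fold gets a similar share of each class.
    """
    rng = random.Random(seed)

    # Collect the row indices belonging to each class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices across the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the validation set; the rest is training data.
    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, validation


# Toy labels (hypothetical): 8 non-defaulters and 2 defaulters.
labels = [0] * 8 + [1] * 2
for train, validation in stratified_kfold_indices(labels, k=2):
    print("train:", train, "validation:", validation)
```

Any preprocessing fitted on data, such as scaling, should be fitted on the training indices of each fold only and then applied to the validation indices, so that validation scores are not inflated by leakage.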
Deployment (Week 4)
- Building a Streamlit app
- Deploying the model to the app
- Deploying the app to Streamlit Cloud