📩 SMS Spam Prediction Pipeline

This project is a machine learning pipeline that detects spam SMS messages using natural language processing (NLP) techniques and a Naive Bayes classifier. The model is trained on the UCI SMS Spam Collection Dataset.

🔍 Project Highlights

Text Preprocessing with NLTK: tokenization, stopword removal, stemming
Feature Extraction using CountVectorizer with unigrams and bigrams
Classification Model: Multinomial Naive Bayes
Hyperparameter Tuning with GridSearchCV
Spam Prediction with probability scores
Model Persistence using joblib
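
To make the preprocessing and feature-extraction steps above concrete, here is a minimal standard-library sketch of tokenization, stopword removal, and unigram/bigram counting. It is an illustration under stated assumptions, not the project's code: the project uses NLTK and scikit-learn's CountVectorizer, whose tokenization rules differ, and stemming is left out. The names `STOPWORDS`, `preprocess`, and `ngram_counts` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; the project relies on NLTK's.
STOPWORDS = {"a", "an", "the", "you", "to", "is", "have", "your"}


def preprocess(text):
    """Lowercase, tokenize on runs of letters/digits, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ngram_counts(tokens, max_n=2):
    """Count every n-gram of length 1..max_n, joined by single spaces."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


tokens = preprocess("You have won a free prize! Claim your free prize now")
features = ngram_counts(tokens)
print(tokens)
print(features.most_common(3))
```

Because stopwords are removed before the n-grams are built, a bigram such as "prize claim" can join words that were not adjacent in the original message.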

🗃️ Dataset

The dataset contains 5,574 SMS messages, each labeled as either spam or ham (not spam). It is publicly available from the UCI Machine Learning Repository.

Downloaded and extracted by dataset.py using requests and zipfile
Stored in: sms_spam_collection/SMSSpamCollection
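
The extracted file is plain text with one message per line, the label and the message separated by a tab. The sketch below shows one way to parse that layout with the standard library. `load_sms` is a hypothetical helper, not a function from this repository, and the sample messages are made up.

```python
import csv
import io
from collections import Counter


def load_sms(lines):
    """Parse tab-separated `label<TAB>message` rows into (label, text) pairs."""
    # QUOTE_NONE matters: messages contain stray quote characters that the
    # default CSV dialect would otherwise try to interpret.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for record in reader:
        if len(record) != 2:
            continue  # skip blank or malformed lines
        label, text = record
        rows.append((label, text))
    return rows


# Tiny in-memory sample standing in for the real file.
sample = io.StringIO(
    'ham\tOk lar... see you at 6 "maybe"\n'
    "spam\tWINNER!! Claim your free prize now\n"
    "ham\tI'll call you later\n"
)
rows = load_sms(sample)
label_counts = Counter(label for label, _ in rows)
print(label_counts)

# With the real dataset, pass an open file handle instead:
# with open("sms_spam_collection/SMSSpamCollection", encoding="utf-8", newline="") as f:
#     rows = load_sms(f)
```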

🛠️ How to Run

1. Clone this repository

git clone https://github.com/YOUR_USERNAME/spam-classification.git
cd spam-classification

2. Create and activate a virtual environment

python -m venv myenv
# Windows
myenv\Scripts\activate
# macOS/Linux
source myenv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Run the scripts

📥 Download & extract the dataset

python dataset.py

🤖 Train and evaluate the model

python main.py

📊 Example Predictions

The trained model evaluates custom SMS messages and returns:

Spam/Not-Spam label
Probability scores for each class

Example:

Message: Congratulations! You've won a $1000 Walmart gift card.
Prediction: Spam
Spam Probability: 0.98
Not-Spam Probability: 0.02
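
For readers curious where such probability scores come from, the following is a small pure-Python sketch of multinomial Naive Bayes with add-one (Laplace) smoothing. It is an illustration only: the project itself uses scikit-learn's MultinomialNB, and the `train` and `predict_proba` functions and the training messages here are hypothetical.

```python
import math
from collections import Counter


def train(examples, alpha=1.0):
    """Fit multinomial Naive Bayes on (label, tokens) pairs with add-alpha smoothing."""
    class_docs = Counter()   # number of messages per class
    word_counts = {}         # per-class token counts
    vocab = set()
    for label, tokens in examples:
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return {"class_docs": class_docs, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict_proba(model, tokens):
    """Return {label: posterior probability} for a tokenized message."""
    total_docs = sum(model["class_docs"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n_docs in model["class_docs"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + model["alpha"] * vocab_size
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok in model["vocab"]:  # tokens never seen in training are ignored
                score += math.log((counts[tok] + model["alpha"]) / denom)
        log_scores[label] = score
    # Normalize in log space (log-sum-exp) so the probabilities sum to 1.
    top = max(log_scores.values())
    norm = top + math.log(sum(math.exp(s - top) for s in log_scores.values()))
    return {label: math.exp(s - norm) for label, s in log_scores.items()}


# Made-up training messages, already tokenized.
training = [
    ("spam", ["free", "prize", "claim", "now"]),
    ("spam", ["won", "free", "gift", "card"]),
    ("ham", ["see", "you", "at", "lunch"]),
    ("ham", ["call", "me", "later"]),
]
model = train(training)
probs = predict_proba(model, ["free", "gift", "card"])
label = max(probs, key=probs.get)
print(label, probs)
```

Working in log space avoids numeric underflow when a message contains many tokens, which is why the scores are only exponentiated after normalization.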
