🎬 Movie Recommendation System (Hybrid ML + Web App)

A full-stack movie recommendation system that combines keyword-based (bag-of-words) similarity with semantic similarity from Sentence-BERT embeddings, served as a Flask web application.

This project demonstrates the end-to-end lifecycle of an ML system, from data preprocessing and feature engineering to model inference and web deployment.

🚀 Features

Hybrid recommendation approach:
- Bag-of-Words (CountVectorizer + Cosine Similarity)
- Semantic similarity using Sentence-BERT
Fuzzy matching for robust movie name input
Clean and minimal web interface
Fast inference using precomputed vectors and embeddings
Modular and production-oriented code structure
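
The fuzzy-matching feature listed above can be illustrated with Python's standard-library `difflib`. This is a minimal sketch, not necessarily how this project does it: the README does not say which matching library is used, and the function name `resolve_title`, the sample titles, and the 0.6 cutoff are all assumptions made for the example.

```python
import difflib

def resolve_title(query, titles, cutoff=0.6):
    """Return the catalogue title closest to `query`, or None if nothing is close enough."""
    # Compare case-insensitively, but return the title in its original spelling.
    by_lower = {t.lower(): t for t in titles}
    matches = difflib.get_close_matches(query.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else None

# Hypothetical catalogue, for illustration only.
titles = ["Interstellar", "Inception", "The Dark Knight"]

print(resolve_title("intersteller", titles))  # Interstellar
print(resolve_title("zzzz", titles))          # None
```

Returning `None` for input that matches nothing lets the web layer show a "movie not found" message instead of recommending from an unrelated title.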

🧠 Recommendation Strategy

The system computes movie similarity using a weighted hybrid score:

Final Score = 0.6 × BoW Similarity + 0.4 × Semantic Similarity

This balances:

Lexical overlap (genres, keywords, cast, crew)
Contextual meaning (semantic understanding of movie descriptions)
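
To make the weighting concrete, here is a small dependency-free sketch of the score computation. It assumes both similarities are cosine similarities; plain Python lists stand in for the CountVectorizer vectors and SBERT embeddings the project uses, and the function names and toy vectors are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def hybrid_score(bow_u, bow_v, emb_u, emb_v, w_bow=0.6, w_sem=0.4):
    """Weighted sum of bag-of-words similarity and semantic similarity."""
    return w_bow * cosine(bow_u, bow_v) + w_sem * cosine(emb_u, emb_v)

# Toy inputs (assumed): the two movies share one of two tags each,
# and their stand-in "embeddings" point in the same direction.
bow_a, bow_b = [1, 1, 0, 0], [1, 0, 1, 0]
emb_a, emb_b = [3.0, 4.0], [3.0, 4.0]

score = hybrid_score(bow_a, bow_b, emb_a, emb_b)
print(round(score, 3))  # 0.7  (0.6 * 0.5 + 0.4 * 1.0)
```

Keeping the weights as parameters makes it easy to experiment with other balances between lexical and semantic similarity.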

🗂️ Dataset

The project uses multiple CSV files derived from the TMDB movie dataset, including:

Movie metadata
Cast and crew information
Keywords and genres
Poster paths

Since these files are not row-aligned, they are joined on the shared movie ID rather than by row position, so each movie's metadata, credits, and keywords stay correctly matched.
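
The idea behind ID-based integration can be shown in a few lines of plain Python. This is only an illustrative sketch: the column names (`id`, `title`, `cast`), the sample rows, and the helper `merge_by_id` are assumptions, and the actual pipeline lives in the preprocessing notebook.

```python
def merge_by_id(movies, credits, key="id"):
    """Inner-join two lists of row dicts on `key`, independent of row order."""
    credits_by_id = {row[key]: row for row in credits}
    merged = []
    for row in movies:
        match = credits_by_id.get(row[key])
        if match is not None:  # skip movies that have no credits row
            merged.append({**row, **match})
    return merged

# Hypothetical rows; note that the two tables list the movies in different orders.
movies = [{"id": 1, "title": "Interstellar"}, {"id": 2, "title": "Inception"}]
credits = [{"id": 2, "cast": "Leonardo DiCaprio"}, {"id": 1, "cast": "Matthew McConaughey"}]

rows = merge_by_id(movies, credits)
print(rows[0])  # {'id': 1, 'title': 'Interstellar', 'cast': 'Matthew McConaughey'}
```

Pairing the rows by position would have attached the wrong cast to each movie here. With pandas, the equivalent join is a `DataFrame.merge` on the ID column.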

⚠️ Raw datasets are excluded due to size and licensing constraints. See data/README.md for preprocessing details.

🏗️ Project Structure

movie-recommender/
│
├── data/
│   ├── processed_movies.csv
│   ├── recommendations.json
│   └── README.md
│
├── notebooks/
│   └── data_preprocessing.ipynb
│
├── models/
│   ├── vectors.npz
│   ├── sbert_embeddings.npy
│   ├── stopwords.pkl
│   └── sbert_model/
│
├── templates/
│   └── index.html
│
├── static/
│   └── script.js
│
├── app.py
├── recommender.py
├── requirements.txt
├── README.md
└── .gitignore

⚙️ Tech Stack

Python
Pandas, NumPy
Scikit-learn
Sentence-Transformers (SBERT)
PyTorch
Flask
HTML, CSS, JavaScript

⚡ Performance Optimizations

Offline preprocessing and feature extraction
Cached BoW vectors and SBERT embeddings
Models and data loaded once at server startup
No recomputation during user requests

Because the expensive work happens offline and at startup, each request only reads cached vectors and embeddings, which keeps response latency low for web deployment.
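
The load-once pattern can be sketched without any web framework. In the snippet below, `load_artifacts`, `ARTIFACTS`, `recommend`, and the sample data are hypothetical stand-ins; in the real app the loader would read the files under models/ and the lookup would sit behind a Flask route.

```python
LOAD_COUNT = 0  # only here to show how often the expensive load runs

def load_artifacts():
    """Stand-in for reading precomputed vectors and embeddings from disk."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # Hypothetical precomputed table: title -> list of (recommended title, score).
    return {"Inception": [("Interstellar", 0.87)]}

# Runs once, when the module is imported at server startup.
ARTIFACTS = load_artifacts()

def recommend(title):
    """Per-request work is a lookup into data that is already in memory."""
    return ARTIFACTS.get(title, [])

recommend("Inception")
recommend("Inception")
print(LOAD_COUNT)  # 1
```

However many requests arrive, the loader runs a single time, so request latency does not include model or file loading.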

▶️ How to Run Locally

1️⃣ Clone the repository

git clone https://github.com/your-username/movie-recommender.git
cd movie-recommender

2️⃣ Create and activate environment

conda create -n movie-recommender python=3.10
conda activate movie-recommender

3️⃣ Install dependencies

pip install -r requirements.txt

4️⃣ Run the application

python app.py

Visit:

http://127.0.0.1:5000

🧪 Example Output

[
  {
    "title": "Interstellar",
    "poster": "https://image.tmdb.org/t/p/w500/xyz.jpg",
    "score": 0.87
  }
]

📌 Key Learnings

Importance of ID-based data alignment
Hybrid recommendation system design
ML model deployment pitfalls and fixes
Flask + ML integration best practices
Performance optimization for inference-time systems

🔮 Future Improvements

User-based collaborative filtering
Search auto-completion
Explanation for recommendations
Cloud deployment (Docker / Render)
User feedback loop

👤 Author

Harshavardhan
B.Tech CSE, NIT Trichy
Interests: NLP, Recommendation Systems, ML Systems, Applied AI

⭐ Acknowledgements

TMDB for the dataset
Sentence-Transformers library
Open-source ML community
