Harshavardhan001457/Movie_Recommendation

🎬 Movie Recommendation System (Hybrid ML + Web App)

A full-stack movie recommendation system that combines keyword-based (Bag-of-Words) similarity with semantic similarity from Sentence-BERT embeddings, served as a Flask web application.

This project demonstrates the end-to-end lifecycle of an ML system, from data preprocessing and feature engineering to model inference and web deployment.


🚀 Features

  • Hybrid recommendation approach:

    • Bag-of-Words (CountVectorizer + Cosine Similarity)
    • Semantic similarity using Sentence-BERT
  • Fuzzy matching for robust movie name input

  • Clean and minimal web interface

  • Fast inference using precomputed vectors and embeddings

  • Modular and production-oriented code structure
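The fuzzy matching feature can be illustrated with a minimal sketch. This README does not say which fuzzy-matching library the project uses, so the example below assumes Python's standard-library `difflib`; the helper name `resolve_title` and the `cutoff` value are hypothetical.

```python
from difflib import get_close_matches


def resolve_title(query, titles, cutoff=0.6):
    """Return the catalogue title closest to `query`, or None if nothing is close enough."""
    # Compare case-insensitively, but return the title in its original casing.
    lowered = {t.lower(): t for t in titles}
    matches = get_close_matches(query.strip().lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# Toy catalogue (illustrative only).
titles = ["Interstellar", "Inception", "The Dark Knight"]
best = resolve_title("intersteller", titles)
print(best)
```

Returning `None` when no candidate clears the cutoff lets the web layer show a "movie not found" message instead of recommending from a poor match.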


🧠 Recommendation Strategy

The system computes movie similarity using a weighted hybrid score:

Final Score = 0.6 × BoW Similarity + 0.4 × Semantic Similarity

This balances:

  • Lexical overlap (genres, keywords, cast, crew)
  • Contextual meaning (semantic understanding of movie descriptions)
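The weighted score above can be sketched as follows. This is an illustration under assumptions, not the project's actual code: it assumes both feature matrices are dense NumPy arrays with one row per movie, and the function names `cosine_to_all` and `hybrid_scores` are hypothetical.

```python
import numpy as np


def cosine_to_all(matrix, idx):
    """Cosine similarity between row `idx` and every row of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1)
    norms = np.where(norms == 0, 1.0, norms)  # leave all-zero rows at similarity 0
    unit = matrix / norms[:, None]
    return unit @ unit[idx]


def hybrid_scores(bow_vectors, sbert_embeddings, idx, w_bow=0.6, w_sem=0.4):
    """Blend BoW and semantic similarity with the README's 0.6 / 0.4 weights."""
    bow_sim = cosine_to_all(bow_vectors, idx)
    sem_sim = cosine_to_all(sbert_embeddings, idx)
    return w_bow * bow_sim + w_sem * sem_sim


# Toy data: three movies, 4-dim BoW counts and 2-dim stand-in embeddings.
bow = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
scores = hybrid_scores(bow, emb, idx=0)
print(scores)
```

In the toy data, movie 1 shares all keywords with movie 0 but none of its semantic direction, while movie 2 is the reverse, so the blend ranks movie 1 above movie 2 because lexical overlap carries the larger weight.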

🗂️ Dataset

The project uses multiple CSV files derived from the TMDB movie dataset, including:

  • Movie metadata
  • Cast and crew information
  • Keywords and genres
  • Poster paths

Because rows in these files do not line up by position, the files are joined on the movie ID rather than by row index, which keeps each movie's metadata, credits, and keywords attached to the correct title.
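A minimal sketch of an ID-based join with pandas is shown below. The column names (`id`, `movie_id`, `cast`) and the helper `merge_on_id` are assumptions for illustration; the real schema is described in data/README.md.

```python
import pandas as pd


def merge_on_id(movies, credits):
    """Join credits onto movies by ID, never by row position."""
    merged = movies.merge(
        credits,
        left_on="id",
        right_on="movie_id",
        how="inner",              # keep only movies present in both files
        validate="one_to_one",    # fail loudly if an ID is duplicated
    )
    return merged.drop(columns="movie_id")


# Toy frames whose rows are deliberately in a different order.
movies = pd.DataFrame({"id": [1, 2, 3], "title": ["A", "B", "C"]})
credits = pd.DataFrame({"movie_id": [3, 1], "cast": ["x", "y"]})
merged = merge_on_id(movies, credits)
print(merged)
```

Concatenating these two frames side by side would pair title "A" with cast "x"; the keyed merge pairs it with "y" and drops movie 2, which has no credits row.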

⚠️ Raw datasets are excluded due to size and licensing constraints. See data/README.md for preprocessing details.


🏗️ Project Structure

movie-recommender/
│
├── data/
│   ├── processed_movies.csv
│   ├── recommendations.json
│   └── README.md
│
├── notebooks/
│   └── data_preprocessing.ipynb
│
├── models/
│   ├── vectors.npz
│   ├── sbert_embeddings.npy
│   ├── stopwords.pkl
│   └── sbert_model/
│
├── templates/
│   └── index.html
│
├── static/
│   └── script.js
│
├── app.py
├── recommender.py
├── requirements.txt
├── README.md
└── .gitignore

⚙️ Tech Stack

  • Python
  • Pandas, NumPy
  • Scikit-learn
  • Sentence-Transformers (SBERT)
  • PyTorch
  • Flask
  • HTML, CSS, JavaScript

⚡ Performance Optimizations

  • Offline preprocessing and feature extraction
  • Cached BoW vectors and SBERT embeddings
  • Models and data loaded once at server startup
  • No recomputation during user requests

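Because each request only runs similarity lookups over arrays that are already in memory, per-request latency stays low and no model retraining or re-embedding happens on the request path.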
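The load-once idea can be sketched with a cached loader. This is one possible pattern, not necessarily how app.py does it: it assumes the embeddings are a plain `.npy` array, and the function name `load_embeddings` is hypothetical.

```python
import os
import tempfile
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=1)
def load_embeddings(path):
    """Load the embedding matrix once and L2-normalise it; repeat calls reuse the cached array."""
    emb = np.load(path)
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.clip(norms, 1e-12, None)


# Demo with a throwaway file standing in for models/sbert_embeddings.npy.
demo_path = os.path.join(tempfile.mkdtemp(), "sbert_embeddings.npy")
np.save(demo_path, np.array([[3.0, 4.0], [0.0, 2.0]]))

first = load_embeddings(demo_path)   # reads from disk
second = load_embeddings(demo_path)  # served from the cache, no disk access
print(first is second)
```

Normalising the rows at load time means cosine similarity at request time reduces to a single matrix-vector product.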


▶️ How to Run Locally

1️⃣ Clone the repository

git clone https://github.com/Harshavardhan001457/Movie_Recommendation.git
cd Movie_Recommendation

2️⃣ Create and activate environment

conda create -n movie-recommender python=3.10
conda activate movie-recommender

3️⃣ Install dependencies

pip install -r requirements.txt

4️⃣ Run the application

python app.py

Visit:

http://127.0.0.1:5000

🧪 Example Output

[
  {
    "title": "Interstellar",
    "poster": "https://image.tmdb.org/t/p/w500/xyz.jpg",
    "score": 0.87
  }
]
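A response in this shape could be assembled from a score vector roughly as follows. This is a sketch under assumptions: the helper `top_k_payload`, its parameters, and the toy inputs are hypothetical, and the poster prefix is taken from the example above.

```python
import numpy as np

# Prefix taken from the example output; assumed to be joined with the stored poster path.
POSTER_BASE = "https://image.tmdb.org/t/p/w500"


def top_k_payload(scores, titles, poster_paths, query_idx, k=5):
    """Return the k highest-scoring movies, excluding the query movie itself."""
    order = np.argsort(scores)[::-1]  # indices from highest to lowest score
    picks = [int(i) for i in order if int(i) != query_idx][:k]
    return [
        {
            "title": titles[i],
            "poster": f"{POSTER_BASE}{poster_paths[i]}",
            "score": round(float(scores[i]), 2),
        }
        for i in picks
    ]


# Toy inputs (scores are made up for illustration).
scores = np.array([1.0, 0.87, 0.4])
titles = ["Inception", "Interstellar", "Up"]
poster_paths = ["/abc.jpg", "/xyz.jpg", "/def.jpg"]
payload = top_k_payload(scores, titles, poster_paths, query_idx=0, k=1)
print(payload)
```

Excluding `query_idx` matters because a movie always has similarity 1.0 with itself and would otherwise take the top slot.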

📌 Key Learnings

  • Importance of ID-based data alignment
  • Hybrid recommendation system design
  • ML model deployment pitfalls and fixes
  • Flask + ML integration best practices
  • Performance optimization for inference-time systems

🔮 Future Improvements

  • User-based collaborative filtering
  • Search auto-completion
  • Explanation for recommendations
  • Cloud deployment (Docker / Render)
  • User feedback loop

👤 Author

Harshavardhan
B.Tech CSE, NIT Trichy
Interests: NLP, Recommendation Systems, ML Systems, Applied AI


⭐ Acknowledgements

  • TMDB for the dataset
  • Sentence-Transformers library
  • Open-source ML community
