lannd3217/README.md

Hi there 👋 I'm Lan

Welcome to my GitHub! I'm a Data Scientist & Machine Learning Enthusiast passionate about turning raw data into meaningful insights. With experience in data analysis, machine learning, and visualization, I love solving complex problems and making data-driven decisions.

🚀 About Me

🎓 Education:

  • M.S., Computer Science (Machine Learning) | Georgia Tech (Expected Dec 2027)
  • B.A., Data Science (Business & Industrial Analysis) | University of California, Berkeley (Dec 2024)

💡 What I Do:

  • Build machine learning models to solve real-world problems
  • Design data pipelines for predictive analytics
  • Develop interactive visualizations to make data more accessible
  • Apply statistical analysis & A/B testing for data-driven insights

🛠 Tech Stack:

  • Languages: Python, SQL
  • Frameworks & Libraries: Pandas, NumPy, Scikit-learn, PyTorch, LangChain
  • ML & AI: Retrieval-Augmented Generation (RAG), semantic search, embeddings (sentence-transformers/all-MiniLM-L6-v2), LLM prompt engineering
  • Databases & Infra: ChromaDB (vector store), persistent local storage
  • Tools: Tableau, Jupyter Notebook, Google Colab, Git, Hugging Face ecosystem

📂 Featured Projects

A retrieval-augmented generation (RAG) AI mentor that guides job seekers through data science interview preparation, drawing on textbooks, career advice, and online community discussions such as Reddit and Quora. Unlike generic chatbots, this mentor always cites its sources, building trust through transparency. It addresses a critical gap: quality career mentorship is scarce and limited by volunteer availability, whereas this tool scales guidance to anyone, anytime.
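The retrieve-then-cite idea can be sketched in a few lines. This is only an illustration under loud assumptions, not the project's actual code: the corpus, the source labels, and the helper names (`retrieve`, `build_prompt`) are all hypothetical, and plain bag-of-words cosine similarity stands in for the sentence-transformers embeddings and ChromaDB vector store listed in the tech stack.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: sources and text are invented for illustration only.
DOCUMENTS = [
    {"source": "textbook-ch3",
     "text": "Explain the bias variance tradeoff with a concrete model example."},
    {"source": "forum-thread-12",
     "text": "For SQL interviews practice window functions and joins."},
    {"source": "career-guide",
     "text": "Prepare a short story about a project for behavioral interviews."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with non-zero similarity, best match first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(doc["text"]))), doc)
              for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, hits):
    """Number each passage and tag it with its source so an answer can cite it."""
    context = "\n".join(
        f"[{i}] ({doc['source']}) {doc['text']}"
        for i, doc in enumerate(hits, start=1)
    )
    return (
        "Answer using only the numbered passages and cite them by number.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "How should I practice SQL window functions?"
    print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

Carrying the source label alongside each retrieved passage is what would let a downstream language model attribute its answer; in a fuller pipeline the word-count vectors would presumably be replaced by dense embeddings and a vector store.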

This project investigates the relationship between air quality (measured through PM2.5 levels) and socioeconomic mobility across California counties. By combining environmental and economic datasets, I aim to understand how exposure to poor air quality during childhood influences long-term economic outcomes. My analysis employs causal inference and predictive modeling techniques to assess and quantify these relationships.

This project analyzes and predicts chronic absenteeism among students in schools within the Oakland District. By leveraging historical attendance, demographic, and academic data, I build machine learning models to predict absenteeism risk and provide insights for early intervention.

📫 Let's Connect!

💼 LinkedIn | ✉️ Email | 🔗 GitHub

Pinned

  1. Interview_RAG (Public)

    Jupyter Notebook

  2. chronic-absenteeism (Public)

    Analyze and identify students at risk of chronic absenteeism

    HTML

  3. Machine-Learning-Projects (Public)

    Jupyter Notebook

  4. Inference-and-Prediction-of-PM2.5-Exposure-on-Youth-s-Mobility-Outcomes (Public)

    Jupyter Notebook

  5. Data-Engineering (Public)

    Jupyter Notebook

  6. Data-Science-Projects (Public)

    Jupyter Notebook
