A starter Python project for vulnerability prioritization and cybersecurity risk scoring, built around the workflow described in the uploaded markdown. The pipeline covers the following steps:
- Download and cache official vulnerability data.
- Merge CISA KEV and NVD records into a master table.
- Build structured and text features from CVE metadata and descriptions.
- Train baseline and stronger models for high-priority vulnerability prediction.
- Export interpretable results, plots, and case-study tables.
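
The merge step in the list above can be sketched as follows. This is a minimal, dependency-free illustration, not the contents of `src/preprocess.py`. The function name `merge_kev_nvd`, the field names (`id`, `cveID`, `dateAdded`, `base_score`), and the derived columns `in_kev` and `kev_date_added` are assumptions; the real names depend on the raw files. The sketch treats the NVD records as the base table and flags each CVE that also appears in the KEV catalog, which is one possible way to derive a high-priority label.

```python
from typing import Dict, Iterable, List


def merge_kev_nvd(nvd_rows: Iterable[dict], kev_rows: Iterable[dict]) -> List[dict]:
    """Left-join NVD records with KEV membership on the CVE identifier."""
    # Index KEV entries by normalized CVE ID for constant-time lookup.
    kev_by_id: Dict[str, dict] = {
        row["cveID"].strip().upper(): row for row in kev_rows
    }

    master = []
    for row in nvd_rows:
        cve_id = row["id"].strip().upper()
        kev = kev_by_id.get(cve_id)
        merged = dict(row)  # copy so the input record is not mutated
        merged["id"] = cve_id
        merged["in_kev"] = kev is not None  # assumed label column
        merged["kev_date_added"] = kev.get("dateAdded") if kev else None
        master.append(merged)
    return master


# Hypothetical toy records, for illustration only (not real CVEs).
nvd = [
    {"id": "CVE-0000-0001", "base_score": 9.8},
    {"id": "cve-0000-0002", "base_score": 5.3},
]
kev = [{"cveID": "CVE-0000-0001", "dateAdded": "2000-01-01"}]

master_table = merge_kev_nvd(nvd, kev)
```

Every NVD record is kept whether or not it appears in KEV, so the negative class stays in the master table for training.
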
python -m src.download_data
python -m src.preprocess
python -m src.feature_engineering
python -m src.train_baseline
python -m src.train_xgboost
python -m src.anomaly_detection
python -m src.interpret

- This repository is intentionally a credible starter skeleton, not a fully polished production system.
- You will likely need to edit column names and tweak preprocessing after inspecting the real raw files.
- The code favors clarity and modularity so that each file can be debugged independently.
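
Because column names may differ from what the code expects, one option is to resolve them explicitly before preprocessing. The helper below is a hypothetical sketch, not part of the repository: `resolve_columns`, the canonical names, and the candidate lists are all assumptions to adapt after inspecting the raw files. It fails early with a clear message instead of letting a later stage break on a missing key.

```python
from typing import Dict, Iterable, List


def resolve_columns(actual: Iterable[str], expected: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each canonical name to the first candidate present in `actual`."""
    # Case-insensitive lookup from lowercased name to the original spelling.
    lookup = {name.lower(): name for name in actual}

    resolved, missing = {}, []
    for canonical, candidates in expected.items():
        match = next(
            (lookup[c.lower()] for c in candidates if c.lower() in lookup), None
        )
        if match is None:
            missing.append(canonical)
        else:
            resolved[canonical] = match

    if missing:
        raise KeyError(f"No matching column for: {', '.join(missing)}")
    return resolved


# Assumed candidate names; edit these to match the real raw files.
EXPECTED = {
    "cve_id": ["cveID", "cve_id", "id"],
    "description": ["description", "shortDescription"],
}
columns = ["cveID", "vendorProject", "shortDescription"]
mapping = resolve_columns(columns, EXPECTED)
```
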
vulnerability-prioritization-project/
├── data/
│ ├── raw/
│ └── processed/
├── notebooks/
│ ├── 01_eda.ipynb
│ └── 02_modeling.ipynb
├── src/
│ ├── download_data.py
│ ├── preprocess.py
│ ├── feature_engineering.py
│ ├── train_baseline.py
│ ├── train_xgboost.py
│ ├── anomaly_detection.py
│ ├── interpret.py
│ └── utils.py
├── results/
│ ├── figures/
│ └── tables/
├── README.md
└── requirements.txt
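
The data and results directories in the layout above may not exist on a fresh checkout, so a small helper can create them before the first run. This is a sketch under assumptions: `ensure_layout` and `PROJECT_DIRS` are hypothetical names, and whether such a helper belongs in `src/utils.py` is a design choice left to the reader. The usage example writes to a temporary directory so it has no effect on a real project.

```python
import tempfile
from pathlib import Path
from typing import List

# Directory names taken from the project layout above.
PROJECT_DIRS = (
    "data/raw",
    "data/processed",
    "notebooks",
    "src",
    "results/figures",
    "results/tables",
)


def ensure_layout(root) -> List[Path]:
    """Create the project directories under `root` if they are missing."""
    created = []
    for rel in PROJECT_DIRS:
        path = Path(root) / rel
        # exist_ok=True makes the call safe to repeat on an existing tree.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Usage: build the skeleton in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_layout(tmp)
    all_exist = all(d.is_dir() for d in dirs)
```
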