Pneumonia Detection: CNN vs. Medical Professionals

This repository contains the code and analyses for the project in Introduction to Intelligent Systems (02461) at DTU, January 2025.

Authors:

Valdemar Stamm Kristensen (s244742)
Frederik Lysholm Jønsson (s245362)
William Hoffmann Hyldig (s245176)

Study line: Artificial Intelligence and Data

📌 Project Overview

This project explores the use of Convolutional Neural Networks (CNNs) to detect pneumonia in chest X-ray images and compares the model’s performance to that of two medical doctors.

The motivation lies in pneumonia’s status as a major global health challenge, particularly among children, where fast and reliable diagnosis is crucial. CNNs offer a way to assist doctors in handling the large volume of X-ray data in clinical settings.

🧠 Key Methods

Dataset: Kaggle chest X-ray dataset with 5,116 images (73% pneumonia, 27% normal).
Preprocessing: Resizing (244×244), normalization, and tensor conversion.
Model Architecture: Custom CNN inspired by VGG-16, reduced in complexity to prevent overfitting.
Training Distributions: Compared the original (73/27) vs. a balanced (50/50) class distribution in the training data.
Evaluation: Compared CNN predictions against two non-specialist medical doctors reviewing 100 X-rays each.
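As an illustration of how a balanced 50/50 training set can be derived from the original distribution, the sketch below undersamples the majority class. It is a minimal example under assumptions, not the notebook's actual code: the function name `undersample` is hypothetical, and the class counts (3,735 pneumonia, 1,381 normal) are approximations derived from the 73/27 percentages of 5,116 images.

```python
import random


def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset of `labels`.

    Every class is cut down to the size of the smallest class by
    randomly discarding surplus samples.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    smallest = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, smallest))
    return sorted(kept)


# Hypothetical label list approximating the 73/27 split of 5,116 images.
labels = ["pneumonia"] * 3735 + ["normal"] * 1381
balanced = undersample(labels)
print(len(balanced))  # 2762 images kept, 2354 discarded
```

The discarded images are the cost of balancing noted in the results: roughly 46% of the data goes unused under these assumed counts.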

📊 Results in Brief

CNN Performance: Achieved a test accuracy of 96.08% ± 1.57 percentage points.
Medical Doctors: Achieved accuracies of 67–72%, each ± roughly 9 percentage points.
Observations:
- 73/27 model achieved highest overall accuracy but showed bias toward pneumonia.
- 50/50 model was more stable across distributions but required discarding data.
- Overfitting set in after epoch 7, which was therefore identified as the optimal stopping point.
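One common way to locate such a stopping point is to track the validation loss and stop once it no longer improves. The sketch below assumes that approach; the README does not state how the stopping point was actually determined. The function `best_epoch`, the `patience` parameter, and the loss values are all hypothetical and chosen only for illustration.

```python
def best_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss.

    The scan stops once the loss has failed to improve for
    `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best


# Made-up validation losses, NOT the project's measurements.
val_losses = [0.60, 0.45, 0.36, 0.30, 0.26, 0.23, 0.21, 0.24, 0.27, 0.31]
print(best_epoch(val_losses))  # 7
```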

These results suggest that a CNN can substantially outperform non-specialist medical doctors at classifying pneumonia from chest X-rays, at least on this dataset.
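To give a feel for the ± margins quoted above, the sketch below computes the half-width of a 95% normal-approximation confidence interval for an accuracy measured on a fixed number of images. This is an assumption about how such margins are commonly obtained, not a statement of the method used in the project report; `accuracy_margin` is a hypothetical helper.

```python
import math


def accuracy_margin(accuracy, n, z=1.96):
    """Half-width of a normal-approximation confidence interval
    for an accuracy (as a fraction) measured on n samples."""
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n)


# Each doctor reviewed 100 X-rays; accuracies span the reported 67-72% range.
for acc in (0.67, 0.72):
    margin = accuracy_margin(acc, n=100)
    print(f"{acc:.0%} ± {margin * 100:.1f} percentage points")
```

With only 100 images per doctor, a margin of around 9 percentage points is what this approximation yields, which shows how much the small sample limits the precision of the comparison.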

📂 Repository Structure

CNN-model.ipynb → Full Jupyter Notebook with preprocessing, model design, training, evaluation, and visualizations.
CNN-model.py → The same code as CNN-model.ipynb, provided as a plain Python script.

🔬 Reflections and Future Work

Dataset limitations: Current data originates from one hospital → limited generalizability.
Bias risks: Model may perform differently across age, gender, and imaging conditions.
Future improvements:
- Larger and more diverse datasets across multiple hospitals.
- Use of data augmentation (flipping, rotation, contrast adjustments).
- Certainty scores and explainable AI (XAI) to improve clinical trust and usability.
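As a rough idea of what the proposed augmentation could look like, the sketch below applies a horizontal flip and a contrast adjustment to a tiny grayscale image stored as nested lists with pixel values in [0, 1]. It is an illustrative, dependency-free example; both function names are hypothetical, and a real pipeline would more likely use an image library, which would also handle rotation.

```python
def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def adjust_contrast(image, factor):
    """Scale each pixel's distance from the image mean by `factor`,
    clipping the result to the [0, 1] range."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(1.0, max(0.0, mean + factor * (p - mean))) for p in row]
        for row in image
    ]


# Toy 2x3 "image" used only to demonstrate the transforms.
image = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
flipped = flip_horizontal(image)
low_contrast = adjust_contrast(image, 0.5)
```

Each transformed copy keeps the label of its source image, so augmentation enlarges the training set without new annotation work.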

📖 References

Kaggle Pneumonia Dataset: Mooney, P. (2018). Chest X-Ray Images (Pneumonia).
Sharma, A. (2024). Pneumonia Detection using VGG16 Transfer Learning.
UNICEF (2021). Pneumonia statistics on child mortality.
Additional references listed in the project report.
