This is an open-source deep learning (DL) classification model that aims to identify, with high accuracy, patients with COVID-19 viral infection, patients with non-COVID-19 infection, and those with no infection by analyzing their chest x-ray scans. This project is part of the COVID-Net open-source initiative. It is a prototype model and is not yet intended for production use. You can reach out to me directly on LinkedIn if you have questions about this model or want help using it as an experimental tool for near real-time COVID-19 screening.
- The architecture was adapted from ShuffleNet v2
- Designed with efficiency in mind to allow near real-time screening with mobile devices
- Experimented with transfer learning and data augmentation techniques to improve generalizability and robustness
- Built using open-source software, with PyTorch as the main AI framework
- Training and testing relied on the COVIDx dataset, which consists of chest x-ray images from 3 publicly available data sources.
- For model evaluation, I relied on the images listed in `test_COVIDx2.txt` as the blind test set.
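To make the evaluation step concrete, here is a minimal sketch of how predictions on a blind test set could be scored. It is not taken from this repo: the function name `evaluate`, the label strings in `CLASSES`, and the choice of metrics (overall accuracy plus per-class sensitivity) are assumptions for illustration only.

```python
from collections import Counter

# Assumed label names for the three classes; adjust to match your data.
CLASSES = ("normal", "pneumonia", "COVID-19")


def evaluate(y_true, y_pred, classes=CLASSES):
    """Return (accuracy, per-class sensitivity) for two equal-length label lists."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    # Count every (true label, predicted label) pair: a sparse confusion matrix.
    confusion = Counter(zip(y_true, y_pred))

    correct = sum(confusion[(c, c)] for c in classes)
    accuracy = correct / len(y_true)

    sensitivity = {}
    for c in classes:
        # Number of test images whose true label is c.
        total = sum(confusion[(c, p)] for p in classes)
        sensitivity[c] = confusion[(c, c)] / total if total else float("nan")
    return accuracy, sensitivity


if __name__ == "__main__":
    truth = ["normal", "normal", "pneumonia", "COVID-19", "COVID-19"]
    preds = ["normal", "pneumonia", "pneumonia", "COVID-19", "normal"]
    print(evaluate(truth, preds))
```

Reporting sensitivity per class alongside accuracy is useful when the classes are imbalanced, since accuracy alone can hide poor performance on the smallest class.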
- `data` directory contains 3 important scripts:
  - `create-db.py` is a slightly modified version of the original instructions that was used to get the main `train` and `test` data folders. Make sure you run this first.
  - `create-trainsets.py` is used to create the `trainset`, which includes `trn` and `val` subfolders with images arranged by class. It also contains code that can be used to get a balanced version of the data, which you can later pass directly to an augmentation method.
  - `create-covidx2-testset.py` is used to get the `COVIDx2_test` dataset, which this and the other COVIDNet models use as a blind test set.
  - `train_split_v3.txt` contains the list of images used to create the `trainset`. `test_split_v3.txt` was never used since `test_COVIDx2.txt` is a subset of it.
  - For more details on the COVIDx dataset and the original instructions, check out the COVID-Net repo.
- `models` directory contains the pretrained models.
- Project requirements can be found in `requirements.txt`.
- `train.py` contains the entire training pipeline, which includes dataloaders, preprocessing, augmentations, hyperparameters, the main train loop, and the model saving mechanism.
- `test.py` contains a simple prediction script that reads a directory of images created by `create-covidx2-testset.py`.
- covidnet-cxr-shuffle-e18
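As an illustration of the balanced `trn`/`val` split that `create-trainsets.py` is described as producing, the sketch below undersamples every class to the size of the smallest one and then splits each class into training and validation lists. It is not the script from this repo: the helper names `parse_split` and `balanced_trn_val`, the assumed line layout of the split file (whitespace-separated, filename in the second field, class label in the third), and the use of undersampling are all assumptions.

```python
import random
from collections import defaultdict


def parse_split(lines):
    """Parse split-file lines into (filename, label) pairs.

    Assumes whitespace-separated fields with the filename second and the
    class label third; change the indices if your split file differs.
    """
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            pairs.append((fields[1], fields[2]))
    return pairs


def balanced_trn_val(pairs, val_fraction=0.1, seed=0):
    """Undersample each class to the smallest class size, then split trn/val."""
    by_class = defaultdict(list)
    for filename, label in pairs:
        by_class[label].append(filename)
    if not by_class:
        raise ValueError("no images found")

    n = min(len(files) for files in by_class.values())
    n_val = round(n * val_fraction)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible

    split = {"trn": {}, "val": {}}
    for label, files in sorted(by_class.items()):
        sample = rng.sample(sorted(files), n)  # n files drawn without replacement
        split["val"][label] = sample[:n_val]
        split["trn"][label] = sample[n_val:]
    return split
```

With a split file loaded via `open(path).read().splitlines()`, the returned dictionary maps `trn` and `val` to per-class filename lists, which can then be copied into class subfolders. Undersampling discards images from the larger classes; oversampling the smallest class with augmentation is a common alternative when data is scarce.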


