Covid-19 Early Detection from X-ray Images Aided by AI
The age of AI is already here, and it is being put to work rapidly in real life.
Covid-19 is an unprecedented situation for society, and it is going to change how people live for the next generation. It also means medical tests have to be run quickly on massive numbers of samples. Humans get tired and machines do not, so in a situation like this AI can really help.
Intro
In this project I use two different datasets. The first is a pneumonia dataset that contains chest x-rays for the pneumonia and normal conditions, and the second is a COVID-19 x-ray dataset for the Covid-19 condition. Before we dive into the processing and modeling, it helps to understand a little of the domain knowledge behind these datasets.
What is a chest x-ray?

X-rays are a form of radiation, like light or radio waves. X-rays pass through most objects, including the body. An x-ray machine produces a small burst of radiation that passes through the body and records an image on photographic film or a special detector. On a chest x-ray, the ribs and spine absorb much of the radiation and appear white or light gray on the image. Lung tissue absorbs little radiation and appears dark on the image.
What are pneumonia and Covid-19?
Pneumonia is an infection of the lungs that may be caused by bacteria, viruses, or fungi. The infection causes the lungs’ air sacs (alveoli) to become inflamed and fill up with fluid or pus. That can make it hard for the oxygen you breathe in to get into your bloodstream.
Coronavirus disease 2019 (COVID-19) is defined as an illness caused by a novel coronavirus now called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2; formerly called 2019-nCoV).
A normal chest X-ray shows the normal size and shape of the chest wall and the main structures in the chest. White shadows on the chest X-ray signify solid structures and fluids, such as the bones of the rib cage, the vertebrae, the heart, the aorta, and the bones of the shoulders. The dark background on the chest X-ray represents the air-filled lungs.
Now that we understand a little of the context behind the datasets, we can see why they can be used to detect lungs inflamed by the coronavirus.
Datasets
Kaggle is really useful for finding datasets. I combined two different datasets, Pneumonia and COVID-19 chest x-ray, to make the model better at telling apart pneumonia, COVID-19, and normal x-ray patterns. For the COVID-19 data I selected only the PA (posteroanterior) x-ray view, which I consider the clearer image (you can use all the data if you want), and ended up with 141 samples for each class.
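One way to arrive at an equal number of samples per class is to randomly downsample the larger classes to the size of the smallest one. The sketch below assumes that approach; the article does not say exactly how the 141 samples were picked, and the file names, the raw class sizes, and the `balance_classes` helper are all hypothetical.

```python
import random


def balance_classes(samples_by_class, seed=0):
    """Downsample every class to the size of the smallest class."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    n = min(len(paths) for paths in samples_by_class.values())
    return {
        label: sorted(rng.sample(paths, n))
        for label, paths in samples_by_class.items()
    }


# Hypothetical file lists: only the 141 COVID-19 count comes from the article,
# the other two sizes are made up for illustration.
raw = {
    "covid19": [f"covid_{i}.png" for i in range(141)],
    "pneumonia": [f"pneumonia_{i}.png" for i in range(500)],
    "normal": [f"normal_{i}.png" for i in range(300)],
}

balanced = balance_classes(raw)
print({label: len(paths) for label, paths in balanced.items()})
# {'covid19': 141, 'pneumonia': 141, 'normal': 141}
```

Keeping the classes the same size means plain accuracy is a fair metric later on, since no class dominates the predictions by sheer volume.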

Because the dataset is really small, data augmentation really helps: the model sees a different image (still the same image, but with some transformations applied) in every training loop.

I apply many transformations to the training set and only a few to the validation and test sets, to make the model more robust to different conditions in the image.
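The split between a random training pipeline and a light, deterministic evaluation pipeline can be sketched in plain Python. This is only an illustration of the idea on a tiny grayscale image stored as nested lists; the specific transforms (horizontal flip, brightness jitter), their ranges, and every function name here are assumptions, not the transformations actually used in the notebook.

```python
import random


def hflip(img):
    """Mirror a grayscale image (list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def adjust_brightness(img, factor):
    """Scale pixel values by `factor`, clamped to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def to_unit_range(img):
    """Rescale 0..255 pixel values to 0.0..1.0."""
    return [[p / 255 for p in row] for row in img]


def train_transform(img, rng):
    """Random augmentations: a different result on every call."""
    if rng.random() < 0.5:
        img = hflip(img)
    img = adjust_brightness(img, rng.uniform(0.8, 1.2))
    return to_unit_range(img)


def eval_transform(img):
    """Validation/test: only the deterministic rescaling step."""
    return to_unit_range(img)


rng = random.Random(0)
image = [[0, 128, 255], [10, 20, 30]]  # made-up 2x3 "x-ray"
for epoch in range(3):
    print(epoch, train_transform(image, rng))  # varies from epoch to epoch
print("eval", eval_transform(image))           # always the same
```

In practice an image library would do this work, but the structure is the same: randomness only on the training side, so validation and test scores stay comparable between runs.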
Model and Training
I used three different pre-trained model architectures, Resnet18, mobilenet_v2, and VGG16, fine-tuned each model on the training set, and compared the results.



I trained the models on 360 samples, with 41 samples for validation, for 20 epochs using the one cycle learning rate policy, then compared the results for each model.
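For readers unfamiliar with the one cycle policy: the learning rate ramps up from a small value to a peak early in training, then anneals down to a value far below where it started. The sketch below is a minimal cosine-annealed version of that shape. Only the 360 samples and 20 epochs come from the article; the batch size of 32, the peak learning rate, and the other defaults are assumptions, and `one_cycle_lr` is a hypothetical helper, not the scheduler used in the notebook.

```python
import math


def cos_anneal(start, end, pct):
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2 * (1 + math.cos(math.pi * pct))


def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` (0-based) under a one cycle schedule."""
    initial_lr = max_lr / div_factor          # where the warm-up begins
    min_lr = initial_lr / final_div_factor    # where the schedule ends
    warmup = int(pct_start * total_steps)
    if step < warmup:
        # Phase 1: ramp up from initial_lr to max_lr
        return cos_anneal(initial_lr, max_lr, step / max(1, warmup))
    # Phase 2: anneal down from max_lr to min_lr
    pct = (step - warmup) / max(1, total_steps - 1 - warmup)
    return cos_anneal(max_lr, min_lr, pct)


# 360 training samples and 20 epochs are from the article; batch size is assumed.
steps_per_epoch = math.ceil(360 / 32)
total_steps = steps_per_epoch * 20

for step in (0, total_steps // 4, total_steps // 2, total_steps - 1):
    print(step, f"{one_cycle_lr(step, total_steps):.2e}")
```

The high learning rate in the middle of training acts as a regularizer, and the long low-rate tail lets the model settle, which suits short training runs like this 20-epoch one.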



From the loss perspective, Resnet18 converges fastest and VGG16 slowest, all with the same number of epochs.



In terms of accuracy, Resnet18 is also slightly better: its validation and training accuracy track each other closely (a sign that it is able to generalize), while VGG16's validation accuracy is too volatile.
Inferences
After the models have been trained, it's time to test how each model makes inferences on a test set that contains 22 images.
Let's look at the confusion matrices.



Resnet18 and VGG16 make the same error: the pneumonia class predicted as the normal condition. mobilenet_v2 mispredicted 4 samples.
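As a reminder of how such a matrix is built, here is a minimal sketch: rows are the true class, columns are the predicted class, and every off-diagonal count is an error. The labels below are made up for illustration and are not the results from this article; the class names and the `confusion_matrix` helper are likewise assumptions.

```python
CLASSES = ["covid19", "normal", "pneumonia"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Count (true, predicted) pairs: rows = true class, columns = predicted."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


# Made-up labels, NOT the article's test set: one pneumonia case is
# misclassified as normal.
y_true = ["covid19", "covid19", "normal", "normal", "pneumonia", "pneumonia"]
y_pred = ["covid19", "covid19", "normal", "normal", "pneumonia", "normal"]

cm = confusion_matrix(y_true, y_pred)
for name, row in zip(CLASSES, cm):
    print(f"{name:>10}", row)

# Accuracy is the diagonal (correct predictions) over all samples.
accuracy = sum(cm[i][i] for i in range(len(CLASSES))) / len(y_true)
print(f"accuracy = {accuracy:.3f}")
```

In this toy example the single error shows up in the pneumonia row under the normal column, which is the same cell to look at when reading the matrices for the three models.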

In the image above, green marks a correct prediction and red a wrong one, and the true label is shown in parentheses.
Resnet18 is better than the other models. VGG16 has the same score, but it is a much heavier model (more parameters).
Conclusion
AI can be very useful for repetitive tasks such as image classification. In medical examinations, it can really help physicians and hospitals in rural areas, which have only a limited number of physicians available.
"It is here to liberate us from routine jobs, and it is here to remind us what it is that makes us human." - Kai-Fu Lee
Source Code
You can find the source code for this article in my Jovian notebook here. If you want to run it, just fork it and run it on Kaggle.