Introduction
In this blog, we broadly cover a business use case: using X-Rays for Covid Classification and Covid Stage Categorization. The background work for such an activity involves collecting X-Ray Data for both Normal & Covid Patients and then consulting a Subject Matter Expert (a Field Technician) or a doctor to understand the Regions of Interest or Observations for Analysis and / or Prediction. These Ground Truth Labels, or Annotations, are a pre-requisite for any Supervised Algorithm.
Objectives & Motivation
The goal of this particular Business use case is to develop a robust Classifier for identifying Covid patients, with a Recommendation for Hospitalization built on top of this Predictive Model.
The motivation behind using a Computer Vision (using OpenCV) driven approach for Diagnosis includes the following value additions:
❖ The technique acts as an aid to the existing and more widely accepted RT-PCR Technique.
❖ A quick Predictive Engine can help when the reliable technique (RT-PCR) takes more time.
❖ It offers a valuable Second Opinion to the Doctors.
❖ It makes use of a reliable input feed in the form of X-Rays, which have long been a proven testing procedure for pneumonia subjects.
❖ Computer Vision aided techniques can prove beneficial when human resources are stretched. Manual investigation and Diagnosis, even for Radiography, take time; an automated approach can not only expedite them but also process Radiography images on a large scale.
❖ Technology driven approaches can also be extended so that any Specialist or Primary Physician sitting at a remote place can have access and take a Decision during an Emergency.
Around COVID-19, various Research Teams have been working on, or have already developed, modules to address some of the rising concerns that the pandemic brought along with it. A few of these research works are shown in Figure I. We focus only on the Detection & Clustering part.
Solution Pipeline
The Block Diagram, or infographic, encapsulating the entire process flow pipeline is given in Figure II.
Accessing & Exploring of Data mainly covers interfacing the Python environment with various Data storage Repositories, and is outside the scope of this blog. Development of the Predictive Model, along with its Training & Tuning, can be covered in future blogs on request from readers. The Deployment / Integration of the Model, which spans Desktop or Mobile based Apps, Web based APIs, Deployment to Cloud as a SaaS (Software as a Service) application or even Deployment on Embedded platforms, is an extensive subject per se and needs to be covered separately and in an elaborate fashion, again on request from readers. Python gives you the environment for the entire end-to-end development. Each of these topics is highly specialised and hence requires focussed attention.
Our focus will be mainly on the Data Preparation side, namely Pre-processing & Feature Engineering, which needs more than 80% of the time invested in a thorough Model development. Once the Data feed is healthy, it can be used for the development of Robust Models. OpenCV offers an extensive collection of state-of-the-art algorithms which can be used for Data Pre-processing & Feature Engineering. OpenCV can also address one further step of the Analytics pipeline in this application: Image Segmentation, taken in a broad sense.
Implementation
Let’s begin with the Data Pre-processing step. X-Rays captured by machines may carry some noise, which degrades the image and lowers its overall quality. Doctors, with their clinical experience and knowledge, can ignore the noise factor. But if Scientists are to develop an automated solution for X-Ray Analysis & Processing, Noise Removal becomes one of the essential steps. The available Noise Removing Algorithms range from Linear Filters of the Gaussian type, through Non-Linear Filters of the Median category, to Adaptive Filtering Techniques like the Wiener Filter and the Kalman Filter; OpenCV directly provides the Gaussian and Median filters, among others. The choice of Filter can be made according to the degree of degradation, judged from characteristics such as the Image Histogram and the variance of the Image Intensity after applying the Laplacian Operator to the image. OpenCV has a function to compute the Image Histogram, which can then be plotted. Since Filtering smooths the image intensities, it may leave a blurring effect behind. To counter this effect, enhancement techniques like Histogram Stretching, Histogram Matching, Histogram Equalization or even Adaptive Histogram Equalization can be employed. OpenCV has specific functions for several of these algorithms, including Histogram Equalization and its adaptive, contrast-limited variant (CLAHE).
Once this step delivers the desired results, the medical image can be fed to the next step, Feature Extraction, whose output the rest of the process pipeline acts on. Image Features can be of two types: Global & Local. The aim of Feature Extraction is to identify a generic yet robust feature differentiator across images. Any Computer Vision Scientist would try to replicate a doctor’s intuition while mapping the various underlying algorithms to the Global or Local Features. Combining the best of the two worlds could be the way forward to automate the requisite doctor’s intuition. OpenCV has various Local Feature Extraction based Techniques like Speeded Up Robust Features (SURF, available through the opencv_contrib modules), Scale Invariant Feature Transform (SIFT), Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP). Under Global Feature Extraction, OpenCV covers Colour Histogram, Morphology (Shape Filters) and Colour based Semantic Segmentation. Another emerging technique, the Bag of Visual Features, uses a combination of these Local or Global Feature Extraction techniques, whose output can eventually be taken forward for the Classification Process using Machine Learning (using the scikit-learn library). OpenCV has function support for Bag of Visual Features as well.
One such result, classifying images into Normal & Covid, is given in Figure IV.
The final step of the algorithm pipeline is Image Segmentation, which is highly important in analysing how critical the condition of a COVID patient is. The degree of presence of the Pneumonic Manifestations helps the automated engine to classify the COVID Images and recommend whether Hospitalization is necessary. OpenCV supports various approaches to Image Segmentation. A few examples include the Gabor Filter, Entropy based Filtering and Colour Based Semantic Segmentation. Further to this, a score can be assigned as per the Colour Gradient of the Heat map indicating the affected area of the Lungs. This finally helps to bucket the Covid cases into their severity categories. One such COVID Severity Categorization result (Critical Category) is shown in Figure V.
In addition to the OpenCV Package, the task of Image Classification relied on a separate Library for Machine Learning Algorithms (the scikit-learn library), and highlighting the affected area for the Segmentation Algorithm required a Visualization Library (the Seaborn Library), which provides Heat maps. The key take-aways are:
❖ The entire implementation is possible in the Python environment.
❖ OpenCV participates in a couple of essential steps important for Classification & the eventual Clustering process for isolation of the Covid Patients.
❖ The requirement of Hospitalization for the COVID-19 Subjects is given in Table I.
Future Directions
An AI based screening process for Automated Detection of COVID-19 subjects can very well form a part of our Healthcare Informatics Systems in various Clinics / Hospitals / Nursing Homes and other Primary and / or Secondary point of care facilities, to predict and correctly identify the COVID-19 Subjects and, at the same time, isolate them and determine the necessity for Hospitalization. The inputs which can form the Feed may include Patients’ Symptoms (Raw Data as Text); Pathological Data like Leukocyte, Lymphocyte & Neutrophil counts, forming a Tabular Data set; Medical Vitals such as Temperature & Saturation percentage of Oxygen, again forming Tabular Data; and Radiography Images like X-Ray & CT-Scan, forming the Image Data Set. The Future Direction of Automated Medical Diagnosis (AI based) can take an Ensemble based Machine Learning form, collecting Prediction Results from multiple Machine Learning Frameworks and deciding through a Majority Voting Mechanism. Hospitals can even consider combining the power of IoT with this Machine Learning Technique, not only to detect but also to isolate the COVID-19 subjects following Standard Operating Procedures, and hence to stop the further spread of the disease. The activation of a Geo-Fence along with an alarm can be used to achieve this objective. A simple pictorial representation of the workflow pipeline is given below in Figure VI.
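The Majority Voting Mechanism mentioned above can be sketched in plain Python. The model names and their predictions below are hypothetical placeholders for the text, tabular and image models; the tie-breaking rule (first label seen wins) is likewise an assumption of this sketch.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; on a tie, the label that appeared first wins."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Hypothetical per-model outputs for a single patient.
model_predictions = {
    "symptoms_text_model": "covid",
    "blood_panel_model": "normal",
    "vitals_model": "covid",
    "xray_model": "covid",
}

final_label = majority_vote(list(model_predictions.values()))
print("Ensemble decision:", final_label)
```

A weighted vote, where more reliable inputs count for more, would be a natural refinement, but the weights would need to be validated on real data.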
Figure VI: COVID Multi-Stage Classification with Hospital Requirement Prediction and Isolation
This blog is meant to give you an idea of the power of Artificial Intelligence / Machine Learning Algorithms in a practical scenario. Moreover, when combined with IoT, the composite system can be leveraged to address some of the challenging objectives which either of the two technologies in isolation would fail to achieve. The Machine Learning Algorithm used above, the Fuzzy Inference System and the IoT enabled Geo-Fence Activation can be covered in future blogs on request from readers.
Name of Author: DEBAJIT SEN
E-mail id: sen.debajit@gmail.com
debajit_sen@outlook.com
Author Biography:
DEBAJIT SEN has close to 12 years’ experience in the fields of Robotic Process Automation, Artificial Intelligence & Data Science, Machine & Deep Learning, Signal & Image Processing and Computer Vision applications. He has implemented numerous projects in the Manufacturing, Security & Surveillance and Biomedical & Healthcare Domains. He currently works as a Data Scientist in a Healthcare Organization. He is presently working on the development of RPA based Intelligent Content Processing (OCR & NLP based) for Healthcare Insurance Analytics, encompassing automation of Claim Forms Processing, Electronic Health Record Processing, Prior-Authorization, Dispute and Appeals & Grievances. He has worked as an Independent Researcher / Consultant, collaborating on various R & D Projects with different Institutes. He has also published numerous papers in various National & International Journals and Conferences of repute, and has an International Innovation Patent Grant from the Australian Government (IP Australia) to his name for a novel work in the domain of medical science.