Wednesday, March 28, 2018

Xanadu Based Medical Big Data CBIR System for Automated Diseases Diagnosis

For some diseases, reviewing historical medical records in a database is sufficient for a quick and accurate diagnosis. Many of these records are images, such as images of affected body parts that indicate a disease or abnormality. Doctors can assess a new image by comparing it with similar images in the database. Moreover, because historical records are generally stored together with their diagnoses, the matched records make it easier for doctors to diagnose a new patient.

As the number of records grows, the database becomes more comprehensive, which can improve the accuracy of diagnoses based on it. However, handling such a large amount of data is a challenge: it is difficult to implement an architecture that archives so many records and still allows quick, low-cost retrieval of relevant records on demand.

Xanadu is a big data management technology. It provides a resilient, durable, scalable, and consistent distributed big data database, and it enables competitive big data management in the cloud or in the enterprise.
Xanadu’s high scalability makes it an ideal choice for the type of problem described above. Xanadu’s distributed hash means that thousands of images can be stored and retrieved efficiently on commodity hardware at a very low cost per GB. Xanadu’s query system can also be leveraged to retrieve images quickly even when the total number of images in the database grows into the millions, or even billions. For details regarding Xanadu, please see the following references.

A content-based image retrieval (CBIR) system searches a large database for images similar to a target image, based on the contents of the images (e.g. colors, shapes, textures). A common use case of CBIR in medical diagnosis is imaging that highlights small areas (lesions) in otherwise healthy tissue. Early breast cancer can be seen as small shadows on a mammogram (X-ray of the breast); PET scans highlight small areas of increased metabolic activity that can characterize cancerous growths; and retinal images show small bleeds (microaneurysms) that indicate eye disease as well as wider metabolic problems such as type II diabetes.
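The retrieval idea can be illustrated with a minimal sketch. It assumes grayscale images given as flat lists of 0..255 values, uses a coarse intensity histogram as the only content feature, and ranks a toy in-memory archive by Euclidean distance. The names gray_histogram, retrieve, and ARCHIVE, as well as the sample records and labels, are hypothetical and are not part of Xanadu or of the prototype described here.

```python
from math import sqrt


def gray_histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a flat list of 0..255 gray values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_pixels, database, k=2):
    """Return the k archive records whose histograms are closest to the query."""
    q = gray_histogram(query_pixels)
    ranked = sorted(
        database,
        key=lambda rec: l2_distance(q, gray_histogram(rec["pixels"])),
    )
    return ranked[:k]


# Hypothetical toy archive: each record keeps the image with its diagnosis.
ARCHIVE = [
    {"id": "img-dark", "diagnosis": "example label A", "pixels": [10, 20, 30, 40]},
    {"id": "img-bright", "diagnosis": "example label B", "pixels": [200, 220, 240, 250]},
    {"id": "img-mixed", "diagnosis": "example label C", "pixels": [10, 20, 220, 240]},
]

matches = retrieve([15, 25, 35, 45], ARCHIVE, k=2)
for rec in matches:
    print(rec["id"], rec["diagnosis"])
```

Because every archive record carries its diagnosis, the ranked matches double as a list of candidate diagnoses for the doctor to review. A real system would use richer features (color, shape, texture) and an index rather than a full scan.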

To demonstrate the performance and capability of a Xanadu-based medical big data CBIR system, a prototype CBIR system for retinal images was developed. In the case of the retina (the light-sensing surface at the rear of the eye), a simple non-invasive photograph of the eye is sufficient to determine whether a patient suffers from a range of diseases. Indeed, signs of other conditions such as diabetes, high blood pressure, and other circulatory disorders can be detected and assessed from a single retinal image.

The prototype CBIR system collects each retinal image together with its expert-reviewed diagnosis. The system then breaks each image into patches that are as small as possible while still being large enough to contain the typical lesions that can indicate disease. The system uses principal component analysis (PCA) to cluster the patches so that similar ones are naturally grouped together. To improve accuracy, the system uses machine learning techniques (e.g. random forests or deep neural networks). These techniques also make it possible to infer an overall diagnosis for a patch from its disease “score” and its “distance” (in image pixel terms) from known examples. The result is a detailed (pixel-by-pixel) report for doctors in which all areas of concern are highlighted, along with a detailed list of comparative images (showing the similarities) that can be viewed in the graphical user interface for final clinical assessment.
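The patch pipeline above can be sketched roughly as follows, under several assumptions: images are 2-D NumPy arrays, patches are non-overlapping squares, PCA is computed with a plain SVD, and a simple k-nearest-neighbor vote stands in for the random forest or deep neural network mentioned above. All function names and the synthetic "healthy" and "lesion" patches are hypothetical illustrations, not the prototype's actual code or data.

```python
import numpy as np


def extract_patches(image, size):
    """Split a 2-D image into non-overlapping size x size patches (flattened)."""
    h, w = image.shape
    patches, origins = [], []
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            patches.append(image[r:r + size, c:c + size].ravel())
            origins.append((r, c))
    return np.array(patches, dtype=float), origins


def fit_pca(X, n_components):
    """Return the mean and the top principal directions of the rows of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Project rows of X onto the principal directions."""
    return (X - mean) @ components.T


def score_patches(query, reference, labels, k=3):
    """Score = share of the k nearest labelled patches marked diseased;
    distance = mean distance to those k neighbours in PCA space."""
    d = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    scores = labels[nearest].mean(axis=1)
    distances = np.take_along_axis(d, nearest, axis=1).mean(axis=1)
    return scores, distances


# Synthetic labelled patches (assumption): flat tissue vs. a bright central spot.
rng = np.random.default_rng(0)
healthy = 0.5 + rng.normal(0.0, 0.01, (20, 16))
lesion_mask = np.zeros((4, 4))
lesion_mask[1:3, 1:3] = 0.5
diseased = 0.5 + rng.normal(0.0, 0.01, (20, 16)) + lesion_mask.ravel()
reference = np.vstack([healthy, diseased])
labels = np.array([0.0] * 20 + [1.0] * 20)

mean, components = fit_pca(reference, n_components=2)
reference_2d = project(reference, mean, components)

# New 8x8 image with one lesion in the bottom-right 4x4 patch.
image = np.full((8, 8), 0.5)
image[5:7, 5:7] += 0.5
patches, origins = extract_patches(image, size=4)
scores, distances = score_patches(project(patches, mean, components),
                                  reference_2d, labels)
for origin, s, dist in zip(origins, scores, distances):
    print(origin, round(float(s), 2), round(float(dist), 3))
```

Mapping each patch's score back to its origin gives the kind of per-region highlight described above, and the nearest labelled patches are the comparative examples a doctor could inspect.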

If you are interested in collaborating on medical big data CBIR applications that use a medical big data archive, please let me know: Alex G. Lee (
