There’s a lot of talk about patient-centered, personalized medicine these days, and rightly so: Treatment that is tailored to your genetics or your biochemistry or the molecular profile of your tumor is likely to be much more effective than a one-size-fits-all approach.

Determining the parameters for such treatment, however, requires a lot of data. Some medical events are so rare that individual clinicians may only see one or two cases over the course of their careers, and so data scientists must collect and analyze thousands of data sets from many different facilities.

Even then, their findings make no difference unless clinicians can access and apply the information – which is what Mohammad Adibuzzaman is helping to do through an open-access analysis project at the Regenstrief Center for Healthcare Engineering (RCHE) in Purdue’s Discovery Park.

Adibuzzaman, an assistant research scientist, joined RCHE two years ago after several data science positions, including a fellowship with the U.S. Food and Drug Administration through the Oak Ridge Institute for Science and Education. At RCHE, he has been collaborating with scientists at the Massachusetts Institute of Technology and a Boston-based startup called Paradigm4 to accelerate the translation of big data into healthcare decision-making.

The Laboratory of Computational Physiology at MIT, led by Roger Mark, Distinguished Professor of Health Sciences and Technology and of electrical engineering and computer science, has built an expansive, openly available database called MIMIC III, comprising de-identified health data on nearly 60,000 critical care patients captured from patient monitors and hospital medical information systems. Paradigm4, founded by Turing Award recipient and MIT professor Mike Stonebraker, has created SciDB, a powerful database management system for scientific and other applications. RCHE has launched such initiatives as REMEDI Central, which collects, analyzes and shares infusion pump data to improve patient safety, and CatalyzeCare, a virtual hub where healthcare professionals, researchers and educators can share resources and establish best practices.

Together, MIT, Paradigm4 and RCHE scientists have developed a user-friendly, open-access analysis tool that allows clinicians to easily analyze large sets of data for improved decision-making.

“In big data, you start with preprocess, then high-performance computing, then analysis/coding, then publication and maybe then translation,” explains Adibuzzaman, who has a Ph.D. in computational sciences from Marquette University. Effective translation of clinical data, he says, typically encounters several hurdles, including a lack of communication between data scientists and clinicians.

Using MIMIC III as a test bed for their research, Adibuzzaman and his collaborators set out to overcome these communication barriers through innovative design. After converting the compressed binary waveform database to CSV with Bash and Python scripts and then loading it into SciDB, they created a tool that provides high-level exploration and visualization of the MIMIC III database, integrates clinical and waveform data and performs complex analyses. The tool is also open-source, so any clinician or scientist can access and interact with the data after completing the data use agreement required for HIPAA compliance.
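To give a rough sense of what a binary-to-CSV conversion step can look like, here is a minimal Python sketch. It is not the team's actual script: the binary layout (little-endian 16-bit samples), the gain, the sampling rate and the names `decode_samples`, `samples_to_csv` and `ecg_mv` are all assumptions made for illustration, and the input bytes are synthetic rather than real patient data.

```python
import csv
import io
import struct


def decode_samples(raw: bytes, gain: float) -> list[float]:
    """Unpack little-endian 16-bit integers and scale them to physical units.

    The layout and gain are hypothetical, chosen only for this sketch.
    """
    count = len(raw) // 2
    ints = struct.unpack(f"<{count}h", raw[: count * 2])
    return [value / gain for value in ints]


def samples_to_csv(samples: list[float], fs: float, signal_name: str) -> str:
    """Render samples as CSV text, one row per time step at sampling rate fs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["time_s", signal_name])
    for index, value in enumerate(samples):
        writer.writerow([f"{index / fs:.3f}", f"{value:.3f}"])
    return buffer.getvalue()


# Synthetic demonstration bytes, not real patient data.
raw_bytes = struct.pack("<4h", 0, 100, 200, -100)
csv_text = samples_to_csv(
    decode_samples(raw_bytes, gain=200.0), fs=125.0, signal_name="ecg_mv"
)
print(csv_text)
```

Once waveforms are in a plain tabular form like this, they can be bulk-loaded into an array database and joined with clinical records.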

For example, Adibuzzaman and his collaborators note in their article, “Closing the data loop: An integrated open access analysis platform for the MIMIC database,” that in 2016 the FDA issued a safety announcement about the popular antidiarrheal drug loperamide (Imodium), citing concerns about serious heart problems among a sample of 48 patients. The MIMIC database contains more than 2,300 prescriptions of the drug, which clinicians could analyze for data such as vital signs, broken down by demographics or clinical condition.
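An analysis of that kind might be sketched in Python as follows. This is only an illustration: the rows are made-up toy values, and the field names (`patient_id`, `age_group`, `heart_rate`) and the function `mean_heart_rate_by_group` are hypothetical rather than the MIMIC III schema or the team's code.

```python
from collections import defaultdict
from statistics import mean

# Synthetic toy rows; field names are hypothetical, not the MIMIC III schema.
prescriptions = [
    {"patient_id": 1, "drug": "Loperamide"},
    {"patient_id": 2, "drug": "loperamide"},
    {"patient_id": 3, "drug": "aspirin"},
]
patients = {
    1: {"age_group": "18-64"},
    2: {"age_group": "65+"},
    3: {"age_group": "65+"},
}
vitals = [
    {"patient_id": 1, "heart_rate": 80},
    {"patient_id": 1, "heart_rate": 90},
    {"patient_id": 2, "heart_rate": 70},
    {"patient_id": 3, "heart_rate": 100},
]


def mean_heart_rate_by_group(drug: str) -> dict[str, float]:
    """Average heart rate per age group among patients prescribed the drug."""
    # Patients with at least one prescription of the drug (case-insensitive).
    treated = {
        row["patient_id"]
        for row in prescriptions
        if row["drug"].lower() == drug.lower()
    }
    # Collect heart-rate readings for treated patients, keyed by age group.
    readings = defaultdict(list)
    for row in vitals:
        if row["patient_id"] in treated:
            group = patients[row["patient_id"]]["age_group"]
            readings[group].append(row["heart_rate"])
    return {group: mean(values) for group, values in readings.items()}


result = mean_heart_rate_by_group("loperamide")
print(result)
```

The same pattern, filtering by prescription and then grouping a vital sign by a demographic field, would extend to other drugs or clinical conditions.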

“We shared all the code that has been used for this publication,” Adibuzzaman says. “Any researcher in any part of the world that wants to look at different things could then tweak the code and reproduce the entire publication work in a matter of days.”

The researchers presented findings from this beta project last fall; now, they are applying for grants from the National Institutes of Health to expand and replicate the system with the centralized data systems in Pennsylvania-based Geisinger Health. Ananth Grama, professor of computer science, is advising the team.

“The system enables sharing, but the database isn’t real-time – someone has to manually update it,” Adibuzzaman says. “The resource that we have is a small cluster. As it becomes bigger and bigger, there is a cost issue. We are addressing all of these in the grant proposals.”

Writer: Angie Roberts, senior writer/designer, Research Communications,