Regenstrief Center for Healthcare Engineering



Fatemeh Rouzbeh, Computer Science Ph.D. Student.

Data Hub

The proliferation of sensor technologies and advances in data collection methods have enabled the accumulation of very large amounts of data. Increasingly, these datasets are being considered for scientific research. However, designing a system architecture that achieves high performance in parallelization, query processing time, and aggregation of heterogeneous data types (e.g., time series, images, and structured data), while also making scientific research easier to reproduce, remains a major challenge. This is especially true for health sciences research, where systems must be: i) easy to use, with the flexibility to manipulate data at the most granular level, ii) agnostic of programming language kernel, iii) scalable, and iv) compliant with the HIPAA privacy law.

To meet this challenge, RCHE research scientist Mohammad Adibuzzaman, RCHE faculty affiliate Ananth Grama, and RCHE-funded computer science Ph.D. student Fatemeh Rouzbeh (shown here) developed and implemented a novel architecture for a software-hardware-data ecosystem over the past year, using open-source technologies in a distributed environment. The platform consists of four layers: storage, computation, operation, and application. The storage layer handles the data and indexes. The computation layer is responsible for distributed computation. The operation layer provides a programming language interface for developing reusable components that analyze and process the data. Finally, the application layer offers users multiple ways to interact with the system. The system supports several types of data sources, including images, structured data (such as electronic health records and claims), waveform data (such as from ECGs or smartwatches), and clinical notes.
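To make the division of responsibilities among the four layers concrete, the following is a minimal, single-machine sketch in Python. It is not the RCHE implementation: every class, method, and data-type name here (StorageLayer, ComputationLayer, OperationLayer, ApplicationLayer, "waveform", and so on) is hypothetical, and a local thread pool stands in for the distributed computation the article describes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class StorageLayer:
    """Hypothetical storage layer: holds records plus an index by data type."""
    records: Dict[str, Any] = field(default_factory=dict)
    index: Dict[str, List[str]] = field(default_factory=dict)

    def put(self, record_id: str, data_type: str, payload: Any) -> None:
        self.records[record_id] = payload
        self.index.setdefault(data_type, []).append(record_id)

    def get_by_type(self, data_type: str) -> List[Any]:
        # Use the index to find records of one type without scanning everything.
        return [self.records[rid] for rid in self.index.get(data_type, [])]


class ComputationLayer:
    """Hypothetical computation layer: a thread pool stands in for a cluster."""

    def __init__(self, workers: int = 4) -> None:
        self.workers = workers

    def parallel_map(self, fn: Callable[[Any], Any], items: List[Any]) -> List[Any]:
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # pool.map returns results in the same order as the inputs.
            return list(pool.map(fn, items))


class OperationLayer:
    """Hypothetical operation layer: a registry of reusable analysis components."""

    def __init__(self, storage: StorageLayer, compute: ComputationLayer) -> None:
        self.storage = storage
        self.compute = compute
        self.components: Dict[str, Callable[[Any], Any]] = {}

    def register(self, name: str, fn: Callable[[Any], Any]) -> None:
        self.components[name] = fn

    def run(self, name: str, data_type: str) -> List[Any]:
        items = self.storage.get_by_type(data_type)
        return self.compute.parallel_map(self.components[name], items)


class ApplicationLayer:
    """Hypothetical application layer: the entry point a user would call."""

    def __init__(self, operations: OperationLayer) -> None:
        self.operations = operations

    def analyze(self, component: str, data_type: str) -> List[Any]:
        return self.operations.run(component, data_type)


def build_demo_platform() -> ApplicationLayer:
    """Wire the four layers together with a few made-up records."""
    storage = StorageLayer()
    storage.put("w1", "waveform", [60, 62, 64])
    storage.put("w2", "waveform", [70, 80])
    storage.put("n1", "clinical_note", "patient stable")

    operations = OperationLayer(storage, ComputationLayer(workers=2))
    operations.register("mean", lambda xs: sum(xs) / len(xs))
    return ApplicationLayer(operations)


app = build_demo_platform()
waveform_means = app.analyze("mean", "waveform")
```

In this sketch each layer depends only on the one beneath it, so a component registered in the operation layer can be reused on any data type without knowing how records are stored or how the work is parallelized. A real deployment would presumably replace the in-memory dictionary and thread pool with distributed storage and a cluster compute engine.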