3 Trillion Values in AtriumDB

Today we hit a milestone with our database of high-frequency physiological signals. AtriumDB now contains over three trillion rows!

This coincides with the publication of our paper describing the database, "A practical approach to storage and retrieval of high-frequency physiological signals". The abstract follows.

Objective: Storage of physiological waveform data for retrospective analysis presents significant challenges. Resultant data can be very large, and therefore becomes expensive to store and complicated to manage. Traditional database approaches are not appropriate for large scale storage of physiological waveforms. Our goal was to apply modern time series compression and indexing techniques to the problem of physiological waveform storage and retrieval.

Approach: We deployed a vendor-agnostic data collection system and developed domain-specific compression approaches that allowed long term storage of physiological waveform data and other associated clinical and medical device data. The database (called AtriumDB) also facilitates rapid retrieval of retrospective data for high-performance computing and machine learning applications.
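To give a feel for why waveform data compresses well without loss, here is a minimal sketch of one generic technique: delta encoding followed by a general-purpose compressor. This is not the AtriumDB codec, which this post does not describe. The function names, the 32-bit integer sample format, and the synthetic sine-wave signal are all assumptions made for illustration.

```python
import math
import struct
import zlib


def delta_encode(samples):
    """Return the first sample followed by successive differences."""
    if not samples:
        return []
    deltas = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by taking a running sum."""
    samples = []
    total = 0
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples


def compress(samples):
    """Delta-encode integer samples, pack as little-endian int32, then deflate."""
    deltas = delta_encode(samples)
    raw = struct.pack(f"<{len(deltas)}i", *deltas)
    return zlib.compress(raw, 9)


def decompress(blob):
    """Exact inverse of compress: inflate, unpack, then undo the deltas."""
    raw = zlib.decompress(blob)
    count = len(raw) // 4  # 4 bytes per int32
    return delta_decode(list(struct.unpack(f"<{count}i", raw)))


# Synthetic stand-in for a waveform (hypothetical, not real patient data).
signal = [int(1000 * math.sin(2 * math.pi * i / 250)) for i in range(5000)]
blob = compress(signal)
restored = decompress(blob)

print("lossless:", restored == signal)
print("raw bytes:", 4 * len(signal), "compressed bytes:", len(blob))
```

Neighbouring samples in a smooth signal differ by small amounts, so the deltas cluster near zero and a general-purpose compressor can represent them compactly. A domain-specific scheme can exploit more structure than this sketch does.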

Main results: A prototype system has been recording data in a 42-bed pediatric critical care unit at The Hospital for Sick Children in Toronto, Ontario since February 2016. As of December 2019, the database contains over 720,000 patient-hours of data collected from over 5300 patients, all with complete waveform capture. One year of full resolution physiological waveform storage from this 42-bed unit can be losslessly compressed and stored in less than 300 GB of disk space. Retrospective data can be delivered to analytical applications at a rate of up to 50 million time-value pairs per second.
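The storage and throughput figures above can be turned into per-bed numbers with some simple arithmetic. The sketch below uses the quoted values as bounds (300 GB is an upper bound and 50 million pairs per second is a peak rate). The 500 Hz sample rate and the decimal convention of 1 GB = 1000 MB are assumptions for illustration and do not come from the paper.

```python
# Figures quoted in the abstract.
BEDS = 42
STORAGE_PER_YEAR_GB = 300               # upper bound ("less than 300 GB")
DELIVERY_RATE_PAIRS_PER_S = 50_000_000  # peak rate ("up to")

# Assumptions for illustration only, not taken from the paper.
MB_PER_GB = 1000               # decimal units
DAYS_PER_YEAR = 365
ASSUMED_SAMPLE_RATE_HZ = 500   # hypothetical single waveform signal
SECONDS_PER_DAY = 24 * 60 * 60

# Storage per bed, as an upper bound.
gb_per_bed_year = STORAGE_PER_YEAR_GB / BEDS
mb_per_bed_day = gb_per_bed_year * MB_PER_GB / DAYS_PER_YEAR

# Time to deliver one day of the assumed signal at the peak rate.
samples_per_day = ASSUMED_SAMPLE_RATE_HZ * SECONDS_PER_DAY
seconds_to_deliver_day = samples_per_day / DELIVERY_RATE_PAIRS_PER_S

print(f"at most {gb_per_bed_year:.1f} GB per bed per year")
print(f"at most {mb_per_bed_day:.1f} MB per bed per day")
print(f"{samples_per_day:,} samples per day, delivered in {seconds_to_deliver_day:.3f} s at peak")
```

Under these assumptions, each bed needs at most about 7.1 GB per year (roughly 19.6 MB per day), and a full day of one 500 Hz signal could be delivered in under a second at the peak rate.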

Significance: Stored data are not pre-processed or filtered. Having access to a large retrospective dataset with realistic artefacts lends itself to the process of anomaly discovery and understanding. Retrospective data can be replayed to simulate a realistic streaming data environment where analytical tools can be rapidly tested at scale.
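The idea of replaying retrospective data as a simulated stream can be sketched as a generator that emits stored time-value pairs with their original spacing, optionally sped up. This is a hypothetical illustration and not the AtriumDB API: the function name `replay`, its parameters, and the record format of (timestamp in seconds, value) are all assumptions.

```python
import time


def replay(records, speedup=1.0, sleep=time.sleep):
    """Yield (timestamp, value) pairs, pausing between them to mimic a live feed.

    records: iterable of (timestamp_seconds, value), sorted by timestamp.
    speedup: values above 1.0 replay faster than real time.
    sleep:   injectable delay function, so tests can avoid real waiting.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    previous_ts = None
    for ts, value in records:
        if previous_ts is not None:
            gap = (ts - previous_ts) / speedup
            if gap > 0:
                sleep(gap)
        previous_ts = ts
        yield ts, value


# Demo with a recording sleep function, so nothing actually blocks.
stored = [(0.0, 72), (0.5, 74), (1.5, 71)]  # made-up values
delays = []
emitted = list(replay(stored, speedup=10, sleep=delays.append))

print("emitted:", emitted)
print("delays that would have been slept:", delays)
```

Passing the delay function as a parameter keeps the replay logic testable. In a real test harness the default `time.sleep` would pace the stream, and an analytical tool would consume the generator as if it were a live feed.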

Read the paper online at