Data Curation for Preclinical and Clinical Multimodal Imaging Studies

Grace Gyamfuah Yamoah, Liji Cao, Chao Wu Wu, Freek J. Beekman, Bert Vandeghinste, Julia G. Mannheim, Stefanie Rosenhain, Kevin Leonardic, Fabian Kiessling, Felix Gremse*

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review



Purpose: In biomedical research, imaging modalities help discover pathological mechanisms and support the development and evaluation of novel diagnostic and theranostic approaches. However, while standards for data storage exist in the clinical medical imaging field, data curation standards for biomedical research are yet to be established. This work aimed to develop a free, secure file format for multimodal imaging studies that supports common in vivo imaging modalities with up to five dimensions, as a step towards establishing data curation standards for biomedical research.
Procedures: Images are compressed using a lossless compression algorithm. Cryptographic hashes are computed on the compressed image slices. Compression and hashing run in parallel, so the speedup depends on the number of available cores. The hashed images are then written to file together with digitally signed timestamps. Fields in the structure, compressed slices, hashes, and timestamps are serialized for writing to and reading from files. The C++ implementation is well documented, has been tested on multimodal data from six imaging sites, and is integrated into a preclinical image analysis software package.
Results: The format has been tested with several imaging modalities, including fluorescence molecular tomography/x-ray computed tomography (CT), positron emission tomography (PET)/CT, single-photon emission computed tomography/CT, and PET/magnetic resonance imaging. To assess performance, we measured the compression ratio, the compression rate, and the time spent in compression. Additionally, the time and rate of writing to and reading from a network drive were measured. Our findings demonstrate a reduction in storage space of close to 50 % for μCT data. Parallelization speeds up the hash computations by a factor of 4. We achieve a compression rate of 137 MB/s for a file of size 354 MB.
Conclusions: The development of this file format is a step towards abstracting and curating, in a standardized way, the common processes involved in preclinical and clinical multimodal imaging studies. This work also defines a better interface between multimodal imaging modalities and analysis software.
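
The compress-then-hash procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' C++ implementation. The abstract does not name the compression algorithm or the hash function, so zlib (DEFLATE) and SHA-256 are assumptions, as are the function names `compress_and_hash`, `process_volume`, and `verify_slice`. Digitally signed timestamps and serialization are left out.

```python
import hashlib
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_and_hash(slice_bytes: bytes) -> tuple[bytes, str]:
    """Losslessly compress one image slice, then hash the compressed bytes."""
    compressed = zlib.compress(slice_bytes, 6)  # assumed codec: zlib/DEFLATE
    digest = hashlib.sha256(compressed).hexdigest()  # assumed hash: SHA-256
    return compressed, digest


def process_volume(slices: list[bytes], workers: int = 4) -> list[tuple[bytes, str]]:
    """Process slices in parallel; results keep the input slice order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_and_hash, slices))


def verify_slice(compressed: bytes, digest: str) -> bytes:
    """Re-check the hash before decompressing; raise if the slice was altered."""
    if hashlib.sha256(compressed).hexdigest() != digest:
        raise ValueError("hash mismatch: slice data was modified or corrupted")
    return zlib.decompress(compressed)


if __name__ == "__main__":
    # Synthetic stand-in for image data: 64 slices of 256 x 256 16-bit voxels.
    slices = [bytes([i % 256]) * (256 * 256 * 2) for i in range(64)]
    start = time.perf_counter()
    results = process_volume(slices)
    elapsed = time.perf_counter() - start
    raw = sum(len(s) for s in slices)
    packed = sum(len(c) for c, _ in results)
    print(f"compression ratio: {raw / packed:.1f}")
    print(f"compression rate: {raw / elapsed / 1e6:.1f} MB/s")
```

Because each slice is compressed and hashed independently, the work splits naturally across cores, and a single altered slice can be detected without touching the rest of the volume. The ratio and rate printed here come from synthetic, highly repetitive data and are not comparable to the figures reported in the abstract.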

Original language: English
Pages (from-to): 1034-1043
Number of pages: 10
Journal: Molecular Imaging and Biology
Issue number: 6
Publication status: Published - 2019


  • Compression
  • Credibility
  • Cryptographic hashing
  • Data curation
  • File format
  • Metadata
  • Multimodal imaging
  • Reproducibility
  • Serialization
  • Timestamp

