TY - JOUR
T1 - Data Curation for Preclinical and Clinical Multimodal Imaging Studies
AU - Yamoah, Grace Gyamfuah
AU - Cao, Liji
AU - Wu, Chao Wu
AU - Beekman, Freek J.
AU - Vandeghinste, Bert
AU - Mannheim, Julia G.
AU - Rosenhain, Stefanie
AU - Leonardic, Kevin
AU - Kiessling, Fabian
AU - Gremse, Felix
PY - 2019
Y1 - 2019
N2 - Purpose: In biomedical research, imaging modalities help discover pathological mechanisms to develop and evaluate novel diagnostic and theranostic approaches. However, while standards for data storage in the clinical medical imaging field exist, data curation standards for biomedical research are yet to be established. This work aimed at developing a free secure file format for multimodal imaging studies, supporting common in vivo imaging modalities up to five dimensions as a step towards establishing data curation standards for biomedical research. Procedures: Images are compressed using lossless compression algorithm. Cryptographic hashes are computed on the compressed image slices. The hashes and compressions are computed in parallel, speeding up computations depending on the number of available cores. Then, the hashed images with digitally signed timestamps are cryptographically written to file. Fields in the structure, compressed slices, hashes, and timestamps are serialized for writing and reading from files. The C++ implementation is tested on multimodal data from six imaging sites, well-documented, and integrated into a preclinical image analysis software. Results: The format has been tested with several imaging modalities including fluorescence molecular tomography/x-ray computed tomography (CT), positron emission tomography (PET)/CT, single-photon emission computed tomography/CT, and PET/magnetic resonance imaging. To assess performance, we measured the compression rate, ratio, and time spent in compression. Additionally, the time and rate of writing and reading on a network drive were measured. Our findings demonstrate that we achieve close to 50 % reduction in storage space for μCT data. The parallelization speeds up the hash computations by a factor of 4. We achieve a compression rate of 137 MB/s for file of size 354 MB. Conclusions: The development of this file format is a step to abstract and curate common processes involved in preclinical and clinical multimodal imaging studies in a standardized way. This work also defines better interface between multimodal imaging modalities and analysis software.
AB - Purpose: In biomedical research, imaging modalities help discover pathological mechanisms to develop and evaluate novel diagnostic and theranostic approaches. However, while standards for data storage in the clinical medical imaging field exist, data curation standards for biomedical research are yet to be established. This work aimed at developing a free secure file format for multimodal imaging studies, supporting common in vivo imaging modalities up to five dimensions as a step towards establishing data curation standards for biomedical research. Procedures: Images are compressed using lossless compression algorithm. Cryptographic hashes are computed on the compressed image slices. The hashes and compressions are computed in parallel, speeding up computations depending on the number of available cores. Then, the hashed images with digitally signed timestamps are cryptographically written to file. Fields in the structure, compressed slices, hashes, and timestamps are serialized for writing and reading from files. The C++ implementation is tested on multimodal data from six imaging sites, well-documented, and integrated into a preclinical image analysis software. Results: The format has been tested with several imaging modalities including fluorescence molecular tomography/x-ray computed tomography (CT), positron emission tomography (PET)/CT, single-photon emission computed tomography/CT, and PET/magnetic resonance imaging. To assess performance, we measured the compression rate, ratio, and time spent in compression. Additionally, the time and rate of writing and reading on a network drive were measured. Our findings demonstrate that we achieve close to 50 % reduction in storage space for μCT data. The parallelization speeds up the hash computations by a factor of 4. We achieve a compression rate of 137 MB/s for file of size 354 MB. Conclusions: The development of this file format is a step to abstract and curate common processes involved in preclinical and clinical multimodal imaging studies in a standardized way. This work also defines better interface between multimodal imaging modalities and analysis software.
KW - Compression
KW - Credibility
KW - Cryptographic hashing
KW - Data curation
KW - File format
KW - Metadata
KW - Multimodal imaging
KW - Reproducibility
KW - Serialization
KW - Timestamp
UR - http://www.scopus.com/inward/record.url?scp=85062955476&partnerID=8YFLogxK
U2 - 10.1007/s11307-019-01339-0
DO - 10.1007/s11307-019-01339-0
M3 - Article
AN - SCOPUS:85062955476
SN - 1536-1632
VL - 21
SP - 1034
EP - 1043
JO - Molecular Imaging and Biology
JF - Molecular Imaging and Biology
IS - 6
ER -