TY - JOUR
T1 - 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints
AU - Bizzarri, D.
AU - Reinders, M. J.T.
AU - Beekman, M.
AU - Slagboom, P. E.
AU - van den Akker, E. B.
PY - 2022
Y1 - 2022
N2 - Background: Missing or incomplete phenotypic information can severely deteriorate the statistical power in epidemiological studies. High-throughput quantification of small molecules in bio-samples, i.e. ‘metabolomics’, is steadily gaining popularity, as it is highly informative for various phenotypical characteristics. Here we aim to leverage metabolomics to impute missing data in clinical variables routinely assessed in large epidemiological and clinical studies. Methods: To this end, we have employed ∼26,000 1H-NMR metabolomics samples from 28 Dutch cohorts collected within the BBMRI-NL consortium, to create 19 metabolomics-based predictors for clinical variables, including diabetes status (AUC, 5-fold CV = 0·94) and lipid medication usage (AUC, 5-fold CV = 0·90). Findings: Subsequent application in independent cohorts confirmed that our metabolomics-based predictors can indeed be used to impute a wide array of missing clinical variables from a single metabolomics data resource. In addition, application highlighted the potential use of our predictors to explore the effects of totally unobserved confounders in omics association studies. Finally, we show that our predictors can be used to explore risk factor profiles contributing to mortality in older participants. Interpretation: To conclude, we provide 1H-NMR metabolomics-based models to impute clinical variables routinely assessed in epidemiological studies and illustrate their merit in scenarios when phenotypic variables are partially incomplete or totally unobserved. Funding: BBMRI-NL, X-omics, VOILA, Medical Delta and the Dutch Research Council (NWO-VENI).
AB - Background: Missing or incomplete phenotypic information can severely deteriorate the statistical power in epidemiological studies. High-throughput quantification of small molecules in bio-samples, i.e. ‘metabolomics’, is steadily gaining popularity, as it is highly informative for various phenotypical characteristics. Here we aim to leverage metabolomics to impute missing data in clinical variables routinely assessed in large epidemiological and clinical studies. Methods: To this end, we have employed ∼26,000 1H-NMR metabolomics samples from 28 Dutch cohorts collected within the BBMRI-NL consortium, to create 19 metabolomics-based predictors for clinical variables, including diabetes status (AUC, 5-fold CV = 0·94) and lipid medication usage (AUC, 5-fold CV = 0·90). Findings: Subsequent application in independent cohorts confirmed that our metabolomics-based predictors can indeed be used to impute a wide array of missing clinical variables from a single metabolomics data resource. In addition, application highlighted the potential use of our predictors to explore the effects of totally unobserved confounders in omics association studies. Finally, we show that our predictors can be used to explore risk factor profiles contributing to mortality in older participants. Interpretation: To conclude, we provide 1H-NMR metabolomics-based models to impute clinical variables routinely assessed in epidemiological studies and illustrate their merit in scenarios when phenotypic variables are partially incomplete or totally unobserved. Funding: BBMRI-NL, X-omics, VOILA, Medical Delta and the Dutch Research Council (NWO-VENI).
KW - 1H-NMR metabolomics
KW - Association studies
KW - Epidemiology
KW - Missing values
KW - Regression models
KW - Surrogate clinical variables
UR - http://www.scopus.com/inward/record.url?scp=85121464298&partnerID=8YFLogxK
U2 - 10.1016/j.ebiom.2021.103764
DO - 10.1016/j.ebiom.2021.103764
M3 - Article
AN - SCOPUS:85121464298
VL - 75
JO - EBioMedicine
JF - EBioMedicine
SN - 2352-3964
M1 - 103764
ER -