TY - JOUR
T1 - Semi-supervised empirical Bayes group-regularized factor regression
AU - Münch, Magnus M.
AU - van de Wiel, Mark A.
AU - van der Vaart, Aad W.
AU - Peeters, Carel F.W.
PY - 2022
Y1 - 2022
N2 - The features in a high-dimensional biomedical prediction problem are often well described by low-dimensional latent variables (or factors). We use this to include unlabeled features and additional information on the features when building a prediction model. Such additional feature information is often available in biomedical applications. Examples are annotation of genes, metabolites, or p-values from a previous study. We employ a Bayesian factor regression model that jointly models the features and the outcome using Gaussian latent variables. We fit the model using a computationally efficient variational Bayes method, which scales to high dimensions. We use the extra information to set up a prior model for the features in terms of hyperparameters, which are then estimated through empirical Bayes. The method is demonstrated in simulations and two applications. One application considers influenza vaccine efficacy prediction based on microarray data. The second application predicts oral cancer metastasis from RNAseq data.
KW - empirical Bayes
KW - factor regression
KW - high-dimensional data
KW - semisupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85132354283&partnerID=8YFLogxK
U2 - 10.1002/bimj.202100105
DO - 10.1002/bimj.202100105
M3 - Article
AN - SCOPUS:85132354283
SN - 0323-3847
VL - 64
SP - 1289
EP - 1306
JO - Biometrical Journal
JF - Biometrical Journal
IS - 7
ER -