TY - JOUR
T1 - Vine Regression with Bayes Nets
T2 - A Critical Comparison with Traditional Approaches Based on a Case Study on the Effects of Breastfeeding on IQ
AU - Cooke, Roger M.
AU - Joe, Harry
AU - Chang, Bo
PY - 2021
Y1 - 2021
N2 - Regular vine (R-vine) copulas build high-dimensional joint densities from arbitrary one-dimensional margins and (conditional) bivariate copula densities. Vine densities enable the computation of all conditional distributions, though the calculations can be numerically intensive. Saturated continuous nonparametric Bayes nets (CNPBN) are regular vines. Computing regression functions from the vine copula density is termed vine regression. The epicycles of regression (including/excluding covariates, interactions, higher-order terms, multicollinearity, model fit, transformations, heteroscedasticity, bias) are dispelled. One simply computes the regressions from the vine copula density. Only the question of finding an adequate vine copula remains. Vine regression is applied to a data set from the National Longitudinal Study of Youth relating breastfeeding to IQ. The expected effects of breastfeeding on IQ depend on IQ, on the baseline level of breastfeeding, on the duration of additional breastfeeding, and on the values of other covariates. A child given two weeks of breastfeeding can expect to increase his/her IQ by 1.5–2 IQ points by adding 10 weeks of breastfeeding, depending on the values of other covariates. A child given two years of breastfeeding can expect to gain 0.48–0.65 IQ points from 10 additional weeks. Adding 10 weeks of breastfeeding to each of the 3,179 children in this data set has a net present value of $50,700,000 according to the Bayes net, compared to $29,000,000 according to the linear regression.
AB - Regular vine (R-vine) copulas build high-dimensional joint densities from arbitrary one-dimensional margins and (conditional) bivariate copula densities. Vine densities enable the computation of all conditional distributions, though the calculations can be numerically intensive. Saturated continuous nonparametric Bayes nets (CNPBN) are regular vines. Computing regression functions from the vine copula density is termed vine regression. The epicycles of regression (including/excluding covariates, interactions, higher-order terms, multicollinearity, model fit, transformations, heteroscedasticity, bias) are dispelled. One simply computes the regressions from the vine copula density. Only the question of finding an adequate vine copula remains. Vine regression is applied to a data set from the National Longitudinal Study of Youth relating breastfeeding to IQ. The expected effects of breastfeeding on IQ depend on IQ, on the baseline level of breastfeeding, on the duration of additional breastfeeding, and on the values of other covariates. A child given two weeks of breastfeeding can expect to increase his/her IQ by 1.5–2 IQ points by adding 10 weeks of breastfeeding, depending on the values of other covariates. A child given two years of breastfeeding can expect to gain 0.48–0.65 IQ points from 10 additional weeks. Adding 10 weeks of breastfeeding to each of the 3,179 children in this data set has a net present value of $50,700,000 according to the Bayes net, compared to $29,000,000 according to the linear regression.
KW - Bayes net
KW - breastfeeding
KW - copula
KW - Gaussian copula
KW - heteroscedasticity
KW - IQ
KW - multivariate regression
KW - National Longitudinal Study of Youth
KW - regression heuristics
KW - regular vine
KW - vine copula
UR - http://www.scopus.com/inward/record.url?scp=85100831907&partnerID=8YFLogxK
U2 - 10.1111/risa.13695
DO - 10.1111/risa.13695
M3 - Article
AN - SCOPUS:85100831907
JO - Risk Analysis: an international journal
JF - Risk Analysis: an international journal
SN - 0272-4332
ER -