TY - JOUR
T1 - Tiny, Always-on, and Fragile
T2 - Bias Propagation through Design Choices in On-device Machine Learning Workflows
AU - Hutiri, Wiebke (Toussaint)
AU - Ding, Aaron Yi
AU - Kawsar, Fahim
AU - Mathur, Akhil
PY - 2023
Y1 - 2023
N2 - Billions of distributed, heterogeneous, and resource-constrained IoT devices deploy on-device machine learning (ML) for private, fast, and offline inference on personal data. On-device ML is highly context-dependent and sensitive to user, usage, hardware, and environment attributes. This sensitivity and the propensity toward bias in ML make it important to study bias in on-device settings. Our study is one of the first investigations of bias in this emerging domain and lays important foundations for building fairer on-device ML. We apply a software engineering lens, investigating the propagation of bias through design choices in on-device ML workflows. We first identify reliability bias as a source of unfairness and propose a measure to quantify it. We then conduct empirical experiments for a keyword spotting task to show how complex and interacting technical design choices amplify and propagate reliability bias. Our results validate that design choices made during model training, like the sample rate and input feature type, and choices made to optimize models, like lightweight architectures, the pruning learning rate, and pruning sparsity, can result in disparate predictive performance across male and female groups. Based on our findings, we suggest low-effort strategies for engineers to mitigate bias in on-device ML.
KW - audio keyword spotting
KW - bias
KW - design choices
KW - embedded machine learning
KW - fairness
KW - on-device machine learning
KW - personal data
UR - http://www.scopus.com/inward/record.url?scp=85167808648&partnerID=8YFLogxK
U2 - 10.1145/3591867
DO - 10.1145/3591867
M3 - Article
AN - SCOPUS:85167808648
SN - 1049-331X
VL - 32
JO - ACM Transactions on Software Engineering and Methodology
JF - ACM Transactions on Software Engineering and Methodology
IS - 6
M1 - 155
ER -