TY - GEN
T1 - What Should You Know? A Human-In-the-Loop Approach to Unknown Unknowns Characterization in Image Recognition
AU - Sharifi Noorian, Shahin
AU - Qiu, Sihang
AU - Gadiraju, Ujwal
AU - Yang, Jie
AU - Bozzon, Alessandro
PY - 2022
Y1 - 2022
N2 - Unknown unknowns represent a major challenge in reliable image recognition. Existing methods mainly focus on unknown unknowns identification, leveraging human intelligence to gather images that are potentially difficult for the machine. To drive a deeper understanding of unknown unknowns and more effective identification and treatment, this paper focuses on unknown unknowns characterization. We introduce a human-in-the-loop, semantic analysis framework for characterizing unknown unknowns at scale. We engage humans in two tasks that specify what a machine should know and describe what it really knows, respectively, both at the conceptual level, supported by information extraction and machine learning interpretability methods. Data partitioning and sampling techniques are employed to scale out human contributions in handling large data. Through extensive experimentation on scene recognition tasks, we show that our approach provides a rich, descriptive characterization of unknown unknowns and allows for more effective and cost-efficient detection than the state of the art.
KW - humans in the loop
KW - semantic analysis
KW - unknown unknowns
UR - http://www.scopus.com/inward/record.url?scp=85129806172&partnerID=8YFLogxK
U2 - 10.1145/3485447.3512040
DO - 10.1145/3485447.3512040
M3 - Conference contribution
AN - SCOPUS:85129806172
T3 - WWW 2022 - Proceedings of the ACM Web Conference 2022
SP - 882
EP - 892
BT - WWW 2022 - Proceedings of the ACM Web Conference 2022
PB - Association for Computing Machinery (ACM)
T2 - 31st ACM World Wide Web Conference, WWW 2022
Y2 - 25 April 2022 through 29 April 2022
ER -