Abstract
Visualisations drive all aspects of the Machine Learning (ML) development cycle but remain a vastly untapped resource for the research community. ML testing is a highly interactive and cognitive process that demands a human-in-the-loop approach. Besides writing tests for the code base, the bulk of the evaluation requires applying domain expertise to generate and interpret visualisations. To gain deeper insight into the process of testing ML systems, we propose to study visualisations of ML pipelines by mining Jupyter notebooks. We propose a two-pronged approach to the analysis: first, gather general insights and trends through a qualitative study of a smaller sample of notebooks; then, use the knowledge gained from that qualitative study to design an empirical study on a larger sample of notebooks. Computational notebooks provide a rich source of information in three formats: text, code and images. We hope to utilise existing work in image analysis, and in Natural Language Processing for text and code, to analyse the information present in notebooks. We hope to gain a new perspective on program comprehension and debugging in the context of ML testing.
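As an illustration of the three formats mentioned in the abstract, the following minimal sketch separates a notebook into its text, code and image parts. It is not the authors' tooling: it assumes the standard nbformat v4 JSON layout (`cells`, `cell_type`, `source`, `outputs`, `data`), and the function names `split_notebook` and `load_notebook` are hypothetical.

```python
import base64
import json


def _as_text(value):
    """Notebook fields may be a string or a list of strings; return one string."""
    return "".join(value) if isinstance(value, list) else value


def split_notebook(nb):
    """Split a parsed notebook (nbformat v4 layout assumed) into text, code, images.

    Returns three lists: markdown sources, code sources, and decoded image bytes.
    """
    text, code, images = [], [], []
    for cell in nb.get("cells", []):
        source = _as_text(cell.get("source", ""))
        cell_type = cell.get("cell_type")
        if cell_type == "markdown":
            text.append(source)
        elif cell_type == "code":
            code.append(source)
            # Rendered plots are stored as base64 strings keyed by MIME type.
            for output in cell.get("outputs", []):
                for mime, payload in output.get("data", {}).items():
                    if mime in ("image/png", "image/jpeg"):
                        images.append(base64.b64decode(_as_text(payload)))
    return text, code, images


def load_notebook(path):
    """Read an .ipynb file from disk as a plain dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A call such as `split_notebook(load_notebook("example.ipynb"))`, where the file name is a placeholder, would yield the three streams that image analysis and NLP techniques could then process separately.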
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE/ACM 2nd International Conference on AI Engineering - Software Engineering for AI, CAIN 2023 |
Publisher | IEEE |
Pages | 117-118 |
Number of pages | 2 |
ISBN (Electronic) | 9798350301137 |
Publication status | Published - 2023 |
Event | 2nd IEEE/ACM International Conference on AI Engineering - Software Engineering for AI, CAIN 2023 - Melbourne, Australia. Duration: 15 May 2023 → 16 May 2023 |
Publication series
Name | Proceedings - 2023 IEEE/ACM 2nd International Conference on AI Engineering - Software Engineering for AI, CAIN 2023 |
---|
Conference
Conference | 2nd IEEE/ACM International Conference on AI Engineering - Software Engineering for AI, CAIN 2023 |
---|---|
Country/Territory | Australia |
City | Melbourne |
Period | 15/05/23 → 16/05/23 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- AI Engineering
- Computational Notebooks
- Data Mining
- Image Analysis
- Machine Learning Testing
- Natural Language Processing
- NLP for Code