Questions for Data Scientists in Software Engineering: A Replication

Hennie Huijgens, Ayushi Rastogi, Ernst Mulders, Georgios Gousios, Arie van Deursen

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

5 Citations (Scopus)
101 Downloads (Pure)


In 2014, a Microsoft study investigated the sort of questions that data science applied to software engineering should answer. This resulted in 145 questions that developers considered relevant for data scientists to answer, thus providing a research agenda to the community. Five years later, no further study has investigated whether the questions from the software engineers at Microsoft hold for other software companies, including software-intensive companies with a different primary focus (which we refer to as software-defined enterprises). Furthermore, it is not evident that the problems identified five years ago are still applicable, given the technological advances in software engineering. This paper presents a study at ING, a software-defined enterprise in banking in which over 15,000 IT staff provide in-house software solutions. It offers a comprehensive guide of questions for data scientists, selected from the previous study at Microsoft along with our current work at ING. We replicated the original Microsoft study at ING, looking for questions that impact both software companies and software-defined enterprises and that continue to impact software engineering. We also add new questions that emerged from differences in the context of the two companies and from the five-year gap between the studies. Our results show that software engineering questions for data scientists in the software-defined enterprise are largely similar to those in the software company, albeit with exceptions. We hope that the software engineering research community builds on the new list of questions to create a useful body of knowledge.
Original language: English
Title of host publication: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Editors: Prem Devanbu, Myra Cohen, Thomas Zimmermann
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 12
ISBN (Electronic): 9781450370431
ISBN (Print): 978-1-4503-7043-1
Publication status: Published - 2020

Publication series

Name: ESEC/FSE 2020
Publisher: Association for Computing Machinery


  • Data Science
  • Software Analytics
  • Software Engineering

