A dataset for pull-based development research

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

48 Citations (Scopus)

Abstract

Pull requests form a new method for collaborating in distributed software development. To study the pull request distributed development model, we constructed a dataset of almost 900 projects and 350,000 pull requests, including some of the largest users of pull requests on GitHub. In this paper, we describe how the project selection was done, we analyze the selected features and present a machine learning tool set for the R statistics environment.

Original language: English
Title of host publication: 11th Working Conference on Mining Software Repositories, MSR 2014 - Proceedings
Publisher: ACM
Pages: 368-371
Number of pages: 4
ISBN (Electronic): 9781450328630
Publication status: Published - 31 May 2014
Event: 11th International Working Conference on Mining Software Repositories, MSR 2014 - Hyderabad, India
Duration: 31 May 2014 to 1 Jun 2014

Publication series

Name: 11th Working Conference on Mining Software Repositories, MSR 2014 - Proceedings

Conference

Conference: 11th International Working Conference on Mining Software Repositories, MSR 2014
Country/Territory: India
City: Hyderabad
Period: 31/05/14 to 1/06/14

Keywords

  • Distributed software development
  • Empirical software engineering
  • Pull request
  • Pull-based development
