How to analyze git repositories with command line tools: We're not in Kansas anymore

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

1 Citation (Scopus)


Git repositories are an important source of empirical software engineering product and process data. Running the Git command-line tool and processing its output with other Unix tools allows the incremental construction of sophisticated data processing pipelines. Git data analytics on the command line can be systematically presented through a pattern that involves fetching, selection, processing, summarization, and reporting. For each part of the processing pipeline, we examine the tools and techniques that can be most effectively used to perform the task at hand. The presented techniques can be easily applied, first to get a feel for the version control repository data at hand and then to extract empirical results.
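The five-stage pattern named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `select_authors`, `count_and_rank`, and `report_top` are hypothetical, and counting commits per author email is only one possible question to ask of a repository. Fetching is assumed to have been done already with `git clone`.

```shell
#!/bin/sh
# Illustrative sketch only; function names are hypothetical, not from the paper.

# Selection: print one author email per commit.
# Must be run inside an already fetched (cloned) repository.
select_authors() {
  git log --format='%ae'
}

# Processing and summarization: group identical lines, count them,
# and rank the groups by descending count.
count_and_rank() {
  sort | uniq -c | sort -rn
}

# Reporting: keep the top N rows (default 10) and print "email<TAB>count".
# Assumes the emails contain no whitespace.
report_top() {
  head -n "${1:-10}" | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Example use, from inside a cloned repository:
#   select_authors | count_and_rank | report_top 5
```

Keeping each stage as a separate filter means any one of them can be swapped out, for example replacing the selection step with a different `git log` format, without touching the rest of the pipeline.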

Original language: English
Title of host publication: Proceedings of the 40th International Conference on Software Engineering, ICSE '18
Subtitle of host publication: Companion Proceedings
Place of Publication: New York, NY
Publisher: Association for Computing Machinery (ACM)
Number of pages: 2
Volume: Part F137351
ISBN (Electronic): 978-1-4503-5663-3
Publication status: Published - 2018
Event: ICSE 2018: 40th International Conference on Software Engineering - Gothenburg, Sweden
Duration: 27 May 2018 to 3 Jun 2018
Conference number: 40


Conference: ICSE 2018


  • Command-line tools
  • Data analytics
  • empirical software engineering
  • Git
  • Pipes and filters


