A repository of Unix history and evolution

Research output: Contribution to journal › Article › Scientific › peer-review

10 Citations (Scopus)


The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1972 as a five thousand line kernel, to 2016 as a widely-used 27 million line system. The 1.1 GB repository contains 496 thousand commits and 2,523 branch merges. The repository employs the commonly used Git version control system for its storage, and is hosted on the popular GitHub archive. It has been created by synthesizing, with custom software, 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team; two legacy repositories; and the modern repository of the open source FreeBSD system. In total, 973 individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology.

Original language: English
Pages (from-to): 1372-1404
Number of pages: 33
Journal: Empirical Software Engineering
Issue number: 3
Publication status: Published - 1 Jun 2017
Externally published: Yes


  • Configuration management
  • Git
  • Software archeology
  • Unix
