Analyzing Linux on a Supercomputer

Diomidis Spinellis*

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

The C preprocessor, a key element of the language, has become a liability due to its lack of integration with modern language semantics. This column describes an analysis of C preprocessor usage in the Linux kernel, comprising 20 million lines of code, using the CScout refactoring browser. Processing limitations led to a solution leveraging a supercomputer’s parallel processing capabilities. The analysis divided the kernel’s source files across 32 supercomputer nodes and implemented a binary tournament database merging strategy. Initial efforts revealed multiple difficulties, and resolving them took several false starts with recursive SQL statements, an SQLite extension, and the GraphViz connected components tool. After a number of redesigns guided by stress-testing, the analysis finished in just 32 hours rather than a week, using 374 CPU hours and 640 GiB RAM on the supercomputer’s nodes.
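
As a rough illustration of the binary tournament merging idea mentioned in the abstract, the following minimal sketch pairs up partial results round by round until one remains. It is not the column's implementation: the function names `tournament_rounds` and `tournament_merge`, the pairing order, and the use of plain Python sets as stand-ins for per-node databases are all assumptions made for this example.

```python
def tournament_rounds(n_parts):
    """Return a merge schedule: a list of rounds, each a list of (dst, src) index pairs.

    Hypothetical helper: in every round, partition `src` is merged into
    partition `dst`, halving the number of live partitions.
    """
    rounds = []
    stride = 1
    while stride < n_parts:
        pairs = [
            (i, i + stride)
            for i in range(0, n_parts, 2 * stride)
            if i + stride < n_parts
        ]
        rounds.append(pairs)
        stride *= 2
    return rounds


def tournament_merge(parts, merge):
    """Merge a non-empty list of partial results using the schedule above.

    `merge` is any two-argument function combining two partial results.
    The pairs within one round are independent of each other, so they
    could run in parallel; here they run sequentially for simplicity.
    """
    parts = list(parts)
    for pairs in tournament_rounds(len(parts)):
        for dst, src in pairs:
            parts[dst] = merge(parts[dst], parts[src])
    return parts[0]


# Stand-in data: 32 one-element sets play the role of per-node databases.
schedule = tournament_rounds(32)
merged = tournament_merge([{i} for i in range(32)], set.union)
```

With 32 partitions this schedule has five rounds of 16, 8, 4, 2, and 1 merges, so the longest chain of dependent merges grows with the logarithm of the partition count rather than linearly.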

Original language: English
Pages (from-to): 18-23
Number of pages: 6
Journal: IEEE Software
Volume: 42
Issue number: 2
Publication status: Published - 2025

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care

Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
