TY - GEN
T1 - Scalable Call Graph Constructor for Maven
AU - Keshani, Mehdi
PY - 2021
Y1 - 2021
N2 - As a rich source of data, Call Graphs are used for various applications including security vulnerability detection. Despite multiple studies showing that Call Graphs can drastically improve the accuracy of analysis, existing ecosystem-scale tools like Dependabot do not use Call Graphs and work at the package-level. Using Call Graphs in ecosystem use cases is not practical because of the scalability problems that Call Graph generators have. Call Graph generation is usually considered to be a 'full program analysis' resulting in large Call Graphs and expensive computation. To make an analysis applicable to ecosystem scale, this pragmatic approach does not work, because the number of possible combinations of how a particular artifact can be combined in a full program explodes. Therefore, it is necessary to make the analysis incremental. There are existing studies on different types of incremental program analysis. However, none of them focuses on Call Graph generation for an entire ecosystem. In this paper, we propose an incremental implementation of the CHA algorithm that can generate Call Graphs on-demand, by stitching together partial Call Graphs that have been extracted for libraries before. Our preliminary evaluation results show that the proposed approach scales well and outperforms the most scalable existing framework called OPAL.
AB - As a rich source of data, Call Graphs are used for various applications including security vulnerability detection. Despite multiple studies showing that Call Graphs can drastically improve the accuracy of analysis, existing ecosystem-scale tools like Dependabot do not use Call Graphs and work at the package-level. Using Call Graphs in ecosystem use cases is not practical because of the scalability problems that Call Graph generators have. Call Graph generation is usually considered to be a 'full program analysis' resulting in large Call Graphs and expensive computation. To make an analysis applicable to ecosystem scale, this pragmatic approach does not work, because the number of possible combinations of how a particular artifact can be combined in a full program explodes. Therefore, it is necessary to make the analysis incremental. There are existing studies on different types of incremental program analysis. However, none of them focuses on Call Graph generation for an entire ecosystem. In this paper, we propose an incremental implementation of the CHA algorithm that can generate Call Graphs on-demand, by stitching together partial Call Graphs that have been extracted for libraries before. Our preliminary evaluation results show that the proposed approach scales well and outperforms the most scalable existing framework called OPAL.
KW - Logic and verification
KW - Program analysis
KW - Theory of computation
UR - http://www.scopus.com/inward/record.url?scp=85115705648&partnerID=8YFLogxK
U2 - 10.1109/ICSE-Companion52605.2021.00046
DO - 10.1109/ICSE-Companion52605.2021.00046
M3 - Conference contribution
AN - SCOPUS:85115705648
SN - 978-1-6654-1219-3
T3 - 2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2021)
SP - 99
EP - 101
BT - Proceedings - 2021 IEEE/ACM 43rd International Conference on Software Engineering
A2 - O'Conner, L.
PB - IEEE
CY - Piscataway
T2 - 43rd IEEE/ACM International Conference on Software Engineering: Companion, ICSE-Companion 2021
Y2 - 25 May 2021 through 28 May 2021
ER -