Context: Replication studies and experiments form an important foundation in advancing scientific research. While their prevalence in Software Engineering is increasing, there is still more to be done. Objective: This article aims to extend our previous replication study on search-based test generation techniques by performing a large-scale empirical comparison with further state-of-the-art techniques. Method: We designed a comprehensive experimental study involving six techniques, a benchmark composed of 180 non-trivial Java classes, and a total of 21,600 independent executions. Metrics regarding the effectiveness and efficiency of the techniques were collected and analyzed by means of statistical methods. Results: Our empirical study shows that single-target approaches are generally outperformed by multi-target approaches. Among the multi-target approaches, DynaMOSA/MOSA, which are based on many-objective optimization, outperform the others, in particular for complex classes. Conclusion: The results obtained from our large-scale empirical investigation confirm what has been reported in previous studies, while also highlighting striking differences and novel observations. Future studies, on different benchmarks and considering additional techniques, could further reinforce and extend our findings.
- Test case generation
- Search-Based Software Testing
- Large-scale evaluation
- Evolutionary computation