Abstract
Dynamic languages, such as Python and JavaScript, trade static typing for developer flexibility and productivity. Lack of static typing can cause run-time exceptions and is a major factor in weak IDE support. To alleviate these issues, PEP 484 introduced optional type annotations for Python. As retrofitting types to existing codebases is error-prone and laborious, machine learning (ML)-based approaches have been proposed to enable automatic type inference based on existing, partially annotated codebases. However, previous ML-based approaches are trained and evaluated on human-provided type annotations, which might not always be sound, and hence this may limit the practicality for real-world usage. In this paper, we present TYPE4Py, a deep similarity learning-based hierarchical neural network model. It learns to discriminate between similar and dissimilar types in a high-dimensional space, which results in clusters of types. Likely types for arguments, variables, and return values can then be inferred through nearest neighbor search. Unlike previous work, we trained and evaluated our model on a type-checked dataset and used mean reciprocal rank (MRR) to reflect the performance perceived by users. The obtained results show that TYPE4Py achieves an MRR of 77.1%, which is a substantial improvement of 8.1% and 16.7% over the state-of-the-art approaches Typilus and Typewriter, respectively. Finally, to aid developers with retrofitting types, we released a Visual Studio Code extension, which uses TYPE4Py to provide ML-based type auto-completion for Python.
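
The two ideas the abstract relies on, ranking candidate types by nearest neighbor search over type clusters and scoring those rankings with MRR, can be illustrated with a toy sketch. This is not the model or code from the paper: the 2-D vectors are hand-made stand-ins for learned embeddings, the distance-weighted voting rule is an assumption, and the names `predict_types` and `mean_reciprocal_rank` are hypothetical.

```python
import math
from collections import defaultdict


def predict_types(query, labeled_points, k=3):
    """Rank candidate types for `query` using its k nearest labeled points.

    Assumption: each neighbor votes for its type with weight 1/distance.
    """
    nearest = sorted(labeled_points, key=lambda p: math.dist(query, p[0]))[:k]
    scores = defaultdict(float)
    for vector, type_name in nearest:
        scores[type_name] += 1.0 / (math.dist(query, vector) + 1e-9)
    # Highest accumulated score first.
    return sorted(scores, key=scores.get, reverse=True)


def mean_reciprocal_rank(ranked_predictions, true_types):
    """Average of 1/rank of the correct type; 0 if it is not in the ranking."""
    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_types):
        if truth in ranking:
            total += 1.0 / (ranking.index(truth) + 1)
    return total / len(true_types)


# Hand-made 2-D "embeddings" (hypothetical data, for illustration only).
labeled = [
    ((0.0, 0.0), "int"),
    ((0.1, 0.0), "int"),
    ((1.0, 1.0), "str"),
    ((1.1, 0.9), "str"),
    ((0.0, 1.0), "List[int]"),
]
queries = [(0.05, 0.0), (0.9, 1.0), (0.3, 0.8)]
truths = ["int", "str", "str"]

rankings = [predict_types(q, labeled) for q in queries]
mrr = mean_reciprocal_rank(rankings, truths)
print(rankings)
print(f"MRR = {mrr:.3f}")
```

In this toy data the third query has its correct type ranked second, so it contributes 1/2 rather than 1 to the average. That is the sense in which MRR reflects what a user sees in a ranked completion list: a correct suggestion lower in the list still counts, but for less.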
Original language | English |
---|---|
Title of host publication | Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2022 |
Publisher | IEEE |
Pages | 2241-2252 |
Number of pages | 12 |
ISBN (Electronic) | 978-1-4503-9221-1 |
Publication status | Published - 2022 |
Event | 44th ACM/IEEE International Conference on Software Engineering, ICSE 2022: Software Engineering in Practice (ICSE-SEIP), Pittsburgh, United States; Duration: 22 May 2022 → 27 May 2022; Conference number: 44th |
Publication series
Name | Proceedings - International Conference on Software Engineering |
---|---|
Volume | 2022-May |
ISSN (Print) | 0270-5257 |
Conference
Conference | 44th ACM/IEEE International Conference on Software Engineering, ICSE 2022 |
---|---|
Abbreviated title | ICSE 2022 |
Country/Territory | United States |
City | Pittsburgh |
Period | 22/05/22 → 27/05/22 |
Keywords
- Machine Learning
- Mean Reciprocal Rank
- Python
- Similarity Learning
- Type Inference