Exploring name-based bug detection in Python
Abstract
Names of source code elements provide useful contextual information about the code and
development tasks. Prior studies leverage the similarity between the names of arguments
and method parameters to detect bugs that are caused by accidentally swapping arguments
while calling methods. This requires establishing the mapping between method calls and
their definitions. However, establishing this mapping is challenging in practice (e.g.,
when external libraries are missing). This thesis aims to
understand the performance of name-based argument-related bug detection techniques in
Python, a popular general-purpose, dynamically typed programming language.
To this end, this thesis conducts a study that first investigates the similarity
between arguments and their method parameters in Python code. The study then
establishes the mapping of method calls to their definitions and evaluates the
performance of existing name-based techniques in detecting swapped-argument bugs
in Python. Finally, a technique has been developed that uses argument usage patterns
and expression types in source code with name-based similarity matching to improve the
performance of detecting argument-related bugs. Evaluation of the proposed technique
with a large collection of open-source Python projects shows that the technique can detect
argument-related bugs with high accuracy even when the method definitions are missing.
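As a rough illustration of the name-based idea described above, the following minimal sketch flags a two-argument call when the argument names match the parameter names better in swapped order than as written. It is not the technique developed in this thesis: the function names `name_similarity` and `looks_swapped`, the `margin` threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] between two identifier names.

    Assumption: character-level similarity from difflib stands in for
    whatever lexical similarity measure a real detector would use.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def looks_swapped(args, params, margin=0.3):
    """Flag a two-argument call whose names fit the parameters better when swapped.

    args   -- names of the two arguments at the call site (hypothetical input)
    params -- names of the two parameters in the method definition
    margin -- illustrative threshold; not taken from the thesis
    """
    (a1, a2), (p1, p2) = args, params
    as_written = name_similarity(a1, p1) + name_similarity(a2, p2)
    if_swapped = name_similarity(a1, p2) + name_similarity(a2, p1)
    return if_swapped - as_written > margin


# Definition: def resize(width, height); call site: resize(height, width)
print(looks_swapped(("height", "width"), ("width", "height")))
# Same definition, arguments passed in the declared order
print(looks_swapped(("width", "height"), ("width", "height")))
```

Note that this sketch needs the parameter names from the method definition, which is exactly the information that is unavailable when the call cannot be mapped to its definition.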
One potential solution to prevent argument-related bugs from occurring is to use code
completion. An argument recommendation system suggests method arguments as a developer
types the code. Thus, the second part of the thesis focuses on completing arguments of
method calls. In particular, this thesis investigates the efficacy of large language models in
recommending arguments for API (Application Programming Interface) method calls.