Python 3 Types in the Wild: A Tale of Two Type Systems
Thu 19 Nov 2020 01:20 - 01:40 at SPLASH-III - 3 Chair(s): Michael Homer, Francesco Ranzato
Python 3 is a highly dynamic language, but it has introduced a syntax
for expressing types with PEP 484. This paper explores how
developers use these type annotations, the type system semantics
provided by type checking and inference tools, and the performance
of these tools. We evaluate the types and tools on a corpus of public
GitHub repositories. We review MyPy and PyType, two canonical static
type checking and inference tools, and their distinct
approaches to type analysis. We then address
three research questions:
(i) How often and in what ways do developers use Python 3 types?
(ii) Which type errors do developers make?
(iii) How do type errors from different tools compare?
Surprisingly, when developers use static types, the code rarely
type-checks with either of the tools. MyPy and PyType exhibit false
positives, due to their static nature, but also flag many useful
errors in our corpus. Lastly, MyPy and PyType embody two distinct type
systems, flagging different errors in many cases.
Understanding the usage of Python types can help guide tool-builders
and researchers. Understanding the performance of popular tools can
help increase the adoption of static types and tools by practitioners,
ultimately leading to more correct and more robust Python code.
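
As a small illustration of how the two tools can disagree, the sketch below uses PEP 484 annotations in a function that runs correctly but is judged differently by the two checkers. It is adapted from an example in the pytype project's documentation, not taken from the paper's corpus, and the exact diagnostics may vary by tool version.

```python
from typing import List


def get_list() -> List[str]:
    # A strict checker such as MyPy infers List[str] for `lst` at this
    # assignment and is expected to reject the int appended below.
    lst = ["PyCon"]
    # PyType is documented as lenient: it allows operations that succeed
    # at runtime and do not contradict an explicit annotation.
    lst.append(2019)
    # Every element is converted to str, so the declared return type holds.
    return [str(x) for x in lst]


if __name__ == "__main__":
    print(get_list())
```

The function never fails at runtime, so whether the flagged line counts as a useful warning or a false positive depends on which type system the reader has in mind.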