Typed and Confused: The Unexpected Dangers of Gradual Typing
In recent years, scripting languages such as JavaScript and Python have gained a lot of traction due to their flexibility, allowing developers to write concise code in a short amount of time. However, this flexibility is achieved via weak, dynamic typing, which fails to catch subtle bugs that a compiler would prevent in a statically typed language. Gradual type systems like TypeScript emerged as a solution that combines the best of both worlds, allowing developers to annotate arbitrary amounts of their codebase with optional type hints. Nonetheless, most practical deployments of such systems are unsound. That is, they sacrifice type safety for performance by limiting themselves to static checks and not performing any residual runtime checks to enforce the type hints uniformly. This is a missed automation opportunity that puts the burden on users, who still need to perform explicit type checks at transition points between untyped and typed code to guarantee that values at runtime obey the type hints. Failure to do so can result in subtle bugs caused by type inconsistencies and, when user input is involved, can render input validation mechanisms ineffective, resulting in type confusion problems. In this work, we study the relation between gradual typing and type confusion. Our main hypothesis is that the type hints in the code can mislead developers into thinking that they are enforced consistently by the compiler, resulting in a lack of explicit runtime checks that ensure type safety. We perform a large empirical study with 30,000 open-source repositories containing JavaScript, TypeScript and Python code. We statically analyze if and how they use gradual typing and to what extent this influences the presence of explicit type checks. We find that gradual typing is widely but not extensively used: many projects feature gradually typed code, but usually only in small portions of the codebase.
This implies that there are many points in the codebase where developers need to add explicit type checks, i.e., at the transition points between unannotated and annotated code. Our results further indicate that gradual typing may have a deteriorating effect on type checking practices, in particular when primitive values are involved. Finally, we manually analyze a small portion of the studied repositories and show that attackers can cause type confusion in popular open-source web applications and, thus, violate the type hints added by developers. We hope that our results help raise awareness about the limits of current gradual type systems and their unwanted effect on input validation.
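
To illustrate the kind of transition point discussed above, the following is a minimal Python sketch, not taken from any of the studied repositories. The function names, the JSON payload, and the validation rule are all hypothetical and chosen only for illustration. It shows how a value arriving from untyped code (here, parsed JSON) can violate a `str` type hint and slip past a validator that relies on that hint, and how an explicit runtime check closes the gap.

```python
import json


def is_safe_filename(name: str) -> bool:
    # Hypothetical validator. The `str` hint is only checked statically;
    # nothing enforces it when the function is called at runtime.
    return ".." not in name


def is_safe_filename_checked(name: object) -> bool:
    # Same hypothetical rule, with an explicit runtime type check at the
    # boundary between untyped input and typed code.
    if not isinstance(name, str):
        return False
    return ".." not in name


# Untyped boundary: the attacker controls the JSON body and sends a list
# where the developer expected a string (hypothetical payload).
payload = json.loads('{"file": ["../etc/passwd"]}')
user_value = payload["file"]

# For a list, `in` tests element equality rather than substring containment,
# so the unchecked validator accepts the value.
unchecked_result = is_safe_filename(user_value)
checked_result = is_safe_filename_checked(user_value)

print(unchecked_result, checked_result)
```

In this sketch the unchecked validator accepts the list because no element is exactly equal to `".."`, while the checked variant rejects it before the substring test is applied.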