An Empirical Study of Data Constraint Implementations in Java
Software systems are designed according to guidelines and constraints defined by business rules. Some of these constraints define the allowable or required values for data handled by the systems. These data constraints usually originate from the problem domain (e.g., regulations), and developers must write code that enforces them. Understanding how data constraints are implemented is essential for testing, debugging, and software change. Unfortunately, there are no widely accepted guidelines or best practices on how to implement data constraints.
This paper presents an empirical study that investigates how data constraints are implemented in Java. We study the implementation of 187 data constraints extracted from the documentation of eight real-world Java software systems. First, we perform a qualitative analysis of the textual description of data constraints and identify four data constraint types. Second, we manually identify the implementations of these data constraints and reveal that they can be grouped into 31 implementation patterns. The analysis of these implementation patterns indicates that developers prefer a handful of patterns when implementing data constraints. We also find evidence suggesting that deviations from these patterns are associated with unusual implementation decisions or code smells. Third, we develop a tool-assisted protocol that allows us to identify 256 additional trace links for the data constraints implemented using the 13 most common patterns. We find that almost half of these data constraints have multiple enforcing statements, which are code clones of different types. Finally, a study with 16 professional developers indicates that the patterns we describe can be easily and accurately recognized in Java code.
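To make the terminology concrete, the following is a minimal sketch of what a data constraint and its enforcing statement might look like in Java. It is illustrative only: the class `PortSetting`, the constraint text, and the range bounds are hypothetical and are not taken from the studied systems or from the paper's pattern catalog.

```java
// Hypothetical example; not drawn from any of the eight studied systems.
final class PortSetting {
    // Assumed documentation text (the data constraint):
    // "The port must be between 1 and 65535."
    static final int MIN_PORT = 1;
    static final int MAX_PORT = 65535;

    private final int port;

    PortSetting(int port) {
        // Enforcing statement: the code that checks the documented constraint.
        if (port < MIN_PORT || port > MAX_PORT) {
            throw new IllegalArgumentException(
                "port must be between " + MIN_PORT + " and " + MAX_PORT + ", got " + port);
        }
        this.port = port;
    }

    int port() {
        return port;
    }
}
```

A trace link in this sketch would connect the documentation sentence to the `if` statement in the constructor. If the same range check were repeated elsewhere (for example, in a setter or a configuration parser), the constraint would have multiple enforcing statements of the kind the abstract describes.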
Thu 18 May (displayed time zone: Hobart)
13:45 - 15:15 | Programming languages | DEMO - Demonstrations / Technical Track / Journal-First Papers / SEET - Software Engineering Education and Training | Meeting Room 103 | Chair(s): Jean-Guy Schneider, Monash University
13:45 | 15m | Talk | Demystifying Issues, Challenges, and Solutions for Multilingual Software Development | Technical Track | Haoran Yang, Washington State University; Weile Lian, Washington State University; Shaowei Wang, University of Manitoba; Haipeng Cai, Washington State University
14:00 | 15m | Talk | Testability Refactoring in Pull Requests: Patterns and Trends | Technical Track
14:15 | 15m | Talk | Usability-Oriented Design of Liquid Types for Java | Technical Track | Catarina Gamboa, CMU and LASIGE; Paulo Canelas, Carnegie Mellon University; Christopher Steven Timperley, Carnegie Mellon University; Alcides Fonseca, University of Lisbon
14:30 | 15m | Talk | A Theorem Proving Approach to Programming Language Semantics | SEET - Software Engineering Education and Training | Subhajit Roy, IIT Kanpur
14:45 | 7m | Talk | RIdiom: Automatically Refactoring Non-idiomatic Python Code with Pythonic Idioms | DEMO - Demonstrations | Zejun Zhang, Australian National University; Zhenchang Xing, CSIRO’s Data61 and Australian National University; Xiwei (Sherry) Xu, CSIRO’s Data61; Liming Zhu, CSIRO’s Data61
14:52 | 7m | Talk | An Empirical Study of Data Constraint Implementations in Java | Journal-First Papers | Juan Manuel Florez, CQSE America; Laura Moreno, CQSE America; Zenong Zhang, The University of Texas at Dallas; Shiyi Wei, University of Texas at Dallas; Andrian Marcus, University of Texas at Dallas
14:59 | 7m | Talk | Learning To Predict User-Defined Types | Journal-First Papers | Kevin Jesse, University of California at Davis, USA; Prem Devanbu, University of California at Davis; Anand Ashok Sawant, University of California, Davis