To speed up spreadsheet development, end users often create a new table by copying and modifying an existing one. The two tables share similar computational semantics and form a table clone. End users may then modify the tables in a table clone, e.g., by adding rows or deleting columns, which introduces structure changes into the clone. Our empirical study on real-world spreadsheets shows that about 58.5% of table clones involve structure changes. However, existing table clone detection approaches for spreadsheets can only detect clones whose tables have the same structure, so many table clones with structure changes go undetected.
We observe that, although the tables in a table clone may be modified, they usually still share similar structures and formats, e.g., headers, formulas, and background colors. Based on this observation, we propose LTC (Learning to detect Table Clones), which automatically detects table clones with or without structure changes. LTC uses the structure and format information of labeled table clones and non-clones to train a binary classifier. It first identifies the tables in spreadsheets, and then applies the trained classifier to every pair of tables to judge whether the pair forms a table clone. Our experiments on real-world spreadsheets from the EUSES and Enron corpora show that LTC achieves a precision of 97.8% and a recall of 92.1% in table clone detection, significantly outperforming the state-of-the-art technique (a precision of 37.5% and a recall of 11.1%).
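The pairwise classification step described above can be illustrated with a minimal sketch. It is not LTC's actual implementation: the table representation (sets of headers, formula names, and background colors), the Jaccard-based features, the tiny logistic-regression trainer, and the hand-written training vectors are all assumptions made for illustration only.

```python
import math
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty (assumption)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def pair_features(t1, t2):
    """Hypothetical structure/format features for a pair of tables."""
    return [
        jaccard(t1["headers"], t2["headers"]),
        jaccard(t1["formulas"], t2["formulas"]),
        jaccard(t1["colors"], t2["colors"]),
    ]

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Plain stochastic gradient descent for logistic regression."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            grad = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b

def is_clone(model, features):
    """Binary decision: clone if the predicted probability is at least 0.5."""
    w, b = model
    z = b + sum(wj * xj for wj, xj in zip(w, features))
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5

def detect_clones(tables, model):
    """Judge every pair of identified tables with the trained classifier."""
    return [
        (i, j)
        for (i, t1), (j, t2) in combinations(enumerate(tables), 2)
        if is_clone(model, pair_features(t1, t2))
    ]

# Made-up labeled feature vectors: 1 = table clone, 0 = not a clone.
X_train = [[1.0, 1.0, 1.0], [0.8, 1.0, 1.0], [0.6, 1.0, 0.5],
           [0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [0.0, 0.0, 0.5]]
y_train = [1, 1, 1, 0, 0, 0]
model = train_logistic(X_train, y_train)

# Table 1 is a copy of table 0 with an added "Q3" column (a structure change).
tables = [
    {"headers": {"Item", "Q1", "Q2", "Total"}, "formulas": {"SUM", "AVERAGE"}, "colors": {"gray"}},
    {"headers": {"Item", "Q1", "Q2", "Q3", "Total"}, "formulas": {"SUM", "AVERAGE"}, "colors": {"gray"}},
    {"headers": {"Name", "Email"}, "formulas": set(), "colors": {"white"}},
]
clone_pairs = detect_clones(tables, model)
```

In this toy setup, the pair with the added column still scores high on every feature, which is how a learned classifier can tolerate structure changes that exact structural matching would miss.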
State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences
Wed 22 Jul (displayed time zone: Tijuana, Baja California)
14:50 - 16:10
|Discovering Discrepancies in Numerical Libraries|
Jackson Vanover (University of California, Davis), Xuan Deng (University of California, Davis), Cindy Rubio-González (University of California, Davis)
|Testing High Performance Numerical Simulation Programs: Experience, Lessons Learned, and Open Issues|
Technical Papers
|Functional Code Clone Detection with Syntax and Semantics Fusion Learning|
Chunrong Fang (Nanjing University), Zixi Liu (Nanjing University), Yangyang Shi, Jeff Huang (Texas A&M University), Qingkai Shi (The Hong Kong University of Science and Technology)
|Learning to Detect Table Clones in Spreadsheets|
Yakun Zhang (Institute of Software, Chinese Academy of Sciences), Wensheng Dou (Institute of Software, Chinese Academy of Sciences), Jiaxin Zhu (Institute of Software at Chinese Academy of Sciences, China), Liang Xu, Zhiyong Zhou (Institute of Software, Chinese Academy of Sciences), Jun Wei (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences), Dan Ye, Bo Yang