To speed up spreadsheet development, end users can create a spreadsheet table by copying and modifying an existing one. The two tables share similar computational semantics and form a table clone. End users may later modify the tables in a table clone, e.g., by adding new rows or deleting columns, thus introducing structure changes into the table clone. Our empirical study on real-world spreadsheets shows that about 58.5% of table clones involve structure changes. However, existing table clone detection approaches for spreadsheets can only detect table clones whose tables have the same structure. Therefore, many table clones with structure changes go undetected.
We observe that, although the tables in a table clone may be modified, they usually share similar structures and formats, e.g., headers, formulas and background colors. Based on this observation, we propose LTC (Learning to detect Table Clones) to automatically detect table clones with or without structure changes. LTC uses the structure and format information from labeled table clones and non-table clones to train a binary classifier. LTC first identifies tables in spreadsheets, and then uses the trained binary classifier to judge whether each pair of tables forms a table clone. Our experiments on real-world spreadsheets from the EUSES and Enron corpora show that LTC achieves a precision of 97.8% and a recall of 92.1% in table clone detection, significantly outperforming the state-of-the-art technique (a precision of 37.5% and a recall of 11.1%).
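The two-step pipeline described in the abstract (identify tables, then classify every pair) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from LTC itself: the `Table` summary, the three Jaccard-based features, and `toy_classifier`, a fixed threshold that merely stands in for the trained binary classifier.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Table:
    """Hypothetical summary of one table already identified in a spreadsheet."""
    name: str
    headers: frozenset   # header labels
    formulas: frozenset  # formulas in a position-independent (R1C1-style) form
    colors: frozenset    # background colors used in the table


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pair_features(t1, t2):
    """Assumed feature vector: similarity of headers, formulas and colors."""
    return [
        jaccard(t1.headers, t2.headers),
        jaccard(t1.formulas, t2.formulas),
        jaccard(t1.colors, t2.colors),
    ]


def toy_classifier(features, threshold=0.5):
    """Stand-in for a trained binary classifier: mean similarity >= threshold."""
    return sum(features) / len(features) >= threshold


def detect_clones(tables, classify=toy_classifier):
    """Judge every pair of tables and return the names of pairs flagged as clones."""
    return [
        (a.name, b.name)
        for a, b in combinations(tables, 2)
        if classify(pair_features(a, b))
    ]


# Q2 is a copy of Q1 with an extra "Discount" column (a structure change).
q1 = Table("Q1", frozenset({"Item", "Price", "Qty", "Total"}),
           frozenset({"=SUM(R[-3]C:R[-1]C)"}), frozenset({"yellow", "white"}))
q2 = Table("Q2", frozenset({"Item", "Price", "Qty", "Discount", "Total"}),
           frozenset({"=SUM(R[-3]C:R[-1]C)"}), frozenset({"yellow", "white"}))
staff = Table("Staff", frozenset({"Name", "Dept"}),
              frozenset(), frozenset({"white"}))

print(detect_clones([q1, q2, staff]))  # [('Q1', 'Q2')]
```

Because the features compare sets of headers, formulas and colors rather than cell-by-cell layouts, the extra column in `Q2` lowers the header similarity only slightly, so the pair is still flagged. In a learned setting, `classify` would be replaced by a model trained on labeled clone and non-clone pairs.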
State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences
Wed 22 Jul (times are displayed in time zone: Tijuana, Baja California)
14:50 - 16:10
|Discovering Discrepancies in Numerical Libraries|
Jackson Vanover (University of California, Davis), Xuan Deng (University of California, Davis), Cindy Rubio-González (University of California, Davis)
|Testing High Performance Numerical Simulation Programs: Experience, Lessons Learned, and Open Issues|
Technical Papers
|Functional Code Clone Detection with Syntax and Semantics Fusion Learning|
Chunrong Fang (Nanjing University), Zixi Liu (Nanjing University), Yangyang Shi, Jeff Huang (Texas A&M University), Qingkai Shi (The Hong Kong University of Science and Technology)
|Learning to Detect Table Clones in Spreadsheets|
Yakun Zhang (Institute of Software, Chinese Academy of Sciences), Wensheng Dou (Institute of Software, Chinese Academy of Sciences), Jiaxin Zhu (Institute of Software, Chinese Academy of Sciences), Liang Xu, Zhiyong Zhou (Institute of Software, Chinese Academy of Sciences), Jun Wei (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences), Dan Ye, Bo Yang