Semantic Table Structure Identification in Spreadsheets
Sat 17 Jul 2021 10:10 - 10:30 at ISSTA 1 - Session 27 (time band 3) Bugs and Analysis 2 Chair(s): Mike Papadakis
Spreadsheets are widely used in various business tasks and contain
large amounts of valuable data. However, spreadsheet tables are usually
organized in a semi-structured way and contain complicated semantic
structures, e.g., header types and relations among headers. Because
these semantic table structures are rarely documented, existing data
analysis and error detection tools can hardly understand spreadsheet
tables. Identifying semantic table structures in spreadsheet tables is
therefore of great importance and can greatly promote various analysis
tasks on spreadsheets.
In this paper, we propose Tasi (Table structure identification) to
automatically identify semantic table structures in spreadsheets.
Based on the contents, styles, and spatial locations in table
headers, Tasi adopts a multi-classifier to predict potential header
types and relations, and then integrates all header types and
relations into consistent semantic table structures. We further
propose TasiError to detect spreadsheet errors based on the
semantic table structures identified by Tasi. Our experiments on
real-world spreadsheets show that Tasi can precisely identify
semantic table structures in spreadsheets, and TasiError can detect
real-world spreadsheet errors with higher precision (75.2%) and
recall (82.9%) than existing approaches.
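The two-stage idea described above (predict candidate header types and relations, then integrate them into one consistent structure) can be illustrated with a toy sketch. This is not Tasi's implementation: the `HeaderCell` fields, the three header type names, the hand-written scoring rules, and the indentation-based parent rule are all assumptions made up for illustration, standing in for the learned multi-classifier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen so cells are hashable and usable as dict keys
class HeaderCell:
    row: int
    col: int
    text: str
    bold: bool = False   # style feature (hypothetical)
    indent: int = 0      # spatial feature (hypothetical)

def score_types(cell):
    """Toy stand-in for a classifier: one score per hypothetical header type."""
    text = cell.text.strip().lower()
    scores = {"data_label": 0.5, "total": 0.0, "group": 0.0}
    if text.startswith("total") or text.startswith("sum"):
        scores["total"] += 1.0   # content cue
    if cell.bold and cell.indent == 0:
        scores["group"] += 0.6   # style and position cue
    return scores

def predict_parents(cells):
    """Toy relation prediction for a left header column: the parent of a cell
    is the nearest preceding cell with a strictly smaller indent."""
    parents, stack = {}, []
    for cell in sorted(cells, key=lambda c: c.row):
        while stack and stack[-1].indent >= cell.indent:
            stack.pop()
        parents[cell] = stack[-1] if stack else None
        stack.append(cell)
    return parents

def integrate(cells):
    """Merge type and relation predictions, enforcing one consistency rule:
    a cell that has children cannot remain a plain data label."""
    parents = predict_parents(cells)
    has_children = {p for p in parents.values() if p is not None}
    structure = {}
    for cell in cells:
        scores = score_types(cell)
        best = max(scores, key=scores.get)
        if cell in has_children and best == "data_label":
            best = "group"
        parent = parents[cell]
        structure[cell.text] = (best, parent.text if parent else None)
    return structure

cells = [
    HeaderCell(1, 0, "Costs"),                    # not bold, but has children
    HeaderCell(2, 0, "Rent", indent=1),
    HeaderCell(3, 0, "Salaries", indent=1),
    HeaderCell(4, 0, "Total costs", bold=True),
]
structure = integrate(cells)
for name, (header_type, parent) in structure.items():
    print(name, header_type, parent)
```

In this sketch the "Costs" cell would be typed as a data label in isolation, and only the integration step promotes it to a group header because other cells point to it as their parent. That is the kind of cross-checking the integration stage is meant to convey, under the stated assumptions.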
Thu 15 Jul | Displayed time zone: Brussels, Copenhagen, Madrid, Paris
01:40 - 02:20 | Session 7 (time band 2) Data Processing Application Analysis Technical Papers at ISSTA 1 Chair(s): Darko Marinov University of Illinois at Urbana-Champaign | ||
01:40 20mTalk | SAND: A Static Analysis Approach for Detecting SQL Antipatterns ACM SIGSOFT Distinguished Paper Technical Papers Yingjun Lyu Amazon, Sasha Volokh University of Southern California, William G.J. Halfond University of Southern California, Omer Tripp Amazon DOI | ||
02:00 20mTalk | Semantic Table Structure Identification in Spreadsheets Technical Papers Yakun Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiao Lv Microsoft Research, Haoyu Dong Microsoft Research, Wensheng Dou Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Shi Han Microsoft Research, Dongmei Zhang Microsoft Research, Jun Wei Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Dan Ye Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences DOI Media Attached |
Sat 17 Jul | Displayed time zone: Brussels, Copenhagen, Madrid, Paris
09:30 - 11:10 | Session 27 (time band 3) Bugs and Analysis 2 Technical Papers at ISSTA 1 Chair(s): Mike Papadakis University of Luxembourg, Luxembourg | ||
09:30 20mTalk | Faster, Deeper, Easier: Crowdsourcing Diagnosis of Microservice Kernel Failure from User Space Technical Papers Yicheng Pan Peking University, Meng Ma Peking University, Xinrui Jiang Peking University, Ping Wang Peking University DOI Media Attached File Attached | ||
09:50 20mTalk | Finding Data Compatibility Bugs with JSON Subschema Checking Distinguished Artifact Technical Papers Andrew Habib SnT, University of Luxembourg, Avraham Shinnar IBM Research, Martin Hirzel IBM Research, Michael Pradel University of Stuttgart Link to publication DOI Pre-print File Attached | ||
10:10 20mTalk | Semantic Table Structure Identification in Spreadsheets Technical Papers Yakun Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiao Lv Microsoft Research, Haoyu Dong Microsoft Research, Wensheng Dou Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Shi Han Microsoft Research, Dongmei Zhang Microsoft Research, Jun Wei Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Dan Ye Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences DOI Media Attached | ||
10:30 20mTalk | Deep Just-in-Time Defect Prediction: How Far Are We? Technical Papers Zhengran Zeng Southern University of Science and Technology, Yuqun Zhang Southern University of Science and Technology, Haotian Zhang Kwai, Lingming Zhang University of Illinois at Urbana-Champaign DOI | ||
10:50 20mTalk | Continuous Test Suite Failure Prediction Technical Papers DOI Media Attached |