VizSmith: Automated Visualization Synthesis by Mining Data-Science Notebooks
Visualizations are widely used to communicate findings and make data-driven decisions. Unfortunately, creating bespoke and reproducible visualizations requires the use of procedural tools such as matplotlib. These tools present a steep learning curve, as their documentation often lacks sufficient usage examples to help beginners get started or accomplish a specific task. Forums such as Stack Overflow have long helped developers search for code online and adapt it for their use. However, such forums still place the burden on the developer to sift through results and understand the code before adapting it.
We build a tool, VizSmith, that improves code reuse for visualizations by mining visualization code from Kaggle notebooks and creating a database of 7176 reusable Python functions. Given a dataset, the columns to visualize, and a text query from the user, VizSmith searches this database for appropriate functions, runs them, and displays the generated visualizations to the user. At the core of VizSmith is a novel metamorphic-testing-based approach to automatically assess the reusability of functions, which improves end-to-end synthesis performance by 10% and cuts the number of execution failures by 50%.
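To make the idea of a metamorphic reusability check concrete, here is a minimal sketch. It is not VizSmith's implementation: the metamorphic relation (consistently renaming columns must not change the output), the dict-based stand-in for a plot, and all function names are assumptions chosen for illustration, and no real plotting library is called.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a table maps column names to values, and a "plot"
# is a plain dict describing what would be drawn (no real figure is made).
Table = Dict[str, List[float]]
Plot = Dict[str, object]
VizFunction = Callable[[Table, str, str], Plot]


def scatter_reusable(table: Table, x: str, y: str) -> Plot:
    """Depends only on its arguments, so it transfers to other tables."""
    return {"kind": "scatter", "points": list(zip(table[x], table[y]))}


def scatter_hardcoded(table: Table, x: str, y: str) -> Plot:
    """Mined code that still refers to a column of its original notebook."""
    return {"kind": "scatter", "points": list(zip(table[x], table["price"]))}


def is_reusable(func: VizFunction, table: Table, x: str, y: str) -> bool:
    """Assumed metamorphic relation: renaming every column consistently,
    and passing the renamed arguments, must yield the same plot."""
    baseline = func(table, x, y)
    new_name = {name: f"col_{i}" for i, name in enumerate(table)}
    renamed = {new_name[name]: values for name, values in table.items()}
    try:
        follow_up = func(renamed, new_name[x], new_name[y])
    except Exception:
        # An execution failure on the transformed input counts as not reusable.
        return False
    return follow_up == baseline


housing = {"area": [50.0, 80.0, 120.0], "price": [100.0, 150.0, 240.0]}
print(is_reusable(scatter_reusable, housing, "area", "price"))
print(is_reusable(scatter_hardcoded, housing, "area", "price"))
```

In this sketch the first function passes the check and the second fails it, because the hard-coded column name no longer exists after renaming. A filter of this kind would let a database keep only functions that behave the same way on data they were not written for.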
Tue 16 Nov (displayed time zone: Hobart)
18:00 - 19:00 | Mining and Issues (NIER track / Research Papers) at Koala | Chair(s): Hongyu Zhang, University of Newcastle
18:00 | 20m | Talk | VizSmith: Automated Visualization Synthesis by Mining Data-Science Notebooks | Research Papers | Rohan Bavishi (University of California at Berkeley), Shadaj Laddad (UC Berkeley), Hiroaki Yoshida (Fujitsu Laboratories of America, Inc.), Mukul Prasad (Fujitsu Research of America), Koushik Sen (University of California at Berkeley)
18:20 | 20m | Talk | ISPY: Automatic Issue-Solution Pair Extraction from Community Live Chats | Research Papers | Lin Shi (Institute of Software at Chinese Academy of Sciences), Ziyou Jiang (Institute of Software at Chinese Academy of Sciences), Ye Yang (Stevens Institute of Technology), Xiao Chen (Institute of Software at Chinese Academy of Sciences), YuMin Zhang (Institute of Software Chinese Academy of Sciences), Fangwen Mu (Institute of Software Chinese Academy of Sciences), Hanzhi Jiang (Institute of Software at Chinese Academy of Sciences), Qing Wang (Institute of Software at Chinese Academy of Sciences)
18:40 | 10m | Talk | Understanding Code Fragments with Issue Reports | NIER track
18:50 | 10m | Talk | An Empirical Study on Obsolete Issue Reports | NIER track