DSInfoSearch: Supporting experimentation process of data scientists
Thu 18 Nov 2021 10:04 - 10:06 at Kangaroo - LBR + DS Poster (2) (Thursday 21:00 - 00:00) Chair(s): Xiaoyin Wang
Experimentation plays an important role in the work of data scientists: it lets them explore unfamiliar problem domains, answer questions from data, and develop diverse machine learning applications. Good experimentation requires creativity that builds on prior results and is informed by the literature. However, finding relevant information from the right sources to guide experimentation causes inefficiencies in the experimentation process of data scientists. The objective of this research is to help data scientists by presenting context-aware ranked data science experiments that take into account the problem domain, development task, and learning task. Data science experiments for this study were extracted from publicly available interactive notebooks and manually annotated based on a taxonomy of data science techniques and a meta-model of a data science experiment. A ranking algorithm was then developed to rank data science experiments for a given problem domain and development task. As a result, a tool was developed to demonstrate context-aware ranked data science experiments for problem domains such as natural language processing, computer vision, and time series, and for development stages such as feature engineering and model selection. This study shows that tools and techniques can be designed to be more data science context-aware, in fact much more so than is the case for software engineering tools. This study supports these efforts by providing knowledge that can improve the experimentation process of data scientists.
File attached: ase21-docsym-46.pdf (612 KiB)
Mon 15 Nov (displayed time zone: Hobart)
09:05 - 10:20

| Time | Length | Type | Title | Track | Presenter(s) | Links |
| --- | --- | --- | --- | --- | --- | --- |
| 09:05 | 15m | Talk | A Prediction Model for Software Requirements Change Impact | Doctoral Symposium | Kareshna Zamani (PhD candidate) | File attached |
| 09:20 | 15m | Talk | DSInfoSearch: Supporting experimentation process of data scientists | Doctoral Symposium | Shangeetha Sivasothy (Applied Artificial Intelligence Institute, Deakin University) | File attached |
| 09:35 | 15m | Talk | Towards the generation of machine learning defect reports | Doctoral Symposium | Tuan Dung Lai (Deakin University) | Pre-print, File attached |
| 09:50 | 15m | Talk | Leveraging Code Clones and Natural Language Processing for Log Statement Prediction | Doctoral Symposium | Sina Gholamian (University of Waterloo) | Pre-print |
| 10:05 | 15m | Other | Discussion with presenters | Doctoral Symposium | | |
Thu 18 Nov (displayed time zone: Hobart)
10:00 - 11:00: LBR + DS Poster (2) (Thursday 21:00 - 00:00), Late Breaking Results / Doctoral Symposium, at Kangaroo. Chair(s): Xiaoyin Wang (University of Texas at San Antonio)

| Time | Length | Type | Title | Track | Presenter(s) | Links |
| --- | --- | --- | --- | --- | --- | --- |
| 10:00 | 2m | Talk | API Compatibility Issue Detection, Testing and Analysis for Android Apps | Doctoral Symposium | Tarek Mahmud (Texas State University) | File attached |
| 10:02 | 2m | Talk | Towards the generation of machine learning defect reports | Doctoral Symposium | Tuan Dung Lai (Deakin University) | Pre-print, File attached |
| 10:04 | 2m | Talk | DSInfoSearch: Supporting experimentation process of data scientists | Doctoral Symposium | Shangeetha Sivasothy (Applied Artificial Intelligence Institute, Deakin University) | File attached |
| 10:06 | 2m | Talk | A First Look at the Effect of Deep Learning in Coverage-guided Fuzzing | Late Breaking Results | Siqi Li (Tianjin University); Yun Lin (National University of Singapore); Xiaofei Xie (Kyushu University); Yuekang Li (Nanyang Technological University); Xiaohong Li (Tianjin University); Weimin Ge (Tianjin University); Yang Liu (Nanyang Technological University); Jin Song Dong (National University of Singapore) | |
| 10:08 | 2m | Talk | Counterexample Guided Inductive Repair of Reactive Contracts | Late Breaking Results | Soha Hussein (University of Minnesota, USA / Ain Shams University, Egypt); Vaibhav Sharma (University of Minnesota, USA); Stephen McCamant (University of Minnesota, USA); Sanjai Rayadurgam (University of Minnesota); Mats Heimdahl (University of Minnesota) | |
| 10:10 | 2m | Talk | AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization | Late Breaking Results | Ze Tang (Software Institute, Nanjing University); Chuanyi Li (Software Institute, Nanjing University); Jidong Ge; Xiaoyu Shen (Alexa AI, Amazon); Zheling Zhu (Software Institute, Nanjing University); Bin Luo (Software Institute, Nanjing University) | |
| 10:12 | 2m | Talk | An Automated Pipeline for Privacy Leak Analysis of Android Applications | Doctoral Symposium | Yifan Zhou (The University of Adelaide) | File attached |
| 10:14 | 2m | Talk | Detecting Adversarial Samples with Graph-Guided Testing | Late Breaking Results | Zuohui Chen (Zhejiang University of Technology); Renxuan Wang (Zhejiang University of Technology); Jingyang Xiang (Zhejiang University of Technology); Yue Yu (College of Computer, National University of Defense Technology, Changsha 410073, China); Xin Xia (Huawei Software Engineering Application Technology Lab); Shouling Ji (Zhejiang University); Qi Xuan (Zhejiang University of Technology); Xiaoniu Yang (Zhejiang University of Technology) | |
| 10:16 | 2m | Talk | Using Static Analysis to Address Microservice Architecture Reconstruction | Late Breaking Results | Vincent Bushong (Baylor University); Dipta Das (Baylor University); Abdullah Al Maruf (Baylor University); Tomas Cerny (Baylor University) | |
| 10:18 | 2m | Talk | Applying Semi-Automated Hyperparameter Tuning for Clustering Algorithms | Late Breaking Results | Elizabeth Forest (James Cook University); Anne Swinbourne (James Cook University); Trina Myers (Queensland University of Technology); Mitchell Scovell (James Cook University) | Link to publication |
| 10:20 | 2m | Talk | Business Process Extraction Using Static Analysis | Late Breaking Results | | |
| 10:22 | 2m | Talk | Binary Code Similarity Detection | Doctoral Symposium | Zian Liu (Swinburne University of Technology; Data61, CSIRO); Chao Chen (James Cook University); Jun Zhang (Digital Research & Innovation Capability Platform, Swinburne University of Technology); Dongxi Liu (Data61, CSIRO); Muhammad Ejaz Ahmed (Data61, CSIRO); Yang Xiang (Digital Research & Innovation Capability Platform, Swinburne University of Technology) | File attached |
| 10:24 | 2m | Talk | Improving Mutation-Based Fault Localization with Plausible-code Generating Mutation Operators | Late Breaking Results | | |
| 10:26 | 2m | Talk | Using Version Control and Issue Tickets to detect Code Debt and Economical Cost | Late Breaking Results | Abdullah Al Maruf (Baylor University); Noah Lambaria (Baylor University); Amr Elsayed (Baylor University); Tomas Cerny (Baylor University) | File attached |
| 10:28 | 2m | Talk | Human-in-the-Loop XAI-enabled Vulnerability Detection, Investigation, and Mitigation | Late Breaking Results | Tien N. Nguyen (University of Texas at Dallas); Kim-Kwang Raymond Choo (University of Texas at San Antonio) | |
| 10:30 | 2m | Talk | A Prediction Model for Software Requirements Change Impact | Doctoral Symposium | Kareshna Zamani (PhD candidate) | File attached |
| 10:32 | 2m | Talk | Leveraging Code Clones and Natural Language Processing for Log Statement Prediction | Doctoral Symposium | Sina Gholamian (University of Waterloo) | Pre-print |