Visualization plays an important role in machine learning (ML) pipelines: exploratory visualizations of the data can inspire data scientists, and explanatory visualizations of the pipeline improve understandability and trust. In this paper, we present a novel approach that automatically generates visualizations for ML pipelines by learning from highly voted Kaggle pipelines. The solution extracts code and dataset features from these high-quality, human-written pipelines and their corresponding training datasets, learns rules that map code and dataset features to visualizations using association rule mining (ARM), and finally applies the learned rules to predict visualizations for unseen ML pipelines. The evaluation results show that the proposed solution is feasible and effective for generating visualizations for ML pipelines.
Lei Liu (Fujitsu Laboratories of America, Inc.), Wei-Peng Chen (Fujitsu Research of America, Inc.), Mehdi Bahrami (Fujitsu Laboratories of America, Inc.), Mukul Prasad (Amazon Web Services)
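To make the rule-learning step named in the abstract more concrete, the following is a minimal sketch of association rule mining over feature sets, written in plain Python. It is not the implementation evaluated in this paper: the feature names, visualization names, thresholds, and the helpers `mine_rules` and `predict` are all hypothetical and chosen only for illustration. Each rule has the form "features imply visualization" and is kept when its support and confidence exceed the given thresholds.

```python
from collections import Counter
from itertools import combinations

# Hypothetical training records: (pipeline and dataset features, visualizations used).
# All feature and visualization names are illustrative, not taken from the paper.
PIPELINES = [
    ({"task:classification", "has_categorical_target"}, {"countplot"}),
    ({"task:classification", "has_categorical_target", "many_numeric_cols"},
     {"countplot", "heatmap"}),
    ({"task:regression", "many_numeric_cols"}, {"heatmap", "scatter"}),
    ({"task:regression", "many_numeric_cols"}, {"heatmap"}),
]


def mine_rules(records, min_support=0.5, min_confidence=0.8, max_antecedent=2):
    """Return rules (antecedent, visualization, support, confidence)."""
    n = len(records)
    antecedent_counts = Counter()  # how often each feature subset occurs
    joint_counts = Counter()       # how often it co-occurs with a visualization
    for features, visualizations in records:
        for size in range(1, max_antecedent + 1):
            for antecedent in combinations(sorted(features), size):
                antecedent_counts[antecedent] += 1
                for vis in visualizations:
                    joint_counts[(antecedent, vis)] += 1

    rules = []
    for (antecedent, vis), joint in joint_counts.items():
        support = joint / n                               # P(antecedent and vis)
        confidence = joint / antecedent_counts[antecedent]  # P(vis | antecedent)
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(antecedent), vis, support, confidence))
    return rules


def predict(rules, features):
    """Recommend visualizations whose rule antecedent is contained in `features`."""
    scores = {}
    for antecedent, vis, _support, confidence in rules:
        if antecedent <= features:
            scores[vis] = max(scores.get(vis, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda v: (-scores[v], v))


if __name__ == "__main__":
    rules = mine_rules(PIPELINES)
    print(predict(rules, {"task:regression", "many_numeric_cols"}))
```

This sketch enumerates antecedents by brute force, which is only practical for small feature sets; a real system would more likely use an Apriori- or FP-growth-style miner.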