Outcome-Preserving Input Reduction for Scientific Data Analysis Workflows
Data analysis is the foundation of many scientific disciplines, and today it takes the form of complex and diverse scientific workflows that often involve exploratory analyses. Such analyses are a special case compared with traditional data engineering workflows: experimentation is a central theme, and results may be hard to interpret or to judge as correct. Input data, or assumptions made about it, may be incorrect and may need to be refined, a problem analogous to fault localization in software engineering. Typical fault localization techniques assume that a fault has already been identified, usually by an oracle in the form of a test. The workflows we target, however, are usually exploratory, which makes it hard, if not impossible, to define tests specifying correct behaviour, even though spotting irregularities is highly desired. To this end, we advocate reducing input data such that a specified outcome is preserved, as an aid to debugging. In our proposal, reductions serve as debug hypotheses for data. We outline our bold vision for building engineering support for outcome-preserving input reduction within data analysis workflows, and report on preliminary results.
Thu 13 Oct. Displayed time zone: Eastern Time (US & Canada)
13:30 - 15:30 | Technical Session 27 - Dynamic and Concolic Analysis | Research Papers / NIER Track / Journal-first Papers at Banquet A | Chair(s): ThanhVu Nguyen (George Mason University)
13:30 | 20m | Research paper | LISSA: Lazy Initialization with Specialized Solver Aid | Research Papers | Juan Manuel Copia (IMDEA Software Institute; Universidad Politécnica de Madrid), Pablo Ponzio (Dept. of Computer Science FCEFQyN, University of Rio Cuarto), Nazareno Aguirre (University of Rio Cuarto and CONICET, Argentina), Alessandra Gorla (IMDEA Software Institute), Marcelo F. Frias (Dept. of Software Engineering, Instituto Tecnológico de Buenos Aires)
13:50 | 10m | Vision and Emerging Results | Outcome-Preserving Input Reduction for Scientific Data Analysis Workflows | NIER Track | Anh Duc Vu (Humboldt-Universität zu Berlin), Timo Kehrer (University of Bern), Christos Tsigkanos (University of Bern, Switzerland)
14:00 | 20m | Research paper | SymFusion: Hybrid Instrumentation for Concolic Execution | Research Papers | Emilio Coppa (Sapienza University of Rome), Heng Yin (UC Riverside), Camil Demetrescu (Sapienza University Rome)
14:20 | 20m | Research paper | Scalable Sampling of Highly-Configurable Systems: Generating Random Instances of the Linux Kernel | Research Papers | David Fernandez-Amoros (UNED), Ruben Heradio (UNED, Universidad Nacional de Educacion a Distancia), Christoph Mayr-Dorn (Johannes Kepler University Linz), Alexander Egyed (Johannes Kepler University Linz)
14:40 | 20m | Paper | A Practical Approach for Dynamic Taint Tracking with Control-Flow Relationships (Virtual) | Journal-first Papers
15:00 | 20m | Research paper | Prioritized Constraint-Aided Dynamic Partial-Order Reduction (Virtual) | Research Papers | Jie Su (Xidian University), Cong Tian (Xidian University), Zuchao Yang (Xidian University), Jiyu Yang (Xidian University), Bin Yu (Xidian University), Zhenhua Duan (Xidian University)