An Exploratory Study on Energy Consumption of Dataframe Processing Libraries
The energy consumption of machine learning applications and its environmental impact have recently gained attention as a research area, with work focusing on the model creation and training/inference phases. The data-oriented stages of the machine learning pipeline, which involve preprocessing, cleaning, and exploratory analysis, are also critical components, yet their energy consumption has received limited attention. To fill this gap, as a first step, we investigate the energy consumption of three popular dataframe processing libraries: Pandas, Vaex, and Dask. We run experiments across 21 dataframe processing operations in four categories, using three distinct datasets. Our results indicate that no single library is the most energy efficient for all tasks, and that the choice of library can significantly affect energy consumption depending on the types and frequencies of operations performed. These findings suggest potential for optimizing the energy consumption of the data-oriented stages of the machine learning pipeline and warrant further research in this area.
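The comparison described in the abstract can be illustrated with a small harness. This is only a minimal sketch under stated assumptions, not the authors' measurement setup: the names `measure`, `compare`, and `cheapest` are hypothetical, and the default meter is wall-clock time from `time.perf_counter`, used as a stand-in because the abstract does not say how energy was read. A real study would pass in a function that reads a hardware energy counter.

```python
import time
from statistics import mean
from typing import Callable, Dict

# Hypothetical types for this sketch: a task is a zero-argument callable,
# and a meter returns a monotonically increasing reading (joules, seconds, ...).
Task = Callable[[], object]
Meter = Callable[[], float]


def measure(task: Task, read_meter: Meter = time.perf_counter, repeats: int = 5) -> float:
    """Run `task` `repeats` times and return the mean meter delta per run."""
    deltas = []
    for _ in range(repeats):
        start = read_meter()
        task()
        deltas.append(read_meter() - start)
    return mean(deltas)


def compare(
    tasks: Dict[str, Dict[str, Task]],
    read_meter: Meter = time.perf_counter,
    repeats: int = 5,
) -> Dict[str, Dict[str, float]]:
    """Measure every operation for every library.

    `tasks` maps operation name -> {library name -> callable}.
    """
    return {
        operation: {
            library: measure(task, read_meter, repeats)
            for library, task in implementations.items()
        }
        for operation, implementations in tasks.items()
    }


def cheapest(results: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """For each operation, name the library with the lowest mean reading."""
    return {op: min(costs, key=costs.get) for op, costs in results.items()}


if __name__ == "__main__":
    # Stand-in workloads (assumption): pure-Python sorts instead of real
    # Pandas, Vaex, or Dask calls, so the sketch runs without extra packages.
    data = list(range(50_000, 0, -1))
    demo_tasks = {
        "sort": {
            "library_a": lambda: sorted(data),
            "library_b": lambda: sorted(data, key=lambda x: x),
        },
    }
    print(cheapest(compare(demo_tasks)))
```

Keeping the meter as a parameter means the same loop could wrap any library call and any energy source. Because the winner is reported per operation, the result can differ from one operation to the next, which is the kind of outcome the abstract reports.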
Mon 15 May (displayed time zone: Hobart)
16:35 - 17:20 | Ethics & Energy | Technical Papers / Registered Reports at Meeting Room 109 | Chair(s): Arumoy Shome, Delft University of Technology
16:35 | 12m Talk | Energy Consumption Estimation of API-usage in Mobile Apps via Static Analysis | Technical Papers | Abdul Ali Bangash, University of Alberta, Canada; Qasim Jamal, FAST National University; Kalvin Eng, University of Alberta; Karim Ali, University of Alberta; Abram Hindle, University of Alberta
16:47 | 12m Talk | An Exploratory Study on Energy Consumption of Dataframe Processing Libraries | Technical Papers
16:59 | 6m Talk | Understanding issues related to personal data and data protection in open source projects on GitHub | Registered Reports | Anne Hennig, Karlsruhe Institute of Technology; Lukas Schulte, University of Passau; Steffen Herbold, University of Passau; Oksana Kulyk, IT University of Copenhagen, Denmark; Peter Mayer, University of Southern Denmark
17:05 | 12m Talk | Whistleblowing and Tech on Twitter | Technical Papers | Laura Duits, Vrije Universiteit Amsterdam; Isha Kashyap, Vrije Universiteit Amsterdam; Joey Bekkink, Vrije Universiteit Amsterdam; Kousar Aslam, Vrije Universiteit Amsterdam; Emitzá Guzmán, Vrije Universiteit Amsterdam