EASE 2024
Tue 18 - Fri 21 June 2024 Salerno, Italy
Wed 19 Jun 2024 15:05 - 15:20 at Room Capri - Program Comprehension Chair(s): Nicole Novielli

Context. Python’s growing popularity in data analysis and the contemporary emphasis on energy-efficient software tools necessitate an investigation into the energy implications of data operations, particularly in resource-intensive domains like data science. This study provides fundamental insights for library selection, focusing on Pandas, a widely used Python data manipulation library, and Polars, a Rust-based library known for its performance.

Goal. We aim to compare and analyze the energy usage of Polars and Pandas. The study seeks to give developers and data scientists insight into the scenarios where one library outperforms the other in energy usage, and explores possible correlations between energy usage and performance metrics.

Method. We performed four separate experiment blocks comprising 8 Data Analysis Tasks (DATs) from an official TPCH Benchmark done by Polars and 6 Synthetic DATs. Both DAT groups are run with small and large dataframes, for both libraries.
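The design described in the Method paragraph can be sketched as a small enumeration. This is only an illustrative reading of the abstract: it assumes the four blocks are the combinations of DAT group and dataframe size, and that every task in a block is run once per library. The names `DAT_GROUPS`, `SIZES`, `LIBRARIES`, `blocks`, and `runs` are hypothetical and do not come from the study's replication package.

```python
from itertools import product

# Task counts per DAT group, as stated in the abstract.
DAT_GROUPS = {"TPCH": 8, "Synthetic": 6}
SIZES = ("small", "large")
LIBRARIES = ("pandas", "polars")

# Assumption: one experiment block per (DAT group, dataframe size) pair.
blocks = list(product(DAT_GROUPS, SIZES))

# Assumption: each task in a block is executed with both libraries.
runs = [
    (group, size, library, task_id)
    for group, size in blocks
    for library in LIBRARIES
    for task_id in range(1, DAT_GROUPS[group] + 1)
]

print(len(blocks), "blocks,", len(runs), "task/library configurations")
```

Under these assumptions the enumeration covers every task, size, and library combination exactly once; repetitions of each run, which an energy study would normally include, are not modeled here.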

Results. Polars is more energy-efficient than Pandas when dealing with large dataframes. For small dataframes, the TPCH Benchmarking DATs do not show a statistically significant difference, while for the Synthetic DATs, Polars performs significantly better. We identified strong positive correlations between energy usage and execution time, as well as memory usage, for Pandas, while Polars did not show significant memory usage correlations for the majority of runs. Additionally, there was a significant negative correlation between energy usage and CPU usage for Pandas.
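To make the correlation analysis concrete, here is a minimal sketch of a rank correlation between energy usage and a performance metric. The abstract does not say which coefficient was used, so Spearman's rho is an assumption here, and the measurement lists are made-up placeholders chosen only to show a positive and a negative relationship; they are not data from the study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder measurements, NOT results from the paper.
energy_joules = [12.0, 15.5, 9.8, 20.1, 17.3]
exec_time_s = [1.1, 1.4, 0.9, 2.0, 1.6]
cpu_percent = [80.0, 70.0, 95.0, 40.0, 60.0]

rho_time = spearman(energy_joules, exec_time_s)
rho_cpu = spearman(energy_joules, cpu_percent)
print(f"energy vs. time: {rho_time:+.2f}, energy vs. CPU: {rho_cpu:+.2f}")
```

A positive rho means runs that take longer also tend to use more energy; a negative rho, as reported for Pandas and CPU usage, means higher CPU utilization tends to coincide with lower energy usage.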

Conclusions. The study recommends using Polars for energy-efficient and fast data analysis, emphasizing the importance of CPU core utilization in library selection.

Wed 19 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:20
Program Comprehension (Research Papers / Short Papers, Vision and Emerging Results) at Room Capri
Chair(s): Nicole Novielli University of Bari
14:00
15m
Talk
Adversarial Attack and Robustness Improvement on Code Summarization
Research Papers
Xi Ding Sun Yat-Sen University, Yuan Huang Sun Yat-sen University, Xiangping Chen Sun Yat-Sen University, Jing Bian Sun Yat-Sen University
14:15
15m
Talk
Understanding Logical Expressions with Negations: Its Complicated
Research Papers
Aviad Baron Hebrew University, Ilai Granot Hebrew University, Ron Yosef Hebrew University, Dror Feitelson Hebrew University
14:30
15m
Talk
A Quantitative Investigation of Trends in Confusing Variable Pairs Through Commits: Do Confusing Variable Pairs Survive?
Research Papers
Hirohisa Aman Ehime University, Sousuke Amasaki Okayama Prefectural University, Tomoyuki Yokogawa Okayama Prefectural University, Minoru Kawahara Ehime University
14:45
10m
Talk
When simplicity meets effectiveness: Detecting code comments coherence with word embeddings and LSTM
Short Papers, Vision and Emerging Results
Michael Dubem Igbomezie University of L'Aquila, Phuong T. Nguyen University of L'Aquila, Davide Di Ruscio University of L'Aquila
Pre-print
14:55
10m
Talk
Exploring Influence of Feature Toggles on Code Complexity
Short Papers, Vision and Emerging Results
Md Tajmilur Rahman Gannon University, Imran Shalabi Gannon University, Tushar Sharma Dalhousie University
15:05
15m
Talk
An Empirical Study on the Energy Usage and Performance of Pandas and Polars Data Analysis Python Libraries
Research Papers
Felix Nahrstedt Vrije Universiteit Amsterdam, Mehdi Karmouche Vrije Universiteit Amsterdam, Karolina Bargieł Vrije Universiteit Amsterdam, Pouyeh Banijamali Vrije Universiteit Amsterdam, Apoorva Nalini Pradeep Kumar Vrije Universiteit Amsterdam, Ivano Malavolta Vrije Universiteit Amsterdam
Pre-print