Thu 12 May 2022 13:10 - 13:15 at ICSE room 1-odd hours - Green and Sustainable Technologies Chair(s): Grace Lewis
The use of Artificial Intelligence (AI), and more specifically of Deep Learning (DL), in modern software systems is widespread and continues to grow. At the same time, it is energy demanding, contributes to increased CO2 emissions, and carries a substantial financial cost. Even though many studies examine the capabilities of DL, only a few focus on its green aspects, such as energy consumption. However, optimizing the resource utilization of expensive DL models without compromising their accuracy is crucial for the broad application of DL at a time when climate change is impacting our ecosystem and livelihood.
This paper aims to raise awareness of the costs incurred when using different DL frameworks. To this end, we perform a thorough empirical study to measure and compare the energy consumption and run-time performance of six different DL models written in the two most popular DL frameworks, namely PyTorch and TensorFlow. We use a well-known benchmark of DL models, DeepLearningExamples, created by Nvidia, to compare both the training and inference costs of DL. Finally, we manually investigate the framework functions that took the most time to execute in our experiments.
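As a rough illustration of how an energy figure can be obtained alongside run-time, the sketch below integrates sampled power readings over time with the trapezoidal rule. It is not the measurement setup used in the study: the function name `energy_joules` and the sample values are hypothetical, and a real experiment would read power from a hardware counter or vendor tool.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power (watts) over time (seconds) with the trapezoidal rule.

    Returns the energy in joules. Hypothetical helper for illustration only.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    if len(timestamps_s) < 2:
        return 0.0  # a single sample spans no time, so no energy can be attributed

    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive power samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical samples: one power reading per second during a short run.
timestamps = [0.0, 1.0, 2.0, 3.0]
power = [100.0, 120.0, 120.0, 100.0]

run_time = timestamps[-1] - timestamps[0]
energy = energy_joules(timestamps, power)
print(f"run-time: {run_time:.1f} s, energy: {energy:.1f} J")
```

Reporting run-time and energy together matters because a faster run is not necessarily a cheaper one if it draws more power while executing.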
The results of our empirical study reveal that there is a statistically significant difference between the cost incurred by the two DL frameworks in 94% of the cases studied. While TensorFlow achieves significantly better energy and run-time performance than PyTorch in the training phase, with large effect sizes in 100% of the cases, PyTorch instead achieves significantly better energy and run-time performance than TensorFlow in the inference phase in 66% of the cases, always with large effect sizes. Such a large difference in performance costs does not, however, seem to affect the accuracy of the models produced, as both frameworks achieve comparable scores under the same configurations. Our manual analysis of the documentation and source code of the functions examined reveals that such a difference in performance costs is under-documented in these frameworks.
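To make the notion of a "large effect size" concrete, the sketch below computes the Vargha-Delaney A12 statistic, a common nonparametric effect size for comparing two sets of repeated measurements. The abstract does not name the statistic used, so this choice is an assumption; the function names, the magnitude thresholds (the commonly cited 0.56, 0.64, 0.71), and the energy readings are illustrative only.

```python
from typing import Sequence


def a12(x: Sequence[float], y: Sequence[float]) -> float:
    """Vargha-Delaney A12: probability that a random value from x is larger
    than a random value from y, with ties counted as one half."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for a in x for b in y if a > b)
    ties = sum(1 for a in x for b in y if a == b)
    return (greater + 0.5 * ties) / (len(x) * len(y))


def magnitude(a: float) -> str:
    """Map A12 to a qualitative label using commonly cited thresholds."""
    folded = max(a, 1.0 - a)  # treat 0.2 and 0.8 as equally strong effects
    if folded >= 0.71:
        return "large"
    if folded >= 0.64:
        return "medium"
    if folded >= 0.56:
        return "small"
    return "negligible"


# Hypothetical energy readings (joules) from repeated runs of the same model.
framework_a = [10.0, 11.0, 12.0, 13.0]
framework_b = [14.0, 15.0, 16.0, 17.0]

effect = a12(framework_a, framework_b)
print(f"A12 = {effect:.2f} ({magnitude(effect)})")
```

An A12 of 0.5 means neither sample tends to be larger; values near 0 or 1 mean one framework consumes less in almost every pairing of runs.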
We argue that developers using DL frameworks could be better supported in achieving Green AI by improving the documentation of the DL frameworks and the source code of their functions, as well as by optimizing existing DL algorithms. Moreover, automated techniques for non-functional improvement of software can be explored in future work to make DL software greener.