VEE 2022
Tue 1 Mar 2022
Tue 1 Mar 2022 11:15 - 11:35 at Online - Session-1: System Virtualization Chair(s): Antonio Barbalace

Recently, Deep Learning (DL) models have achieved great success in artificial intelligence Internet of Things applications thanks to their high accuracy. A common deployment solution is to run DL inference tasks on edge servers. In a DL inference, each operator takes tensors as input and runs in a tensor virtual machine, which isolates resource usage among operators. Nevertheless, existing edge-based DL inference approaches cannot efficiently use heterogeneous resources (e.g., CPU and low-end GPU) on edge servers and deliver sub-optimal DL inference performance, since they can only partition operators in a DL inference with equal or fixed ratios. Supporting partition optimizations over edge servers for a wide range of DL models, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs) and Transformers, remains a big challenge. In this paper, we present EOP, an Efficient Operator Partition approach that optimizes DL inference over edge servers to address this challenge. First, we carry out a large-scale performance evaluation of operators running on heterogeneous resources, and reveal that many operators do not follow similar performance variation when input tensors change. Further, we employ three categorized patterns to estimate the performance of operators, and then efficiently partition key operators and tune partition ratios. Finally, we implement EOP on TVM, and experiments over a typical edge server show that EOP improves the inference performance by up to $1.25-1.97\times$ for various DL models compared to state-of-the-art approaches.
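To make the notion of a partition ratio concrete, the following is a minimal Python sketch. It is not EOP's algorithm: it assumes each device has a constant, hypothetical throughput (rows per millisecond) and that an operator's work can be split along one tensor dimension. The function names and the numbers are illustrative only.

```python
def balanced_split(rows, cpu_rate, gpu_rate):
    """Split `rows` between CPU and GPU in proportion to their throughput.

    cpu_rate and gpu_rate are hypothetical throughputs in rows per millisecond.
    Returns (cpu_rows, gpu_rows).
    """
    if rows < 0 or cpu_rate <= 0 or gpu_rate <= 0:
        raise ValueError("rows must be non-negative and rates positive")
    # The CPU's share is proportional to its fraction of total throughput.
    cpu_rows = round(rows * cpu_rate / (cpu_rate + gpu_rate))
    return cpu_rows, rows - cpu_rows


def makespan(cpu_rows, gpu_rows, cpu_rate, gpu_rate):
    """Time in ms until the slower device finishes (devices run in parallel)."""
    return max(cpu_rows / cpu_rate, gpu_rows / gpu_rate)


if __name__ == "__main__":
    rows, cpu_rate, gpu_rate = 1000, 1.0, 3.0  # assumed values
    equal = makespan(rows // 2, rows - rows // 2, cpu_rate, gpu_rate)
    cpu_rows, gpu_rows = balanced_split(rows, cpu_rate, gpu_rate)
    tuned = makespan(cpu_rows, gpu_rows, cpu_rate, gpu_rate)
    print(f"equal split: {equal:.0f} ms, tuned split: {tuned:.0f} ms")
```

Under these assumed rates, an equal split is limited by the slower CPU, while the proportional split lets both devices finish together. A constant-throughput model like this one is a simplification; the abstract reports that operator performance varies differently as input tensors change.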

Tue 1 Mar

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

10:15 - 11:35
Session-1: System Virtualization, Research Papers at Online
Chair(s): Antonio Barbalace The University of Edinburgh
10:15
20m
Talk
Portkey: Hypervisor-assisted container migration in nested cloud environments
Research Papers
Chandra Prakash Indian Institute of Technology Bombay, Debadatta Mishra, Purushottam Kulkarni Indian Institute of Technology, Bombay, Umesh Bellur IIT Bombay
10:35
20m
Talk
Container-aware I/O Stack: Bridging the Gap between Container Storage Drivers and Solid State Devices
Research Papers
Song Wu Huazhong University of Science and Technology, China, Zhuo Huang Huazhong University of Science and Technology, Pengfei Chen Huazhong University of Science and Technology, Hao Fan Huazhong University of Science and Technology, Shadi Ibrahim Inria, Hai Jin Huazhong University of Science and Technology
10:55
20m
Talk
ClusterRR: A Record and Replay Framework for Virtual Machine Cluster
Research Papers
Wei Wang Institute of Information Engineering, School of Cyber Security, University of Chinese Academy of Sciences, Zhiyu Hao Institute of Information Engineering, Chinese Academy of Sciences, Lei Cui Institute of Information Engineering, Chinese Academy of Sciences
11:15
20m
Talk
EOP: Efficient Operator Partition for Deep Learning Inference Over Edge Servers
Research Papers
Yuanjia XU University of Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences, Heng WU Institute of Software, Chinese Academy of Sciences, Wenbo ZHANG Institute of Software, Chinese Academy of Sciences; State Key Laboratory of Computer Sciences, Institute of Software, Chinese Academy of Sciences, Yi HU University of Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences

Information for Participants
Tue 1 Mar 2022 10:15 - 11:35 at Online - Session-1: System Virtualization Chair(s): Antonio Barbalace
Info for session

The Zoom room for Session 1 is at https://rochester.zoom.us/j/98375917164?pwd=ZHRvcy85elRVUWtDaGRZQkl6dENTQT09.