Uninterrupted uptime is a critical aspect of Virtual Machines (VMs) offered by cloud hosting providers. Our VMs run on top of rapidly changing infrastructure: we regularly update hardware and host software, and we must quickly respond to failing hardware. Frequent change is critical to both development velocity—deploying new versions of services and infrastructure—and the ability to respond rapidly to defects, including critical security fixes. Typically these updates would be disruptive, resulting in VM termination or restart. In this paper we present how we use VM live migration at scale to eliminate this disruption with minimal impact to the guest, performing over 1,000,000 migrations monthly in our production fleet, with 50ms median blackout, 300ms 99th percentile blackout.
Sun 25 Mar (displayed time zone: Eastern Time (US & Canada))
14:00 - 15:30
14:00 | 30m | Talk | gMig: Efficient GPU Live Migration Optimized by Software Dirty Page for Full Virtualization | Research Papers | Jiacheng Ma; Xiao Zheng (Intel Corporation); Yaozu Dong (Intel Asia-Pacific R&D Ltd, China); Wentai Li (Shanghai Jiao Tong University); Zhengwei Qi (Shanghai Jiao Tong University); Bingsheng He (National University of Singapore); Haibing Guan (Shanghai Jiao Tong University)
14:30 | 30m | Talk | VM Live Migration At Scale | Research Papers | Adam Ruprecht (Google); Danny Jones (Google); Dmitry Shiraev (Google); Greg Harmon (Google); Maya Spivak (Google); Michael Krebs (Google); Miche Baker-Harvey (Google); Tyler Sanderson (Google)
15:00 | 30m | Talk | Demon: An Efficient Solution for on-Device MMU Virtualization in Mediated Pass-Through | Research Papers | Yu Xu (Shanghai Jiao Tong University); Jianguo Yao (Shanghai Jiao Tong University); Yaozu Dong (Intel Asia-Pacific R&D Ltd, China); Kun Tian (Intel Corporation); Xiao Zheng (Intel Corporation); Haibing Guan (Shanghai Jiao Tong University)