Sun 14 Apr 2019 13:30 - 13:55 at Garden Room - Session2

The rise in instruction set architecture (ISA) diversity and the growing
adoption of virtual machines are driving a need for fast, scalable,
full-system, cross-ISA emulation and instrumentation tools. Unfortunately,
achieving high performance for these cross-ISA tools is challenging due to
dynamic binary translation (DBT) overhead and the complexity of instrumenting
full-system emulators.

In this paper we improve cross-ISA emulation and instrumentation performance
through three novel techniques. First, we increase floating point (FP)
emulation performance by observing that most FP operations can be correctly
emulated by surrounding the use of the host FP unit with a minimal amount
of non-FP code. Second, we introduce the design of a
translator with a shared code cache that scales for multi-core guests, even
when they generate translated code in parallel at a high rate. Third, we present
an ISA-agnostic instrumentation layer that can instrument guest operations
that occur outside of the DBT's intermediate representation (IR), which are
common in full-system emulators.

We implement our approach in Qelt, a high-performance cross-ISA machine emulator
and instrumentation tool based on QEMU. Our results show that Qelt scales
to 32 cores when emulating a guest machine used for parallel compilation,
which demonstrates scalable code translation. Furthermore, experiments
based on SPEC06 show that Qelt (1) outperforms QEMU as a full-system cross-ISA
machine emulator by $1.76\times$/$2.18\times$ for integer/FP workloads,
(2) outperforms state-of-the-art, cross-ISA, full-system instrumentation
tools by $1.5\times$-$3\times$, and (3) can match the performance of Pin, a
state-of-the-art, same-ISA DBI tool, when used for complex instrumentation such
as cache simulation.

Displayed time zone: Eastern Time (US & Canada)

13:30 - 15:35
13:30
25m
Talk
Cross-ISA Machine Instrumentation Using Fast and Scalable Dynamic Binary Translation
Research Papers
Emilio G. Cota Columbia University, USA, Luca P. Carloni Columbia University, USA
13:55
25m
Talk
The Janus Triad: Exploiting Parallelism through Dynamic Binary Modification
Research Papers
Ruoyu Zhou University of Cambridge, UK, George Wort University of Cambridge, UK, Marton Erdos University of Cambridge, UK, Timothy M. Jones University of Cambridge, UK
14:20
25m
Talk
Mitigating JIT Compilation Latency in Virtual Execution Environments
Research Papers
Martin Kristien University of Edinburgh, UK, Tom Spink University of Edinburgh, UK, Harry Wagstaff University of Edinburgh, UK, Björn Franke University of Edinburgh, UK, Igor Böhm Synopsys, Austria, Nigel Topham University of Edinburgh, UK
14:45
25m
Talk
ScissorGC: Scalable and Efficient Compaction for Java Full Garbage Collection
Research Papers
Haoyu Li Shanghai Jiao Tong University, China, Mingyu Wu Shanghai Jiao Tong University, China, Binyu Zang Shanghai Jiao Tong University, China, Haibo Chen Shanghai Jiao Tong University, China
15:10
25m
Talk
Stochastic Resource Allocation
Research Papers
Liran Funaro Technion, Israel, Orna Agmon Ben-Yehuda Technion, Israel, Assaf Schuster Technion, Israel