Understanding and Utilizing Hardware Transactional Memory Capacity (ISMM 2021)

Who

Zixian Cai, Steve Blackburn, Michael D. Bond

Track

ISMM 2021

When

Tue 22 Jun 2021 18:30 - 19:00 at ISMM - Session 4: Compacting/Indexing/Transactioning & Closing Chair(s): Timothy M. Jones

Abstract

Hardware transactional memory (HTM) provides a simpler programming model than lock-based synchronization. However, hardware limits mean that HTM transactions may suffer costly capacity aborts. Understanding HTM capacity is therefore critical to utilizing HTM. Unfortunately, crucial implementation details are undisclosed, and in practice HTM capacity can manifest in confounding ways. It is therefore unsurprising that the literature reports results that appear highly contradictory, with reported capacities that vary by nearly three orders of magnitude. We conduct an in-depth study into the causes of HTM capacity aborts using four generations of Intel’s Transactional Synchronization Extensions (TSX). We identify the apparent contradictions among prior work and, by extending their methodologies, are able to shed new light on the likely causes of HTM capacity aborts. In doing so, we reconcile the apparent contradictions. We focus on how replacement policies and the status of the cache can affect HTM capacity.

One source of surprising behavior appears to be the cache replacement policies used by the processors we evaluated. Both invalidating the cache and warming it up with the transactional working set can significantly improve the read capacity of transactions across the microarchitectures we tested. A further complication is that a physically indexed last-level cache (LLC) will typically yield only half the total LLC capacity. We found that methodological differences in the prior work led to different warm-up states and thus to their apparently contradictory findings. This paper deepens our understanding of how the underlying implementation and cache behavior affect the apparent capacity of HTM. Our insights on how to improve the read capacity of transactions can be used to optimize HTM applications, particularly those with large read transactions, such as those used in optimistic parallelization.
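The remark about physically indexed caches can be illustrated with a small toy model. The sketch below is not the authors' methodology and does not reproduce their measurements. It assumes, purely for illustration, that each page of the read set lands on a uniformly random "page color" (a group of cache sets), that every color holds at most `ways` pages, and that a transaction aborts as soon as one color overflows. It ignores replacement policy and warm-up state entirely. The names `pages_until_overflow` and `mean_usable_fraction` and the parameter values are hypothetical.

```python
import random


def pages_until_overflow(num_colors, ways, rng):
    """Count pages placed before some page color would exceed its associativity.

    Assumption: each page maps to a uniformly random color, standing in for
    an unpredictable virtual-to-physical page mapping.
    """
    occupancy = [0] * num_colors
    pages = 0
    while True:
        color = rng.randrange(num_colors)
        if occupancy[color] == ways:
            # The next page would overflow this color; the toy model
            # treats this as a capacity abort.
            return pages
        occupancy[color] += 1
        pages += 1


def mean_usable_fraction(num_colors, ways, trials=200, seed=0):
    """Average fraction of nominal capacity reached before the first overflow."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = sum(pages_until_overflow(num_colors, ways, rng) for _ in range(trials))
    return total / (trials * num_colors * ways)


if __name__ == "__main__":
    # Hypothetical geometry: 32 page colors, 16-way associativity.
    print(mean_usable_fraction(num_colors=32, ways=16))
```

With a single color the model always reaches the full nominal capacity. With many colors, uneven placement makes one color fill up before the others, so the usable fraction falls below 1. The exact fraction depends on the assumed geometry and should not be read as a prediction for any real processor.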

Link to Publication

https://dl.acm.org/doi/10.1145/3459898.3463901

DOI

https://doi.org/10.1145/3459898.3463901

Zixian Cai

Australian National University

Australia

Steve Blackburn

Australian National University

Australia

Michael D. Bond

Ohio State University, USA
