Understanding and Utilizing Hardware Transactional Memory Capacity
Hardware transactional memory (HTM) provides a simpler programming model than lock-based synchronization. However, hardware limits mean that HTM transactions may suffer costly capacity aborts. Understanding HTM capacity is therefore critical to using HTM effectively. Unfortunately, crucial implementation details are undisclosed, and in practice HTM capacity can manifest in confounding ways. It is therefore unsurprising that the literature reports results that appear highly contradictory, with reported capacities that vary by nearly three orders of magnitude. We conduct an in-depth study into the causes of HTM capacity aborts using four generations of Intel’s Transactional Synchronization Extensions (TSX). We identify the apparent contradictions among prior work and, by extending their methodologies, shed new light on the likely causes of HTM capacity aborts. In doing so, we reconcile the apparent contradictions. We focus on how replacement policies and the status of the cache can affect HTM capacity.
One source of surprising behavior appears to be the cache replacement policies used by the processors we evaluated. Both invalidating the cache and warming it up with the transactional working set can significantly improve the read capacity of transactions across the microarchitectures we tested. A further complication is that a physically indexed last-level cache (LLC) will typically yield only half the total LLC capacity. We found that methodological differences in the prior work led to different warm-up states and thus to their apparently contradictory findings. This paper deepens our understanding of how the underlying implementation and cache behavior affect the apparent capacity of HTM. Our insights on how to improve the read capacity of transactions can be used to optimize HTM applications, particularly those with large read transactions, such as those used in optimistic parallelization.
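To make the point about physical indexing more concrete, the following is a toy model only, not the methodology or results of the paper. It assumes a set-associative cache indexed by physical line number, 4 KiB pages, 64-byte lines, and a transaction that aborts as soon as one cache set would need to hold more read-set lines than it has ways. The geometry (1024 sets, 8 ways), the random placement of physical page frames, and the function names are all assumptions chosen for illustration.

```python
import random

LINE_BYTES = 64
PAGE_BYTES = 4096
LINES_PER_PAGE = PAGE_BYTES // LINE_BYTES


def lines_before_overflow(num_sets, ways, frames):
    """Count lines read, page by page, until some set would need more than `ways` lines."""
    occupancy = [0] * num_sets
    count = 0
    for frame in frames:
        first_line = frame * LINES_PER_PAGE
        for offset in range(LINES_PER_PAGE):
            # Set index derived from the physical line number (assumed indexing scheme).
            index = (first_line + offset) % num_sets
            if occupancy[index] == ways:
                return count  # modelled as a capacity abort
            occupancy[index] += 1
            count += 1
    return count


def mean_random_fraction(num_sets, ways, trials=50, seed=0):
    """Average fraction of the cache usable when page frames are scattered at random."""
    rng = random.Random(seed)
    capacity = num_sets * ways
    pages = 2 * capacity // LINES_PER_PAGE  # more pages than the cache can hold
    total = 0
    for _ in range(trials):
        # Distinct physical frames drawn from a larger pool (hypothetical OS behavior).
        frames = rng.sample(range(pages * 64), pages)
        total += lines_before_overflow(num_sets, ways, frames)
    return total / (trials * capacity)


if __name__ == "__main__":
    sets, ways = 1024, 8
    capacity = sets * ways
    pages = 2 * capacity // LINES_PER_PAGE
    contiguous = lines_before_overflow(sets, ways, range(pages))
    print("contiguous frames:", contiguous / capacity)
    print("random frames (mean):", round(mean_random_fraction(sets, ways), 2))
```

In this model, physically contiguous frames fill every set evenly and reach the full capacity, whereas randomly scattered frames overflow one set early and leave part of the cache unused. The exact fraction depends on the assumed geometry and is not a prediction for any real processor.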
Conference Day: Tue 22 Jun. Displayed time zone: Eastern Time (US & Canada)
18:00 - 21:00
|Exploiting Intel Optane Persistent Memory for Full Text Search|
Shoaib Akram (Australian National University)
|Understanding and Utilizing Hardware Transactional Memory Capacity|
Zixian Cai (Australian National University), Steve Blackburn (Australian National University), Michael D. Bond (Ohio State University, USA)
|Fusuma: Double-ended Threaded Compaction|