Software Pre-execution for Irregular Memory Accesses in the HBM Era
The introduction of High Bandwidth Memory (HBM) necessitates intelligent software prefetching in irregular applications to utilize the surplus bandwidth. In this work, we propose Software Pre-execution (SPE), a technique that pre-executes a minimal copy of the loop of concern (which we call the pre-execution loop) in order to prefetch irregular accesses. The compiler complements this by enforcing a certain prefetch distance through a priori strip-mining of the original loop, so that execution of the pre-execution loop is interspersed with the main loop to ensure timely prefetches. We find that this approach provides natural advantages over prior art, such as preservation of loop vectorization, handling of short loops, avoidance of performance bottlenecks, amenability to threading and, most importantly, effective coverage. We demonstrate these advantages using a variety of benchmarks on Fujitsu’s A64FX processor with HBM2 memory, where we outperform prior art by 1.3x and 1.2x when using small and huge pages, respectively. Simulations further show that our approach holds stronger promise on upcoming processors with HBM2e.
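The idea in the abstract can be illustrated with a minimal hand-written sketch in C. This is not the authors' compiler transformation: the kernel (an indirect gather-and-sum over a[idx[i]]), the function names, the strip length STRIP, and the choice to prefetch one strip ahead of the main loop are all assumptions made for illustration, and __builtin_prefetch is the GCC/Clang builtin rather than anything named in the paper.

```c
#include <stddef.h>

/* Hypothetical strip length standing in for the prefetch distance. */
#define STRIP 64

/* Baseline kernel (assumed for illustration): irregular gather and sum. */
double gather_sum(const double *a, const int *idx, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[idx[i]];
    return sum;
}

/* Pre-execution loop: a minimal copy of the main loop that only forms
 * the irregular addresses and issues prefetches for them. */
static void pre_execute(const double *a, const int *idx,
                        size_t begin, size_t end) {
    for (size_t i = begin; i < end; i++)
        __builtin_prefetch(&a[idx[i]], 0, 3); /* read access, keep in cache */
}

/* Strip-mined version: the pre-execution loop for the next strip runs
 * before the main loop of the current strip, so the two are interspersed. */
double gather_sum_spe(const double *a, const int *idx, size_t n) {
    double sum = 0.0;

    /* Prime the first strip. */
    pre_execute(a, idx, 0, n < STRIP ? n : STRIP);

    for (size_t s = 0; s < n; s += STRIP) {
        size_t end = (s + STRIP < n) ? s + STRIP : n;
        size_t next_end = (end + STRIP < n) ? end + STRIP : n;

        /* Prefetch the irregular accesses of the next strip (empty if none). */
        pre_execute(a, idx, end, next_end);

        /* Main loop for the current strip; a simple inner loop like this
         * remains a candidate for vectorization. */
        for (size_t i = s; i < end; i++)
            sum += a[idx[i]];
    }
    return sum;
}
```

Both functions accumulate in the same order, so they return identical results; only the memory access timing differs. In practice the strip length would be tuned to the memory latency and the cache capacity of the target machine.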
Wed 6 Apr (displayed time zone: Eastern Time, US & Canada)
13:00 - 14:00 | Session 6: Performance Optimizations | CC Research Papers at CC Virtual Room | Chair(s): Doru Thom Popovici (Lawrence Berkeley National Lab)
13:00 | 15m | Paper | Loner: Utilizing the CPU Vector Datapath to Process Scalar Integer Data | Armand Behroozi (University of Michigan), Sunghyun Park (University of Michigan), Scott Mahlke (University of Michigan)
13:15 | 15m | Paper | Mapping Parallelism in a Functional IR through Constraint Satisfaction | Naums Mogers (University of Edinburgh), Lu Li (University of Edinburgh), Valentin Radu (University of Sheffield), Christophe Dubach (McGill University)
13:30 | 15m | Paper | Software Pre-execution for Irregular Memory Accesses in the HBM Era
13:45 | 15m | Paper | Efficient Profile-Guided Size Optimization for Native Mobile Applications