ISMM 2018
co-located with PLDI 2018
Mon 18 Jun 2018 15:00 - 15:30 at Discovery AB - Optimizing for the Web and the Cloud Chair(s): Christine H. Flood

While single-machine MapReduce systems can squeeze maximum performance out of the available cores, they are often limited by the size of main memory and can thus only process small datasets. Our experience shows that the state-of-the-art single-machine in-memory MapReduce system Metis frequently experiences out-of-memory crashes. Even though today's computers are equipped with efficient secondary storage devices, these frameworks do not utilize them, mainly because disk access latencies are much higher than those of main memory. Therefore, the single-machine setup of the Hadoop system performs much slower when presented with datasets that are larger than main memory. Moreover, such frameworks require tuning many parameters, which puts an added burden on the programmer. In this paper we present OMR, an Out-of-core MapReduce system that not only successfully handles datasets far larger than main memory but also guarantees linear scaling with growing data sizes. OMR actively minimizes the amount of data read from and written to disk via on-the-fly aggregation, and it uses block-sequential disk read/write operations whenever disk accesses become necessary to avoid running out of memory. We theoretically prove OMR's linear scalability and empirically demonstrate it by processing datasets that are up to 5x larger than main memory. Our experiments show that, in comparison to the standalone single-machine setup of the Hadoop system, OMR delivers far higher performance. Also, in contrast to Metis, OMR avoids out-of-memory crashes for large datasets and delivers higher performance when datasets are small enough to fit in main memory.
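To make the two ideas in the abstract concrete (aggregating values as they arrive, and touching disk only through sequential block writes and reads), here is a minimal Python sketch. It is not OMR's implementation: the function names (`spill`, `read_run`, `out_of_core_reduce`, `word_count`), the `max_keys` threshold standing in for a memory budget, and the use of sorted run files merged at the end are all assumptions chosen for illustration.

```python
import heapq
import os
import pickle
import tempfile
from functools import reduce
from itertools import groupby
from operator import itemgetter


def spill(table, spill_dir):
    """Write the in-memory table to disk as one sorted run (a single sequential write)."""
    fd, path = tempfile.mkstemp(dir=spill_dir, suffix=".run")
    with os.fdopen(fd, "wb") as f:
        for key in sorted(table):
            pickle.dump((key, table[key]), f)
    return path


def read_run(path):
    """Stream (key, value) pairs back from a run file, front to back."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def out_of_core_reduce(pairs, combine, max_keys, spill_dir):
    """Aggregate (key, value) pairs while holding at most max_keys keys in memory.

    max_keys is a hypothetical stand-in for a real memory budget.
    """
    table, runs = {}, []
    for key, value in pairs:
        # On-the-fly aggregation: fold the value into the existing entry
        # instead of buffering every emitted pair.
        table[key] = combine(table[key], value) if key in table else value
        if len(table) >= max_keys:
            runs.append(spill(table, spill_dir))
            table = {}
    # Merge the sorted runs and the leftover in-memory entries in key order,
    # combining partial results that belong to the same key.
    streams = [read_run(p) for p in runs] + [iter(sorted(table.items()))]
    merged = heapq.merge(*streams, key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, reduce(combine, (value for _, value in group))


def word_count(lines, max_keys=1000):
    """Example job: count words, spilling partial counts to a temporary directory."""
    with tempfile.TemporaryDirectory() as spill_dir:
        pairs = ((word, 1) for line in lines for word in line.split())
        return dict(out_of_core_reduce(pairs, lambda a, b: a + b, max_keys, spill_dir))
```

Calling `word_count(["a b a", "c a b"], max_keys=2)` forces several spills even on this tiny input, which makes the merge path easy to exercise. Each input pair is folded once and each spilled entry is written once and read once, which is the intuition behind a running time that grows linearly with input size; the paper's own proof and design may differ from this sketch.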

Mon 18 Jun

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
Optimizing for the Web and the Cloud
ISMM 2018 at Discovery AB
Chair(s): Christine H. Flood Red Hat
14:00
30m
Talk
Hardware-Software Co-optimization of Memory Management in Dynamic Languages
ISMM 2018
Mohamed Ismail Cornell University, USA, G. Edward Suh Cornell University, USA
14:30
30m
Talk
Dynamic Vertical Memory Scalability for OpenJDK Cloud Applications
ISMM 2018
Rodrigo Bruno INESC-ID / Instituto Superior Técnico, University of Lisbon, Paulo Ferreira INESC-ID / Instituto Superior Técnico, University of Lisbon, Ruslan Synytsky Jelastic, n.n., Tetiana Fydorenchyk Jelastic, n.n., Jia Rao University of Texas at Arlington, USA, Hang Huang Huazhong University of Science and Technology, China, Song Wu Huazhong University of Science and Technology, China
15:00
30m
Talk
OMR: Out-of-Core MapReduce for Large Data Sets
ISMM 2018
Gurneet Kaur, Keval Vora University of California, Riverside, Sai Charan Koduru University of California, Riverside, Rajiv Gupta UC Riverside