ISMM 2024
Tue 25 Jun 2024 Copenhagen, Denmark
co-located with PLDI 2024
Tue 25 Jun 2024 16:40 - 17:00 at Iceland - ISMM: Session 4 - Potpourri Chair(s): Tony Hosking

Recent advances in large language models have demonstrated remarkable effectiveness in information retrieval (IR) tasks. While many neural IR systems encode queries and documents into single-vector representations, multi-vector models improve retrieval quality by producing multi-vector representations and performing similarity search at the granularity of individual tokens. However, these models increase the memory requirements of retrieval indices by an order of magnitude, which makes multi-vector IR models increasingly difficult to scale. We introduce Embedding from Storage Pipelined Network (ESPN), in which we offload the entire re-ranking embedding tables to SSDs and reduce the memory requirements by 5-16x. We design a flexible software prefetcher, applicable to any search based on hierarchical clustering, that achieves hit rates exceeding 90%. ESPN improves SSD-based retrieval by up to 6.4x and end-to-end throughput by 68%, maintaining near-memory levels of query latency even for large query batch sizes. The code is available at https://github.com/susavlsh10/ESPN-v1.
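
To make the contrast between single-vector and multi-vector retrieval concrete, the following is a minimal sketch, not the authors' implementation. It assumes late-interaction (MaxSim) scoring, a common token-level scheme that the abstract does not name, and uses hypothetical index parameters (document count, vectors per document, dimension, bytes per value) purely to show why storing one vector per token inflates the index.

```python
# Illustrative sketch only. MaxSim scoring and all sizes below are
# assumptions for illustration; they are not taken from the ESPN paper.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def single_vector_score(query_vec, doc_vec):
    """Single-vector retrieval: one similarity per (query, document) pair."""
    return dot(query_vec, doc_vec)

def maxsim_score(query_tokens, doc_tokens):
    """Token-level scoring: for each query token take its best-matching
    document token, then sum those maxima over the query tokens."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

def index_bytes(num_docs, vectors_per_doc, dim, bytes_per_value):
    """Uncompressed size of an embedding table."""
    return num_docs * vectors_per_doc * dim * bytes_per_value

# Toy 2-dimensional token embeddings (hypothetical values).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
score = maxsim_score(query_tokens, doc_tokens)

# Hypothetical corpus: 1M documents, 128-dim embeddings, 2 bytes per value.
single = index_bytes(1_000_000, 1, 128, 2)   # one vector per document
multi = index_bytes(1_000_000, 16, 128, 2)   # 16 token vectors per document
ratio = multi / single

print(f"MaxSim score: {score:.2f}")
print(f"Index growth factor: {ratio:.0f}x")
```

In this uncompressed setting the growth factor is simply the number of vectors stored per document, which is why keeping the token-level embedding table in memory becomes the bottleneck that offloading to SSD is meant to relieve.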

Tue 25 Jun

Displayed time zone: Windhoek

16:00 - 17:00
ISMM: Session 4 - Potpourri, ISMM 2024, at Iceland
Chair(s): Tony Hosking (Australian National University)
16:00 (20m) Talk: SSRD: Shapes and Summaries for Race Detection in Concurrent Data Structures (Remote)
Xiaofan Sun (University of California at Riverside), Rajiv Gupta (University of California at Riverside)
16:20 (20m) Talk: A Heuristic for Periodic Memory Allocation with Little Fragmentation to Train Neural Networks
Akifumi Imanishi (Preferred Networks), Zijian Xu (Preferred Networks)
16:40 (20m) Talk: ESPN: Memory-Efficient Multi-vector Information Retrieval
Susav Shrestha (Texas A&M University), Narasimha Reddy (Texas A&M University), Zongwang Li (Samsung)