Deploying Language Models on Android-Based Edge Devices: A Practical Evaluation Pipeline
This program is tentative and subject to change.
Small Language Models (SLMs) are increasingly being deployed on mobile and edge devices, offering benefits such as reduced infrastructure costs, improved privacy, and offline access. However, deploying these models on resource-constrained hardware presents challenges in performance, stability, and memory usage that remain underexplored in real-world scenarios. This study proposes a replicable evaluation pipeline to identify SLMs suitable for deployment on resource-constrained Android edge devices, primarily using Android TV as a low-resource platform. The first phase evaluates SLMs (65M–1B parameters) on a smart TV with an ARM Cortex-A55 processor, collecting performance and quality metrics. This phase reveals critical constraints related to memory footprint and inference quality. The second phase presents a comparative analysis of inference acceleration techniques across consumer Android devices, such as TVs and smartphones. It systematically evaluates the impact of low-level linear algebra libraries and various inference engine implementations executed on Android TV. Results show that choosing the right inference engine (MNN, in our context) yields speedups of up to 12× over alternatives. These findings offer actionable guidance for deploying SLMs in production on constrained devices. The pipeline supports automation and reproducibility, aiding real-world integration of generative AI at the edge.
Tue 18 Nov (displayed time zone: Seoul)
16:00 - 17:00
16:00 | 10m Talk | Adaptive Performance Regression Detection via Semi-Supervised Siamese Learning | Industry Showcase | Yongqian Sun Nankai University, Mengyao Li Nankai University, Xiao Xiong Nankai University, Lei Tao Nankai University, Yimin Zuo Nankai University, Wenwei Gu The Chinese University of Hong Kong, Shenglin Zhang Nankai University, Junhua Kuang Nankai University, Yu Luo Nankai University, Huandong Zhuang Huawei Cloud, Bowen Deng Huawei Cloud, Dan Pei Tsinghua University
16:10 | 10m Talk | Deploying Language Models on Android-Based Edge Devices: A Practical Evaluation Pipeline | Industry Showcase | Suayder Costa Venturus - Innovation & Technology, Igor Lima Venturus - Innovation & Technology, William Harada Venturus - Innovation & Technology, Mateus Lucena Venturus - Innovation & Technology, Arthur Alves Venturus - Innovation & Technology, Ruan Belem TPV Technology, Agemilson Pimentel TPV Technology, Rômulo Fabrício TPV Technology, Alexandre Miranda Paulo Feitoza Foundation - FPFTech, Daniel Lins Venturus - Innovation & Technology, Frederico Goncalves Venturus - Innovation & Technology, Sidney Leal Venturus - Innovation & Technology
16:20 | 10m Talk | How Can Infrastructure as Code Accelerate Data Center Bring-ups? A Case Study at ByteDance | Industry Showcase | Xianhao Jin ByteDance, Yifei Feng ByteDance, Yufei Gao ByteDance, Yongning Hu ByteDance, Jie Huang ByteDance, Kun Xia ByteDance, Luchuan Guo ByteDance
16:30 | 10m Talk | MobileUPReg: Identifying User-Perceived Performance Regressions in Mobile OS Versions | Industry Showcase | Wei Liu Concordia University, Montreal, Canada, Yi Wen HENG Concordia University, Feng Lin Concordia University, Tse-Hsun (Peter) Chen Concordia University, Ahmed E. Hassan Queen’s University
16:40 | 10m Talk | Context-Aware CodeLLM Eviction for AI-assisted Coding | Industry Showcase | Kishanthan Thangarajah Centre for Software Excellence, Huawei Canada, Boyuan Chen Centre for Software Excellence, Huawei Canada, Shi Chang University of Western Ontario, Ahmed E. Hassan Queen’s University
16:50 | 10m Talk | Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute | Industry Showcase | Yingwei Ma Tongyi Lab, Alibaba, Yongbin Li Tongyi Lab, Alibaba, China, Yihong Dong Peking University, Xue Jiang, Yanhao Li Tongyi Lab, Alibaba, Yue Liu Monash University, Rongyu Cao Tongyi Lab, Alibaba, China, Jue Chen Tongyi Lab, Alibaba, China, Fei Huang Tongyi Lab, Alibaba, China, Binhua Li Tongyi Lab, Alibaba, China