APSEC 2024
Tue 3 - Fri 6 December 2024 China

In the open-source community, many models are available, each with unique characteristics, so selecting models that fit user requirements and data distributions is essential. However, existing model search methods often fail to accommodate diverse user requirements and varied data distributions, and they are slow. To address these issues, we introduce ModelCS, a two-stage framework for model search based on recall-ranking. Its key idea is to first screen the many candidate models using representation learning and then precisely rank the selected ones. Specifically, we study model feature extraction and representation methods. We construct a dataset for this study and propose a rule-based data augmentation method to enhance its diversity. Based on the augmented dataset, we conduct an empirical study and propose a multidimensional feature representation, which informs the design of ModelCS. The recall stage of ModelCS uses a preliminary screening method based on the multidimensional feature representation, while the ranking stage uses a ranking method that extends an existing method. We evaluate ModelCS on the multi-task model zoo in the PaddlePaddle framework. Experimental results indicate that, compared to existing methods, ModelCS can reduce search time by up to a factor of 500 and improve search effectiveness by up to 13.27%.
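The general recall-ranking pattern described above can be illustrated with a minimal Python sketch. This is not the ModelCS implementation: the feature vectors, the use of cosine similarity for screening, the `score` callback, and all function and model names are assumptions made for illustration only. The point it shows is that the expensive ranking step runs on a small shortlist rather than on every model.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec: Sequence[float],
           model_vecs: Dict[str, Sequence[float]],
           k: int) -> List[str]:
    """Stage 1 (assumed form): cheap screening by vector similarity, keep the top k."""
    ordered = sorted(model_vecs,
                     key=lambda name: cosine(query_vec, model_vecs[name]),
                     reverse=True)
    return ordered[:k]


def rank(candidates: List[str],
         score: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Stage 2 (assumed form): run the expensive scorer on the shortlist only."""
    scored = [(name, score(name)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def search(query_vec: Sequence[float],
           model_vecs: Dict[str, Sequence[float]],
           score: Callable[[str], float],
           k: int = 2) -> List[Tuple[str, float]]:
    """Recall first, then rank the recalled candidates."""
    return rank(recall(query_vec, model_vecs, k), score)


if __name__ == "__main__":
    # Hypothetical model representations and scores, invented for this example.
    vecs = {"model_a": [1.0, 0.0], "model_b": [0.9, 0.1], "model_c": [0.0, 1.0]}
    expensive_scores = {"model_a": 0.70, "model_b": 0.85, "model_c": 0.99}
    print(search([1.0, 0.0], vecs, expensive_scores.__getitem__, k=2))
```

In this toy setup `model_c` is dropped during recall, so the scorer is never called on it. Any speedup in a real system would depend on how much cheaper the screening step is than the ranking step and on how small the shortlist can be made without losing good candidates.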