Attributed Multiplex Learning for Analogical Third-Party Library Recommendation and Retrieval
Third-party libraries (TPLs) play a critical role in modern software development by providing reusable code that accelerates project development. However, the sheer number of available TPLs makes it difficult to select an appropriate library for a given task or to find replacements for deprecated libraries. Existing methods are limited because they rely solely on either mining-based approaches or feature-based solutions. In this study, we propose an attributed multiplex learning approach that combines textual and relational data across multiple layers to perform effective analogical library recommendation and retrieval. By representing libraries as nodes with attributes and modeling cross-library relationships as graph edges, our method constructs an attributed multiplex network for learning TPL representation embeddings. The approach integrates these different aspects of information in a single, concise model. Because the model is inductive, it can also address cold-start issues, and it scales to a large number of libraries. To validate our approach, we conduct experiments, including an ablation study, within the NPM ecosystem. On a ground-truth data set of 8,308 libraries, our method achieves 89.8% on Hit@10. Additionally, we contribute a new data set of 4,070 migration rules extracted from deprecation messages, enriching the relatively small existing data sets in the NPM ecosystem. In summary, our approach is efficient and promising for supporting real-world, large-scale TPL recommendation and retrieval.
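To make the Hit@10 figure concrete, the following minimal Python sketch shows one common way such a metric is computed: a query library counts as a hit if at least one of its ground-truth analogs appears among the top-k recommendations. This definition is an assumption, since the abstract does not spell out the exact formulation used. The function name `hit_at_k` and the placeholder library names are hypothetical and are not drawn from the data sets described above.

```python
from typing import Dict, List, Set


def hit_at_k(
    ranked: Dict[str, List[str]],
    truth: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of query libraries with at least one true analog in the top-k.

    Assumed definition of Hit@k; `ranked` maps each query library to its
    recommendations ordered from most to least similar, and `truth` maps each
    query library to its set of ground-truth analogical libraries.
    """
    # Only score queries that have both a ranking and a ground-truth entry.
    queries = [q for q in truth if q in ranked]
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if set(ranked[q][:k]) & truth[q])
    return hits / len(queries)


# Toy data with placeholder names (hypothetical, for illustration only).
ranked = {
    "lib-a": ["lib-x", "lib-y", "lib-z"],
    "lib-b": ["lib-q", "lib-w", "lib-e"],
    "lib-c": ["lib-r", "lib-t", "lib-u"],
}
truth = {
    "lib-a": {"lib-x"},   # true analog ranked first
    "lib-b": {"lib-w"},   # true analog ranked second
    "lib-c": {"lib-m"},   # true analog never recommended
}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"Hit@{k} = {hit_at_k(ranked, truth, k):.3f}")
```

Under this definition, a query with several acceptable replacements is credited once if any of them is retrieved, which suits migration scenarios where a deprecated library may have more than one valid analog.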