ModularEvo: Evolving Multi-Task Models via Neural Network Modularization and Composition
Training a general multi-task deep neural network (DNN) model, such as a large language model, and deploying it across diverse downstream tasks has become common practice. In long-term deployment, downstream tasks change over time, e.g., through new data distributions and requirements, requiring the model to be fine-tuned accordingly, i.e., to evolve. However, traditional full-parameter fine-tuning adapts the model to individual tasks, degrading its original knowledge. Although parameter-efficient fine-tuning methods mitigate this problem, they still isolate new knowledge in external, separate parameters, so the base model gains little cumulative benefit from downstream updates. These limitations stem from indiscriminate model deployment and fine-tuning.
Inspired by modular design principles in software engineering, we propose ModularEvo, a framework that enables on-demand deployment and co-evolution of multi-task models and their modules across diverse downstream tasks. ModularEvo first decomposes the model into task-specific modules, each retaining a subset of relevant weights and the corresponding functionality. These modules, rather than the entire model, are deployed to downstream tasks on demand. During long-term deployment, each module is optimized independently to adapt to changes in its task. Unlike conventional fine-tuning, ModularEvo applies modular fine-tuning, which updates only the task-relevant weights within each module. New knowledge acquired by the modules is then periodically integrated back into the model, enabling the co-evolution of both the model and its modules.
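The decompose / modular-fine-tune / integrate cycle above can be illustrated with a minimal sketch. This is not ModularEvo's actual algorithm; it assumes, for illustration only, that a module is identified by a binary mask over one weight matrix, that fine-tuning is a single masked gradient step, and that integration writes the masked weights back into the base model.

```python
import numpy as np

rng = np.random.default_rng(0)
base_weights = rng.normal(size=(4, 4))   # base model weights (one layer, for illustration)
task_mask = rng.random((4, 4)) < 0.5     # hypothetical mask marking task-relevant weights

# Decompose: the module retains only the task-relevant subset of weights.
module_weights = np.where(task_mask, base_weights, 0.0)

# Modular fine-tuning: a gradient step applied only to task-relevant weights.
grad = rng.normal(size=(4, 4))           # stand-in for a gradient from the task loss
lr = 0.1
module_weights -= lr * grad * task_mask

# Integration: write the updated task-relevant weights back into the base model.
evolved_weights = np.where(task_mask, module_weights, base_weights)

# Weights outside the mask are untouched, preserving the original knowledge.
assert np.allclose(evolved_weights[~task_mask], base_weights[~task_mask])
```

The key property the sketch demonstrates is locality: only the masked (task-relevant) entries change during fine-tuning and integration, so knowledge stored in the remaining weights is preserved.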
We evaluate ModularEvo through extensive experiments on three Transformer models and six downstream tasks, covering both classification and generation. The results demonstrate the effectiveness of ModularEvo in terms of model performance and inference efficiency in evolution scenarios: compared to state-of-the-art baselines, ModularEvo achieves an absolute performance gain of 2.34% in multi-round evolution and a 2.22x inference speedup.