ALOGO: A Novel and Effective Framework for Online Cross-project Defect Prediction
Cross-project defect prediction (CPDP) trains a model on historical defect data collected from source projects and then applies it to a target project. However, existing CPDP methods are generally developed for offline scenarios, in which the model is fixed after training and cannot be updated as labeled target instances arrive. In practice, the labels of target instances usually arrive online in a streaming manner and can be used to update the CPDP model, improving defect prediction on the next unlabeled target instance. To bridge this gap, we propose a novel and effective online cross-project defect prediction framework named ALOGO. ALOGO comprises two essential phases, an offline CPDP phase and an online within-project defect prediction (WPDP) phase, which are combined by an adaptive weighted adjustment mechanism. In the offline CPDP phase, global offline defect knowledge is learned with an offline CPDP model by minimizing the difference between the source and target datasets. In the online WPDP phase, local online defect knowledge is learned with an online WPDP model. These two kinds of defect knowledge are then combined to obtain the latest and most valuable defect knowledge. Experimental results on 27 defect datasets show that ALOGO improves on the existing state-of-the-art online CPDP model by 31.2% in terms of MCC and also outperforms the baseline on four other well-known measures. We further conclude that 1) performing online CPDP is necessary, and 2) ALOGO is a promising alternative for online CPDP.
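
To make the two-phase idea concrete, the following is a minimal sketch of how a frozen offline model and an incrementally updated online model could be blended with an adaptive weight. It is not ALOGO's actual mechanism, which the abstract does not specify: the exponential (Hedge-style) weight update, the logistic-regression learners, and all names (`OnlineLogReg`, `AdaptiveCombiner`, `alpha`, `eta`) are assumptions chosen for illustration, and the toy data is invented.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogReg:
    """Tiny logistic regression updated one instance at a time with SGD."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, x, y):
        # Gradient of the log loss with respect to the linear score.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


class AdaptiveCombiner:
    """Blend a frozen offline model with an online model (assumed scheme).

    alpha is the weight on the offline model; it shrinks whenever the
    offline model incurs a larger loss than the online model on a newly
    labeled target instance, and grows in the opposite case.
    """

    def __init__(self, offline_model, online_model, eta=1.0, eps=1e-6):
        self.offline = offline_model
        self.online = online_model
        self.eta = eta
        self.eps = eps
        self.alpha = 0.5

    def predict_proba(self, x):
        p_off = self.offline.predict_proba(x)
        p_on = self.online.predict_proba(x)
        return self.alpha * p_off + (1.0 - self.alpha) * p_on

    def observe(self, x, y):
        """Consume one labeled target instance: reweight, then learn."""
        loss_off = abs(self.offline.predict_proba(x) - y)
        loss_on = abs(self.online.predict_proba(x) - y)
        w_off = self.alpha * math.exp(-self.eta * loss_off)
        w_on = (1.0 - self.alpha) * math.exp(-self.eta * loss_on)
        alpha = w_off / (w_off + w_on)
        # Clamp so that neither model is ever discarded permanently.
        self.alpha = min(max(alpha, self.eps), 1.0 - self.eps)
        self.online.update(x, y)  # only the online model keeps learning


# Offline phase (toy): fit on source data, then freeze the model.
offline = OnlineLogReg(n_features=1)
for _ in range(50):
    for x, y in [([1.0], 1), ([-1.0], 0)]:
        offline.update(x, y)

# Online phase (toy): the target stream follows the opposite pattern,
# so the weight on the offline model should decay.
combiner = AdaptiveCombiner(offline, OnlineLogReg(n_features=1))
for _ in range(20):
    for x, y in [([1.0], 0), ([-1.0], 1)]:
        p = combiner.predict_proba(x)  # predict before the label arrives
        combiner.observe(x, y)         # then use the label to adapt
```

The loop follows the predict-then-update order implied by the streaming setting: each target instance is scored before its label is known, and the label is used only afterwards to adjust the weight and the online model. In this toy run the source and target patterns are deliberately opposed, so `combiner.alpha` falls below its initial value of 0.5.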