LogOnline: A Semi-supervised Log-based Anomaly Detector Aided with Online Learning Mechanism
Recorded talk
Logs are prevalent in modern cloud systems and serve as a valuable source of information for system maintenance. Over the years, considerable research and industrial effort has been devoted to log-based anomaly detection. By analyzing the limitations of existing approaches, we find that most of them still suffer from practical issues and are thus hard to apply in real-world scenarios. For example, supervised approaches depend on a large amount of labeled log data for training, which can require substantial manual labeling effort. In addition, log instability, a pervasive issue in real-world systems, poses a great challenge to existing methods, especially in the presence of many dissimilar new log events. To overcome these problems, we propose LogOnline, a semi-supervised anomaly detector aided with an online learning mechanism. Because LogOnline is semi-supervised, it avoids the error-prone and time-consuming manual labeling of log data. With the proposed online learning mechanism, LogOnline continuously learns normal sequence patterns as new log sequences emerge, and thus stays robust to unstable log data. Unlike previous work, the proposed online learning mechanism requires neither labeled log data nor human intervention. We have evaluated LogOnline on two widely used public datasets, and the experimental results demonstrate its effectiveness. In particular, LogOnline achieves results comparable to those of the studied supervised approaches while outperforming all semi-supervised counterparts. When log instability is more prevalent, LogOnline exhibits the best performance among all compared approaches, further confirming its practicability.
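To make the idea of label-free online learning concrete, the following is a minimal toy sketch. It is not the LogOnline model, whose architecture the abstract does not describe. It assumes log sequences have already been parsed into lists of event IDs, and it swaps in a simple n-gram novelty score for the learned model. The class name `OnlineNgramDetector`, its methods, and the parameters `n` and `threshold` are all hypothetical and chosen for illustration only.

```python
class OnlineNgramDetector:
    """Toy stand-in for an online, label-free detector (NOT LogOnline itself).

    Assumption: a sequence is a list of log event IDs. The detector remembers
    event n-grams seen in normal data and scores a new sequence by the
    fraction of its n-grams that it has never seen.
    """

    def __init__(self, n=2, threshold=0.5):
        self.n = n                  # n-gram length (hypothetical parameter)
        self.threshold = threshold  # scores above this are flagged as anomalous
        self.known = set()          # n-grams observed in normal sequences

    def _ngrams(self, seq):
        return [tuple(seq[i:i + self.n]) for i in range(len(seq) - self.n + 1)]

    def fit_normal(self, sequences):
        """Initial training on sequences assumed to be normal."""
        for seq in sequences:
            self.known.update(self._ngrams(seq))

    def score(self, seq):
        """Fraction of unseen n-grams; 0.0 for sequences shorter than n."""
        grams = self._ngrams(seq)
        if not grams:
            return 0.0
        unseen = sum(1 for g in grams if g not in self.known)
        return unseen / len(grams)

    def observe(self, seq):
        """Classify a sequence and, if it looks normal, learn from it.

        No label is consulted: the detector's own decision drives the update.
        """
        is_anomaly = self.score(seq) > self.threshold
        if not is_anomaly:
            self.known.update(self._ngrams(seq))
        return is_anomaly


detector = OnlineNgramDetector(n=2, threshold=0.5)
detector.fit_normal([["open", "read", "close"]])

# A mildly drifted sequence (one new event) is accepted and absorbed.
drifted = detector.observe(["open", "read", "close", "sync"])

# A sequence made entirely of unseen patterns is flagged and not absorbed.
crashed = detector.observe(["crash", "crash", "crash"])
```

In this sketch the drifted sequence is treated as normal and its new n-gram is added to the known set, so later sequences containing the `sync` event no longer look novel, while the `crash` sequence is flagged and leaves the known set unchanged. A real system would need safeguards against gradually absorbing anomalous behavior, which this toy version does not address.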