Using Rank Aggregation in Continuously Answering SPARQL Queries on Streaming and Quasi-static Linked Data. (Experience Paper)
Web applications that combine dynamic data stream with distributed background data are getting a growing attention in recent years. Answering in a timely fashion, i.e., reactiveness, is one of the most important performance indicators for those applications. The Semantic Web community showed that RDF Stream Processing (RSP) is an adequate framework to develop this type of applications. However, RSP engines may lose their reactiveness due to the time necessary to access the background data when it is distributed over the Web. State-of-the-art RSP engines remain reactive using a local replica of the background data, but it progressively becomes stale if not updated to reflect the changes in the remote background data. For this reason, recently, the RSP community has investigated maintenance policies of the local replica that guarantee reactiveness while maximizing the freshness of the replica. Previous works simplified the problem with several assumptions. In this paper, we investigate how to remove some of those simplification assumptions. In particular, we target a class of queries for which multiple policies may be used simultaneously and we show that rank aggregation can be effectively used to fairly consider their alternative suggestions. We provide extensive empirical evidence that rank aggregation is key to move a step forward to the practical solution of this problem in the RSP context.