University of Toronto, 2016.
Large-scale applications require a scalable data dissemination service with advanced filtering capabilities. We propose the use of a content-based publish/subscribe system with support for top-k filtering in the context of such applications. We focus on the problem of top-k subscription filtering, where a publication is delivered only to the k best-ranked subscribers. The naive approach of filtering early at the publisher edge works only if complete knowledge of the subscriptions is available, which is incompatible with the well-established covering optimization in scalable content-based publish/subscribe systems. We propose an efficient rank-cover technique to reconcile top-k subscription filtering with covering. We extend the covering model to support top-k and describe a novel algorithm for forwarding subscriptions to publishers while maintaining correctness. We also establish a framework for supporting different types of ranking semantics and propose an implementation to support fairness. Finally, we compare our solutions to a baseline covering system and perform a sensitivity analysis to demonstrate that our optimized rank-cover algorithm retains both covering and fairness while achieving properties advantageous to our targeted workloads. Our optimized solution is scalable and provides over 81% of the covering benefit when k is set at 1% selectivity.
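As a rough illustration of the naive approach mentioned above, the following Python sketch filters at a single point that is assumed to know every subscription. It is not the rank-cover algorithm from this work. The `Subscription` class, its `score` function, and `top_k_subscribers` are hypothetical names introduced only for this example.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List

Publication = Dict[str, float]


@dataclass
class Subscription:
    subscriber: str
    predicate: Callable[[Publication], bool]  # content-based filter
    score: Callable[[Publication], float]     # hypothetical ranking function


def top_k_subscribers(pub: Publication, subs: List[Subscription], k: int) -> List[str]:
    """Return the k best-ranked subscribers whose predicate matches pub.

    Assumes complete knowledge of all subscriptions, which is exactly
    what covering-based routing does not provide.
    """
    matching = [s for s in subs if s.predicate(pub)]
    best = heapq.nlargest(k, matching, key=lambda s: s.score(pub))
    return [s.subscriber for s in best]


# Illustrative data (assumed, not taken from the thesis).
pub = {"price": 100.0}
subs = [
    Subscription("a", lambda p: p["price"] > 50, lambda p: 1.0),
    Subscription("b", lambda p: p["price"] > 80, lambda p: 3.0),
    Subscription("c", lambda p: p["price"] > 200, lambda p: 9.0),
    Subscription("d", lambda p: p["price"] < 150, lambda p: 2.0),
]
print(top_k_subscribers(pub, subs, 2))
```

Subscriber `c` has the highest score but does not match the publication, so it is excluded before ranking. The sketch only works because `subs` holds every subscription. Once covered subscriptions are no longer forwarded to the publisher edge, that list is incomplete, which is the conflict the abstract describes.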