University of Toronto, 2012.
The amount of content served on social networks can overwhelm users, who must sift through the data for relevant information. To facilitate users, we develop and implement dissemination of ranked data in social networks. Users can specify a scoring function which ranks incoming events. The users are then notified only of the top-k most relevant events according to their interests. Although top-k computation can be performed centrally at the user, the size of the event stream can constitute a significant bottleneck. Our approach distributes the top-k computation to reduce the number of events flowing through the system. The solution operates on an overlay network model, such as pub-sub broker overlay networks. Instead of forwarding all events, nodes closer to the sources can perform top-k computations on the collected stream and send only matching events. Nodes closer to the sinks can then merge the top-k streams into the final result. We use a hybrid chunk-based approach which alternates between full forwarding of events and pre-computed top-k streams at upstream nodes. This hybrid approach allows us to maintain correctness by recreating a possible top-k interleaving of the entire event stream. Our results show that our buffered solution exhibits significant traffic reduction in the overlay while maintaining latency comparable to the centralized approach. We also offer a sensitivity analysis which demonstrates the impact of various parameters on the performance of the solution.