TY - GEN
T1 - Boosting topic-based publish-subscribe systems with dynamic clustering
AU - Milo, Tova
AU - Zur, Tal
AU - Verbin, Elad
PY - 2007
Y1 - 2007
N2 - We consider in this paper a class of Publish-Subscribe (pub-sub) systems called topic-based systems, where users subscribe to topics and are notified on events that belong to those subscribed topics. With the recent flourishing of RSS news syndication, these systems are regaining popularity and are raising new challenging problems. In most of the modern topics-based systems, the events in each topic are delivered to the subscribers via a supporting, distributed, data structure (typically a multicast tree). Since peers in the network may come and go frequently, this supporting structure must be continuously maintained so that "holes" do not disrupt the events delivery. The dissemination of events in each topic thus incurs two main costs: (1) the actual transmission cost for the topic events,and (2) the maintenance cost for its supporting structure. This maintenance overhead becomes particularly dominating when a pub-sub system supports a large number of topics with moderate event frequency; a typical scenario in nowadays news syndication scene. The goal of this paper is to devise a method for reducing this maintenance overhead to the minimum. Our aim is not to invent yet another topic-based pub-sub system, but rather to develop a generic technique for better utilization of existing platforms. Our solution is based on a novel distributed clustering algorithm that utilizes correlations between user subscriptions to dynamically group topics together, into virtual topics (called topic-clusters), andt hereby unifies their supporting structures and reduces costs. Our technique continuously adapts the topic-clusters and the user subscriptions to the system state, and incurs only very minimal overhead. We have implemented our solution in the Tamara pub-sub system. Our experimental study shows this approach to be extremely effective, improving the performance by an order of magnitude.
AB - We consider in this paper a class of Publish-Subscribe (pub-sub) systems called topic-based systems, where users subscribe to topics and are notified on events that belong to those subscribed topics. With the recent flourishing of RSS news syndication, these systems are regaining popularity and are raising new challenging problems. In most of the modern topics-based systems, the events in each topic are delivered to the subscribers via a supporting, distributed, data structure (typically a multicast tree). Since peers in the network may come and go frequently, this supporting structure must be continuously maintained so that "holes" do not disrupt the events delivery. The dissemination of events in each topic thus incurs two main costs: (1) the actual transmission cost for the topic events,and (2) the maintenance cost for its supporting structure. This maintenance overhead becomes particularly dominating when a pub-sub system supports a large number of topics with moderate event frequency; a typical scenario in nowadays news syndication scene. The goal of this paper is to devise a method for reducing this maintenance overhead to the minimum. Our aim is not to invent yet another topic-based pub-sub system, but rather to develop a generic technique for better utilization of existing platforms. Our solution is based on a novel distributed clustering algorithm that utilizes correlations between user subscriptions to dynamically group topics together, into virtual topics (called topic-clusters), andt hereby unifies their supporting structures and reduces costs. Our technique continuously adapts the topic-clusters and the user subscriptions to the system state, and incurs only very minimal overhead. We have implemented our solution in the Tamara pub-sub system. Our experimental study shows this approach to be extremely effective, improving the performance by an order of magnitude.
KW - Dynamic clustering
KW - Peer-to-peer
KW - Publish-subscribe
UR - http://www.scopus.com/inward/record.url?scp=35448978777&partnerID=8YFLogxK
U2 - 10.1145/1247480.1247563
DO - 10.1145/1247480.1247563
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:35448978777
SN - 1595936866
SN - 9781595936868
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 749
EP - 760
BT - SIGMOD 2007
Y2 - 12 June 2007 through 14 June 2007
ER -