Estimating peer similarity using distance of shared files

Yuval Shavitt, Ela Weinsberg, Udi Weinsberg

Research output: Contribution to conference › Paper › peer-review


Peer-to-Peer (p2p) networks are used by millions of users for sharing content. As these networks become ever more popular, it becomes increasingly difficult to find useful content in the abundance of shared files. Modern p2p networks and similar social services must adopt new methods to help users locate content efficiently, and to this end approximate meta-data search and recommendation systems are utilized. However, meta-data is often missing or wrong, and recommender systems are not suited to p2p networks due to inherent difficulties such as implicit ranking, noise in user-generated content, and the extreme dimensions and sparseness of the network. This paper attempts to bridge this gap by suggesting a new metric for peer similarity, which can be used to improve content search and recommendation in large-scale p2p networks and semi-centralized services, such as p2p IPTV. Unlike commonly used vector distance functions, which are shown to be unsuitable for p2p networks due to the low overlap between peers, this work leverages a file similarity graph for estimating the similarity between peers that have little or no overlap of shared files. Using 100k peers sharing over 500k songs in the Gnutella network, we show the advantages of the proposed metric over commonly used geographical locality and vector distance measures.
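
The contrast drawn in the abstract, between overlap-based vector measures and a distance taken over a file similarity graph, can be illustrated with a small sketch. This is not the paper's metric: the abstract does not give its definition, so the aggregation used here (a symmetric mean of nearest-file shortest-path distances) is an assumption, as are the names `jaccard`, `shortest_distances`, `peer_distance`, the toy graph, and the edge weights.

```python
import heapq

def jaccard(a, b):
    """Overlap-based similarity; 0.0 whenever two peers share no file."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def shortest_distances(graph, source):
    """Dijkstra over a dict-of-dicts graph {file: {neighbor: distance}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

def peer_distance(graph, a, b):
    """ASSUMED aggregation (not from the paper): for each file of one peer,
    take the graph distance to the nearest file of the other peer, average
    these, and symmetrize over both directions."""
    def one_way(src, dst):
        total = 0.0
        for f in src:
            dist = shortest_distances(graph, f)
            total += min(dist.get(g, float("inf")) for g in dst)
        return total / len(src)
    return (one_way(a, b) + one_way(b, a)) / 2

# Hypothetical file similarity graph: a lower weight means more similar files.
graph = {
    "song_a": {"song_b": 1.0},
    "song_b": {"song_a": 1.0, "song_c": 1.0},
    "song_c": {"song_b": 1.0, "song_d": 4.0},
    "song_d": {"song_c": 4.0},
}

# Three peers with no shared files at all.
alice, bob, carol = {"song_a"}, {"song_b"}, {"song_d"}

print(jaccard(alice, bob), jaccard(alice, carol))
print(peer_distance(graph, alice, bob), peer_distance(graph, alice, carol))
```

In this toy setup the overlap-based score is zero for both pairs and so cannot rank them, while the graph-based distance still places one pair closer than the other, which is the property the abstract attributes to using a file similarity graph.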

Original language: English
State: Published - 2010
Event: 9th International Workshop on Peer-to-Peer Systems, IPTPS 2010 - San Jose, United States
Duration: 27 Apr 2010 → …


Conference: 9th International Workshop on Peer-to-Peer Systems, IPTPS 2010
Country/Territory: United States
City: San Jose
Period: 27/04/10 → …

