The Common-Neighbors Metric Is Noise-Robust and Reveals Substructures of Real-World Networks

Sarel Cohen, Philipp Fischbeck*, Tobias Friedrich, Martin Krejca

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Real-world networks typically display a complex structure that is hard to explain by a single model. A common approach is to partition the edges of the network into disjoint simpler structures. An important property in this context is locality: incident vertices usually have many common neighbors. This makes it possible to classify edges into two groups, based on the number of common neighbors of their incident vertices. Formally, this is captured by the common-neighbors (CN) metric, which forms the basis of many metrics for detecting outlier edges. Such outliers can be interpreted as noise or as a substructure. We aim to understand how useful the metric is, and empirically analyze several scenarios. We randomly insert outlier edges into real-world and generated graphs with high locality, and measure the metric's accuracy for partitioning the combined edges. In addition, we use the metric to decompose real-world networks, and measure properties of the partitions. Our results show that the CN metric is a very good classifier that can reliably detect noise up to extreme levels (83% noisy edges). We also provide mathematically rigorous analyses on special random-graph models. Last, we find that the CN metric consistently decomposes real-world networks into two graphs with very different structures.
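
As a rough illustration of the idea described in the abstract, the sketch below counts the common neighbors of each edge's endpoints and splits the edges into two groups. It is only a minimal sketch, not the authors' implementation: the function names, the example graph, and the `threshold` parameter are assumptions made here for illustration, and the abstract does not state how the paper chooses a cut-off.

```python
from collections import defaultdict


def common_neighbor_counts(edges):
    """Map each undirected edge (u, v) to the number of neighbors u and v share."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # The CN value of an edge is the size of the intersection of the
    # two endpoint neighborhoods.
    return {(u, v): len(adjacency[u] & adjacency[v]) for u, v in edges}


def partition_edges(edges, threshold=1):
    """Split edges into (local, outlier) groups.

    `threshold` is a hypothetical cut-off chosen for this example only.
    """
    counts = common_neighbor_counts(edges)
    local = [e for e in edges if counts[e] >= threshold]
    outlier = [e for e in edges if counts[e] < threshold]
    return local, outlier


# Toy graph (assumed for illustration): two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
local, outlier = partition_edges(edges)
print("local edges:  ", local)
print("outlier edges:", outlier)
```

In this toy graph every triangle edge has exactly one common neighbor, while the endpoints of the bridge edge (2, 3) share none, so the bridge is the only edge placed in the outlier group.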

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023, Proceedings
Editors: Hisashi Kashima, Tsuyoshi Ide, Wen-Chih Peng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 67-79
Number of pages: 13
ISBN (Print): 9783031333736
State: Published - 2023
Externally published: Yes
Event: 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023 - Osaka, Japan
Duration: 25 May 2023 to 28 May 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13935 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023
Country/Territory: Japan
City: Osaka
Period: 25/05/23 to 28/05/23

Keywords

  • Clustering
  • Networks
  • Noise
