TY - GEN
T1 - The Common-Neighbors Metric Is Noise-Robust and Reveals Substructures of Real-World Networks
AU - Cohen, Sarel
AU - Fischbeck, Philipp
AU - Friedrich, Tobias
AU - Krejca, Martin
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Real-world networks typically display a complex structure that is hard to explain by a single model. A common approach is to partition the edges of the network into disjoint simpler structures. An important property in this context is locality: incident vertices usually have many common neighbors. This allows edges to be classified into two groups based on the number of common neighbors of their incident vertices. Formally, this is captured by the common-neighbors (CN) metric, which forms the basis of many metrics for detecting outlier edges. Such outliers can be interpreted as noise or as a substructure. We aim to understand how useful the metric is, and we empirically analyze several scenarios. We randomly insert outlier edges into real-world and generated graphs with high locality, and we measure the metric's accuracy in partitioning the combined edges. In addition, we use the metric to decompose real-world networks and measure properties of the partitions. Our results show that the CN metric is a very good classifier that can reliably detect noise up to extreme levels (83% noisy edges). We also provide mathematically rigorous analyses on special random-graph models. Last, we find that the CN metric consistently decomposes real-world networks into two graphs with very different structures.
AB - Real-world networks typically display a complex structure that is hard to explain by a single model. A common approach is to partition the edges of the network into disjoint simpler structures. An important property in this context is locality: incident vertices usually have many common neighbors. This allows edges to be classified into two groups based on the number of common neighbors of their incident vertices. Formally, this is captured by the common-neighbors (CN) metric, which forms the basis of many metrics for detecting outlier edges. Such outliers can be interpreted as noise or as a substructure. We aim to understand how useful the metric is, and we empirically analyze several scenarios. We randomly insert outlier edges into real-world and generated graphs with high locality, and we measure the metric's accuracy in partitioning the combined edges. In addition, we use the metric to decompose real-world networks and measure properties of the partitions. Our results show that the CN metric is a very good classifier that can reliably detect noise up to extreme levels (83% noisy edges). We also provide mathematically rigorous analyses on special random-graph models. Last, we find that the CN metric consistently decomposes real-world networks into two graphs with very different structures.
KW - Clustering
KW - Networks
KW - Noise
UR - http://www.scopus.com/inward/record.url?scp=85173569335&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-33374-3_6
DO - 10.1007/978-3-031-33374-3_6
M3 - Conference contribution
AN - SCOPUS:85173569335
SN - 9783031333736
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 67
EP - 79
BT - Advances in Knowledge Discovery and Data Mining - 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023, Proceedings
A2 - Kashima, Hisashi
A2 - Ide, Tsuyoshi
A2 - Peng, Wen-Chih
PB - Springer Science and Business Media Deutschland GmbH
T2 - 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023
Y2 - 25 May 2023 through 28 May 2023
ER -