Algorithms and estimators for accurate summarization of internet traffic

Edith Cohen*, Nick Duffield, Haim Kaplan, Carsten Lund, Mikkel Thorup

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

16 Scopus citations

Abstract

Statistical summaries of traffic in IP networks are at the heart of network operation and are used to recover information on arbitrary subpopulations of flows. It is therefore of great importance to collect the most accurate and informative summaries given the router's resource constraints. Cisco's sampled NetFlow, based on aggregating a sampled packet stream into flows, is the most widely deployed such system. We observe two sources of inefficiency in current methods. Firstly, a single parameter (the sampling rate) is used to control utilization of both memory and processing/access speed, which means that it has to be set according to the bottleneck resource. Secondly, the unbiased estimators are applicable to summaries that in effect are collected through uneven use of resources during the measurement period (information from the earlier part of the measurement period is either not collected at all, so that fewer counters are utilized, or is discarded when a sampling rate adaptation is performed). We develop algorithms that collect more informative summaries through an even and more efficient use of available resources. The heart of our approach is a novel derivation of unbiased estimators that use these more informative counts. We show how to efficiently compute these estimators and prove analytically that they are superior (have smaller variance on all packet streams and subpopulations) to previous approaches. Simulations on Pareto distributions and IP flow data show that the new summaries provide significantly more accurate estimates. We provide an implementation design that can be efficiently deployed at routers.
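
As a point of reference for the baseline the abstract describes, the following is a minimal sketch of sampled-packet flow aggregation with the standard scale-by-1/p unbiased estimator for a subpopulation's packet count. It is not the paper's new estimators or Cisco's implementation. The function names, the (address, port) flow keys, and the assumption of independent per-packet sampling with a fixed probability p are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def sampled_flow_counts(packets, p, rng):
    """Aggregate an independently sampled packet stream into per-flow counters.

    packets: iterable of flow keys, one entry per packet (hypothetical format).
    p: per-packet sampling probability (assumed fixed for the whole period).
    """
    counts = defaultdict(int)
    for flow_key in packets:
        if rng.random() < p:  # keep each packet independently with probability p
            counts[flow_key] += 1
    return dict(counts)


def estimate_subpopulation(counts, p, predicate):
    """Estimate the total packets of all flows matching predicate.

    Each sampled packet stands for 1/p packets, so the sum of sampled
    counts divided by p is unbiased under independent sampling.
    """
    return sum(c for key, c in counts.items() if predicate(key)) / p


if __name__ == "__main__":
    rng = random.Random(7)
    # Synthetic stream: three flows keyed by (source address, destination port).
    packets = (
        [("10.0.0.1", 80)] * 5000
        + [("10.0.0.2", 443)] * 3000
        + [("10.0.0.3", 80)] * 2000
    )
    rng.shuffle(packets)
    counts = sampled_flow_counts(packets, p=0.1, rng=rng)
    # Subpopulation query: all flows to port 80 (true total is 7000 packets).
    print(estimate_subpopulation(counts, 0.1, lambda key: key[1] == 80))
```

In this sketch the single parameter p governs both how many packets are processed and how many flow counters are created, which is the coupling the abstract identifies as the first source of inefficiency.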

Original language: English
Title of host publication: IMC'07
Subtitle of host publication: Proceedings of the 2007 ACM SIGCOMM Internet Measurement Conference
Pages: 265-278
Number of pages: 14
State: Published - 2007
Event: IMC'07: 2007 7th ACM SIGCOMM Internet Measurement Conference - San Diego, CA, United States
Duration: 24 Oct 2007 to 26 Oct 2007

Publication series

Name: Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC

Conference

Conference: IMC'07: 2007 7th ACM SIGCOMM Internet Measurement Conference
Country/Territory: United States
City: San Diego, CA
Period: 24/10/07 to 26/10/07

Keywords

  • Data streams
  • IP flows
  • NetFlow
  • Network management
  • Sketches
  • Subpopulation queries
