A sampling-based approach to accelerating queries in log management systems

Tal Wagner, Eric Schkufza, Udi Wieder

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Log management systems are common in industry and an essential part of a system administrator's toolkit. Examples include Splunk, ELK, Log Insight, and SexiLog. Logs in these systems are characterized by a small number of predefined fields, such as timestamp and host, with the bulk of each entry being unstructured text. System administrators query these logs using a combination of range constraints over the predefined fields and patterns or regular expressions over the text portion of the message. These queries are both complex and diverse. We propose a method for maintaining a subset of these logs in a much smaller database known as a sublog. Because queries are issued against a much smaller data set, they run to completion quickly and avoid common scaling bottlenecks. However, the improvement in performance comes at a price: because we consider only a subset of the original data, we can provide only approximate responses. Nonetheless, the reduction in accuracy is minimal, and we are able to produce high-quality, high-performance results.
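The abstract does not say how the sublog is built or queried, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes stratified sampling (named in the keywords) with the host field as the stratum, a fixed number of uniformly sampled entries per stratum, and a count query that scales each sampled match by the number of original entries it stands in for. The names `build_sublog`, `approx_count`, `per_stratum`, and the dictionary keys `timestamp`, `host`, and `text` are all hypothetical.

```python
import random
import re
from collections import defaultdict


def build_sublog(entries, stratum_key, per_stratum, seed=0):
    """Keep at most `per_stratum` entries from each stratum.

    Returns a list of (entry, weight) pairs, where weight is the number of
    original entries that each kept entry represents. (Assumed scheme.)
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for entry in entries:
        strata[stratum_key(entry)].append(entry)

    sublog = []
    for group in strata.values():
        k = min(per_stratum, len(group))
        weight = len(group) / k
        for entry in rng.sample(group, k):
            sublog.append((entry, weight))
    return sublog


def approx_count(sublog, t_min, t_max, pattern):
    """Estimate how many original entries fall in [t_min, t_max] and match
    `pattern`, by summing the weights of the matching sampled entries."""
    regex = re.compile(pattern)
    return sum(
        weight
        for entry, weight in sublog
        if t_min <= entry["timestamp"] <= t_max and regex.search(entry["text"])
    )


if __name__ == "__main__":
    # Hypothetical log: a range constraint on a predefined field plus a
    # regular expression over the unstructured text.
    log = [
        {"timestamp": t, "host": f"host{t % 3}", "text": "disk error" if t % 5 == 0 else "ok"}
        for t in range(1000)
    ]
    sublog = build_sublog(log, lambda e: e["host"], per_stratum=50)
    print(len(sublog), approx_count(sublog, 0, 999, r"error"))
```

Stratifying by a predefined field keeps every host represented in the sublog even when a few hosts dominate the log volume; the estimate is exact only when the sampled entries happen to mirror their stratum, which is the accuracy trade-off the abstract describes.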

Original language: English
Title of host publication: SPLASH Companion 2016 - Companion Proceedings of the 2016 ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications
Subtitle of host publication: Software for Humanity
Editors: Eelco Visser
Publisher: Association for Computing Machinery, Inc
Pages: 37-38
Number of pages: 2
ISBN (Electronic): 9781450344371
DOIs
State: Published - 20 Oct 2016
Externally published: Yes
Event: 2016 ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications: Software for Humanity, SPLASH Companion 2016 - Amsterdam, Netherlands
Duration: 30 Oct 2016 - 4 Nov 2016

Publication series

Name: SPLASH Companion 2016 - Companion Proceedings of the 2016 ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications: Software for Humanity

Conference

Conference: 2016 ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications: Software for Humanity, SPLASH Companion 2016
Country/Territory: Netherlands
City: Amsterdam
Period: 30/10/16 - 4/11/16

Keywords

  • Log management systems
  • Log messages
  • Stratified sampling
