TY - CONF
T1 - Statistical log differencing
AU - Bao, Lingfeng
AU - Busany, Nimrod
AU - Lo, David
AU - Maoz, Shahar
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - Recent works have considered the problem of log differencing: given two or more system execution logs, output a model of their differences. Log differencing has potential applications in software evolution, testing, and security. In this paper we present statistical log differencing, which accounts for frequencies of behaviors found in the logs. We present two algorithms, s2KDiff, for differencing two logs, and snKDiff, for differencing many logs at once, both presenting their results over a single inferred model. A unique aspect of our algorithms is their use of statistical hypothesis testing: we let the engineer control the sensitivity of the analysis by setting the target distance between probabilities and the statistical significance value, and report only (and all) the statistically significant differences. Our evaluation shows the effectiveness of our work in terms of soundness, completeness, and performance. It also demonstrates its effectiveness compared to previous work via a user study and its potential applications via a case study using real-world logs.
KW - Log analysis
KW - Model inference
UR - http://www.scopus.com/inward/record.url?scp=85078894849&partnerID=8YFLogxK
U2 - 10.1109/ASE.2019.00084
DO - 10.1109/ASE.2019.00084
M3 - Conference contribution
AN - SCOPUS:85078894849
T3 - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
SP - 851
EP - 862
BT - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Y2 - 10 November 2019 through 15 November 2019
ER -