TY - GEN
T1 - Field-sensitive program dependence analysis
AU - Litvak, Shay
AU - Dor, Nurit
AU - Bodik, Rastislav
AU - Rinetzky, Noam
AU - Sagiv, Mooly
PY - 2010
Y1 - 2010
N2 - Statement st transitively depends on statement stseed if the execution of stseed may affect the execution of st. Computing transitive program dependences is a fundamental operation in many automatic software analysis tools. Existing tools find it challenging to compute transitive dependences for programs manipulating large aggregate structure variables, and their limitations adversely affect analysis of certain important classes of software systems, e.g., large-scale enterprise resource planning (ERP) systems. This paper presents an efficient conservative interprocedural static analysis algorithm for computing field-sensitive transitive program dependences in the presence of large aggregate structure variables. Our key insight is that program dependences coming from operations on whole substructures can be precisely (i.e., field-sensitively) represented at the granularity of substructures instead of individual fields. Technically, we adapt the interval domain to concisely record dependences between multiple pairs of fields of aggregate structure variables by exploiting the fields' spatial arrangement. We prove that our algorithm is as precise as any algorithm which works at the granularity of individual fields, the most-precise known approach for this problem. Our empirical study, in which we analyzed industrial ERP programs with over 100,000 lines of code in average, shows significant improvements in both the running times and memory consumption over existing approaches: The baseline is an efficient field-insensitive whole-structure that incurs a 62% false error rate. An atomization-based algorithm, which disassemble every aggregate structure variable into the collection of its individual fields, can remove all these false errors at the cost of doubling the average analysis time, from 30 to 60 minutes. In contrast, our new precise algorithm removes all false errors by increasing the time only to 35 minutes. In terms of memory consumption, our algorithm increases the footprint by less than 10%, compared to 50% overhead of the atomizing algorithm.
AB - Statement st transitively depends on statement stseed if the execution of stseed may affect the execution of st. Computing transitive program dependences is a fundamental operation in many automatic software analysis tools. Existing tools find it challenging to compute transitive dependences for programs manipulating large aggregate structure variables, and their limitations adversely affect analysis of certain important classes of software systems, e.g., large-scale enterprise resource planning (ERP) systems. This paper presents an efficient conservative interprocedural static analysis algorithm for computing field-sensitive transitive program dependences in the presence of large aggregate structure variables. Our key insight is that program dependences coming from operations on whole substructures can be precisely (i.e., field-sensitively) represented at the granularity of substructures instead of individual fields. Technically, we adapt the interval domain to concisely record dependences between multiple pairs of fields of aggregate structure variables by exploiting the fields' spatial arrangement. We prove that our algorithm is as precise as any algorithm which works at the granularity of individual fields, the most-precise known approach for this problem. Our empirical study, in which we analyzed industrial ERP programs with over 100,000 lines of code in average, shows significant improvements in both the running times and memory consumption over existing approaches: The baseline is an efficient field-insensitive whole-structure that incurs a 62% false error rate. An atomization-based algorithm, which disassemble every aggregate structure variable into the collection of its individual fields, can remove all these false errors at the cost of doubling the average analysis time, from 30 to 60 minutes. In contrast, our new precise algorithm removes all false errors by increasing the time only to 35 minutes. In terms of memory consumption, our algorithm increases the footprint by less than 10%, compared to 50% overhead of the atomizing algorithm.
KW - adas
KW - aggregate structure variables
KW - erp
KW - field-sensitivity
KW - transitive program dependences
UR - http://www.scopus.com/inward/record.url?scp=78751499790&partnerID=8YFLogxK
U2 - 10.1145/1882291.1882334
DO - 10.1145/1882291.1882334
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:78751499790
SN - 9781605587912
T3 - Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering
SP - 287
EP - 296
BT - Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE-18
T2 - 18th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, FSE-18
Y2 - 7 November 2010 through 11 November 2010
ER -