TY - CONF
T1 - Using temporal and semantic developer-level information to predict maintenance activity profiles
AU - Levin, Stanislav
AU - Yehudai, Amiram
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/1/12
Y1 - 2017/1/12
AB - Predictive models for software projects' characteristics have traditionally been based on project-level metrics, employing little or no developer-level information. In this work we suggest novel metrics that capture temporal and semantic developer-level information collected on a per-developer basis. To address the scalability challenges involved in computing these metrics for each and every developer for a large number of source code repositories, we have built a designated repository mining platform. This platform was used to create a metrics dataset based on processing nearly 1000 highly popular open source GitHub repositories, consisting of 147 million LOC, and maintained by 30,000 developers. The computed metrics were then employed to predict the corrective, perfective, and adaptive maintenance activity profiles identified in previous works. Our results show both strong correlation and promising predictive power with R² values of 0.83, 0.64, and 0.75. We also show how these results may help project managers to detect anomalies in the development process and to build better development teams. In addition, the platform we built has the potential to yield further predictive models leveraging developer-level metrics at scale.
KW - Human factors
KW - Mining software repositories
KW - Predictive models
KW - Software maintenance
KW - Software metrics
UR - http://www.scopus.com/inward/record.url?scp=85013056311&partnerID=8YFLogxK
U2 - 10.1109/ICSME.2016.21
DO - 10.1109/ICSME.2016.21
M3 - Conference contribution
AN - SCOPUS:85013056311
T3 - Proceedings - 2016 IEEE International Conference on Software Maintenance and Evolution, ICSME 2016
SP - 463
EP - 467
BT - Proceedings - 2016 IEEE International Conference on Software Maintenance and Evolution, ICSME 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 October 2016 through 10 October 2016
ER -