TY - JOUR
T1 - PatchView
T2 - Multi-modality detection of security patches
AU - Farhi, Nitzan
AU - Koenigstein, Noam
AU - Shavitt, Yuval
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2025/4
Y1 - 2025/4
N2 - Patching software has become overwhelming for system administrators due to the large number of patch releases. Administrators should prioritize security patches to reduce exposure to attacks, and for this task they can use the Common Vulnerabilities and Exposures (CVE) system, which catalogs known security vulnerabilities in publicly released software or firmware. However, some developers choose to omit CVE publication and merely update their repositories, keeping the vulnerabilities undisclosed. Such actions leave users uninformed and potentially at risk. To this end, we present PatchView, an innovative multi-modal system tailored for the classification of commits as security patches. The system draws upon three unique data modalities associated with a commit: (1) a time-series representation of developer behavioral data within the Git repository, (2) commit messages, and (3) the code patches. PatchView merges three single-modality sub-models, each adept at interpreting data from its designated source. A distinguishing feature of this solution is its ability to elucidate its predictions by examining the outputs of each sub-model, underscoring its interpretability. Notably, this research pioneers a language-agnostic methodology for security patch classification. Our evaluations indicate that the proposed solution can reveal concealed security patches with an accuracy of 94.52% and an F1-score of 95.12%. The code for this paper will be made publicly available on GitHub: https://github.com/nitzanfarhi/PatchView.
AB - Patching software has become overwhelming for system administrators due to the large number of patch releases. Administrators should prioritize security patches to reduce exposure to attacks, and for this task they can use the Common Vulnerabilities and Exposures (CVE) system, which catalogs known security vulnerabilities in publicly released software or firmware. However, some developers choose to omit CVE publication and merely update their repositories, keeping the vulnerabilities undisclosed. Such actions leave users uninformed and potentially at risk. To this end, we present PatchView, an innovative multi-modal system tailored for the classification of commits as security patches. The system draws upon three unique data modalities associated with a commit: (1) a time-series representation of developer behavioral data within the Git repository, (2) commit messages, and (3) the code patches. PatchView merges three single-modality sub-models, each adept at interpreting data from its designated source. A distinguishing feature of this solution is its ability to elucidate its predictions by examining the outputs of each sub-model, underscoring its interpretability. Notably, this research pioneers a language-agnostic methodology for security patch classification. Our evaluations indicate that the proposed solution can reveal concealed security patches with an accuracy of 94.52% and an F1-score of 95.12%. The code for this paper will be made publicly available on GitHub: https://github.com/nitzanfarhi/PatchView.
KW - Behavioral data
KW - Conv1D
KW - CVE
KW - Git
KW - GitHub
KW - LSTM
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85216924266&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2025.104356
DO - 10.1016/j.cose.2025.104356
M3 - Article
AN - SCOPUS:85216924266
SN - 0167-4048
VL - 151
JO - Computers and Security
JF - Computers and Security
M1 - 104356
ER -