TY - JOUR
T1 - Website categorization via design attribute learning
AU - Cohen, Doron
AU - Naim, Or
AU - Toch, Eran
AU - Ben-Gal, Irad
N1 - Publisher Copyright: © 2021 Elsevier Ltd
PY - 2021/8
Y1 - 2021/8
AB - Malicious software (malware) is a challenging cybersecurity threat, as it is often bundled with legitimate software and downloaded by naïve users. A significant source of malware downloads is via crack websites that are used to circumvent copyright protection mechanisms. Crack websites often change URLs and IPs to avoid automatic detection; however, in many cases, they preserve specific visual designs that signal the website's function to potential users (such as particular colors, text fonts, shapes, and sizes). Website design features are numerous, have high dimensionality and complicated interactions, making categorization challenging. This study shows that straightforward machine learning models for categorizing Crack and Malicious websites can considerably benefit from using design features. We report on two experiments based on unbalanced datasets and show that classification by using design features can reach a categorization accuracy of over 90% with an F1-score over 77% in some instances. Finally, we discuss the results in the context of developing intelligent security mechanisms.
KW - Crack websites
KW - Cyber security
KW - Human computer interaction
KW - Malware
KW - Online learning
KW - Website categorization
KW - Website design elements
UR - http://www.scopus.com/inward/record.url?scp=85106666259&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2021.102312
DO - 10.1016/j.cose.2021.102312
M3 - Article
AN - SCOPUS:85106666259
SN - 0167-4048
VL - 107
JO - Computers and Security
JF - Computers and Security
M1 - 102312
ER -