Please use this identifier to cite or link to this item:
https://ruomoplus.lib.uom.gr/handle/8000/2037
Title: | Vulnerability Classification on Source Code Using Text Mining and Deep Learning Techniques |
Authors: | Kalouptsoglou, Ilias; Siavvas, Miltiadis; Ampatzoglou, Apostolos; Kehagias, Dionysios; Chatzigeorgiou, Alexander |
Author Department Affiliations: | Department of Applied Informatics; Department of Applied Informatics; Department of Applied Informatics |
Author School Affiliations: | School of Information Sciences; School of Information Sciences; School of Information Sciences |
Subjects: | FRASCATI__Natural sciences__Computer and information sciences; FRASCATI__Engineering and technology__Electrical engineering, Electronic engineering, Information engineering |
Keywords: | contextual word embedding; large language models; natural language processing; security testing; transfer learning; vulnerability classification |
Issue Date: | 29-Oct-2024 |
Publisher: | IEEE |
Volume Title: | Proceedings of the 2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C) |
Start page: | 47 |
End page: | 56 |
Conference: | 2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C) |
Abstract: | Nowadays, security testing is an integral part of the testing activities during the software development life-cycle. Over the years, various techniques have been proposed to identify security issues in the source code, especially vulnerabilities, which can be exploited and cause severe damage. Recently, Machine Learning (ML) techniques capable of predicting vulnerable software components and indicating high-risk areas have appeared, accelerating the effort-demanding and time-consuming process of vulnerability localization. For effective subsequent vulnerability elimination, there is a need to automate the process of labeling detected vulnerabilities into vulnerability categories, i.e., identifying the type of the vulnerability. Several techniques have been proposed over the years for automating this labeling process. However, the vast majority of the proposed methods attempt to identify the type of a vulnerability based on its textual description provided by experts, such as the description in the vulnerability report of the National Vulnerability Database, and not on its actual source code, which hinders their full automation and prevents vulnerability categorization during the software testing phase. This work examines vulnerability classification directly from the source code during the vulnerability detection step. In this way, a vulnerability detection method will be able to provide complete information and interpretation of its findings. Leveraging the advances in the field of Artificial Intelligence and Natural Language Processing, we construct and compare several multi-class classification models for categorizing vulnerable code snippets. The results highlight the importance of the context-aware embeddings of the pre-trained Transformer-based models, as well as the significance of transfer learning from a programming language-related domain. |
URI: | https://ruomoplus.lib.uom.gr/handle/8000/2037 |
ISBN: | [9798350365658] |
DOI: | 10.1109/QRS-C63300.2024.00017 |
Rights: | CC0 1.0 Universal; Attribution-NonCommercial-NoDerivatives 4.0 International |
Corresponding Item Departments: | Department of Applied Informatics; Department of Applied Informatics; Department of Applied Informatics |
Appears in Collections: | Conference proceedings |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| kalouptsoglou2024qrs.pdf | | 313.86 kB | Adobe PDF | View/Open |
Scopus citations: 4 (checked on Apr 13, 2026)
Page view(s): 78 (checked on Apr 18, 2026)
Download(s): 78 (checked on Apr 18, 2026)
This item is licensed under a Creative Commons License