Please use this identifier to cite or link to this item: https://ruomoplus.lib.uom.gr/handle/8000/1778
Title: An Empirical Evaluation of the Usefulness of Word Embedding Techniques in Deep Learning-Based Vulnerability Prediction
Authors: Kalouptsoglou, Ilias 
Siavvas, Miltiadis 
Kehagias, Dionysios 
Chatzigeorgiou, Alexander 
Ampatzoglou, Apostolos 
Author Department Affiliations: Department of Applied Informatics 
Department of Applied Informatics 
Department of Applied Informatics 
Author School Affiliations: School of Information Sciences 
School of Information Sciences 
School of Information Sciences 
Keywords: Deep learning
Natural language processing
Software security
Vulnerability prediction
Word embedding vectors
Issue Date: 26-Oct-2021
Publisher: Springer
ISSN: 1865-0929
Volume Title: Security in Computer and Information Sciences
Volume: 1596 CCIS
Start page: 23
End page: 37
Conference: EuroCyberSec 2021 
Abstract: 
Software security is a critical consideration for software development companies that want to provide their customers with high-quality and dependable software. The automated detection of software vulnerabilities is a critical aspect of software security. Vulnerability prediction is a mechanism that enables the detection and mitigation of software vulnerabilities early in the development cycle. Recently, the scientific community has devoted considerable effort to the design of deep learning models based on text mining techniques. Initially, Bag-of-Words was the most promising method, but more complex models that focus on the sequences of instructions in the source code have since been proposed. Recent research has started using word embedding vectors, which are widely used in text classification tasks like semantic analysis, to represent words (i.e., code instructions) in vector format. These vectors can either be trained jointly with the other layers of the neural network or be pre-trained using popular algorithms like word2vec and fastText. In this paper, we empirically examine whether word embedding vectors that are pre-trained separately from the vulnerability predictor lead to more accurate vulnerability prediction models. For the purposes of this study, a popular vulnerability dataset maintained by NIST was used. The results of the analysis suggest that pre-training the embedding vectors separately from the neural network leads to better vulnerability predictors with respect to their effectiveness and performance.
URI: https://ruomoplus.lib.uom.gr/handle/8000/1778
ISBN: [9783031093562]
DOI: 10.1007/978-3-031-09357-9_3
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
Corresponding Item Departments: Department of Applied Informatics
Department of Applied Informatics
Department of Applied Informatics
Appears in Collections: Conference proceedings

This item is licensed under a Creative Commons License.