Please use this identifier to cite or link to this item: https://ruomoplus.lib.uom.gr/handle/8000/2034
Title: Local and Global Explainability for Technical Debt Identification
Authors: Tsoukalas, Dimitrios 
Mittas, Nikolaos 
Arvanitou, Elvira-Maria 
Ampatzoglou, Apostolos 
Chatzigeorgiou, Alexander 
Kehagias, Dionysios 
Author Department Affiliations: Department of Applied Informatics 
Department of Applied Informatics 
Department of Applied Informatics 
Department of Applied Informatics 
Author School Affiliations: School of Information Sciences 
School of Information Sciences 
School of Information Sciences 
School of Information Sciences 
Subjects: FRASCATI: Natural sciences
FRASCATI: Engineering and technology: Electrical engineering, Electronic engineering, Information engineering
Keywords: explainable AI
SHAP
software metrics
software quality
Technical debt
technical debt identification
Issue Date: 1-Jan-2024
Publisher: IEEE
Journal: IEEE Transactions on Software Engineering 
ISSN: 0098-5589
Volume: 50
Issue: 8
Start page: 2110
End page: 2123
Abstract: 
In recent years, we have witnessed a significant increase in research on how machine learning (ML) techniques can be used for software quality assessment and improvement. However, the derived methodologies and tools lack transparency, due to the black-box nature of the employed ML models, leading to decreased trust in their results. To address this shortcoming, in this paper we extend the state of the art and practice by building explainable AI models on top of ML ones. We use them to interpret the factors (i.e., software metrics) that put a module at risk of having high technical debt (HIGH TD), and to obtain thresholds for metric scores that signal poor maintainability. Finally, we dig further to achieve local interpretation that explains the specific problems of each module, pinpointing specific opportunities for improvement during TD management. To achieve this goal, we have developed project-specific classifiers (characterizing modules as HIGH TD or NOT-HIGH TD) for 21 open-source projects, and we explain their rationale using SHapley Additive exPlanations (SHAP) analysis. Based on our analysis, complexity, comments ratio, cohesion, nesting of control flow statements, coupling, refactoring activity, and code churn are the most important reasons for characterizing classes as being at risk of HIGH TD. The analysis is complemented with global and local means of interpretation, such as metric thresholds and case-by-case reasoning for characterizing a class as being at risk of having HIGH TD. The results of the study are compared against the state of the art and are interpreted from the point of view of both researchers and practitioners.
URI: https://ruomoplus.lib.uom.gr/handle/8000/2034
DOI: 10.1109/TSE.2024.3422427
Rights: CC0 1.0 Universal
Attribution-NonCommercial-NoDerivatives 4.0 International
Corresponding Item Departments: Department of Applied Informatics
Department of Applied Informatics
Department of Applied Informatics
Department of Applied Informatics
Appears in Collections: Articles

Files in This Item:
File: tsoukalas2023tse.pdf
Size: 1.79 MB
Format: Adobe PDF

This item is licensed under a Creative Commons License.