Please use this identifier to cite or link to this item:
https://ruomoplus.lib.uom.gr/handle/8000/2034
Title: | Local and Global Explainability for Technical Debt Identification |
Authors: | Tsoukalas, Dimitrios; Mittas, Nikolaos; Arvanitou, Elvira-Maria; Ampatzoglou, Apostolos; Chatzigeorgiou, Alexander; Kehagias, Dionysios |
Author Department Affiliations: | Department of Applied Informatics; Department of Applied Informatics; Department of Applied Informatics; Department of Applied Informatics |
Author School Affiliations: | School of Information Sciences; School of Information Sciences; School of Information Sciences; School of Information Sciences |
Subjects: | FRASCATI__Natural sciences; FRASCATI__Engineering and technology__Electrical engineering, Electronic engineering, Information engineering |
Keywords: | explainable AI; SHAP; software metrics; software quality; Technical debt; technical debt identification |
Issue Date: | 1-Jan-2024 |
Publisher: | IEEE |
Journal: | IEEE Transactions on Software Engineering |
ISSN: | 0098-5589 |
Volume: | 50 |
Issue: | 8 |
Start page: | 2110 |
End page: | 2123 |
Abstract: | In recent years, we have witnessed an important increase in research focusing on how machine learning (ML) techniques can be used for software quality assessment and improvement. However, the derived methodologies and tools lack transparency, due to the black-box nature of the employed machine learning models, leading to decreased trust in their results. To address this shortcoming, in this paper we extend the state-of-the-art and-practice by building explainable AI models on top of machine learning ones, to interpret the factors (i.e. software metrics) that constitute a module as in risk of having high technical debt (HIGH TD), to obtain thresholds for metric scores that are alerting for poor maintainability, and finally, we dig further to achieve local interpretation that explains the specific problems of each module, pinpointing to specific opportunities for improvement during TD management. To achieve this goal, we have developed project-specific classifiers (characterizing modules as HIGH and NOT-HIGH TD) for 21 open-source projects, and we explain their rationale using the SHapley Additive exPlanation (SHAP) analysis. Based on our analysis, complexity, comments ratio, cohesion, nesting of control flow statements, coupling, refactoring activity, and code churn are the most important reasons for characterizing classes as in HIGH TD risk. The analysis is complemented with global and local means of interpretation, such as metric thresholds and case-by-case reasoning for characterizing a class as in-risk of having HIGH TD. The results of the study are compared against the state-of-the-art and are interpreted from the point of view of both researchers and practitioners. |
URI: | https://ruomoplus.lib.uom.gr/handle/8000/2034 |
DOI: | 10.1109/TSE.2024.3422427 |
Rights: | CC0 1.0 Universal; Attribution-NonCommercial-NoDerivatives 4.0 International |
Corresponding Item Departments: | Department of Applied Informatics; Department of Applied Informatics; Department of Applied Informatics; Department of Applied Informatics |
Appears in Collections: | Articles |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| tsoukalas2023tse.pdf | | 1.79 MB | Adobe PDF | View/Open |
Scopus™ citations: 9 (checked on Apr 13, 2026)
Page view(s): 112 (checked on Apr 18, 2026)
Download(s): 83 (checked on Apr 18, 2026)
This item is licensed under a Creative Commons License