Software Defect Prediction and Bugs Classification

Date
2020
Publisher
Uva Wellassa University of Sri Lanka
Abstract
Defect prediction plays a major role in the Software Development Life Cycle. Defects in source code are unavoidable, but identifying and reducing bugs at an early stage saves time and effort. The quality of source code is primarily measured by its defect rate, and software defects have a strong relationship with software metrics. Once developers build the code, it goes to the quality assurance engineers for testing; if they find a fault in the software, they send it back to the developers with their comments. Because the software is built by many developers, each developer then has to go through the code to locate the fault. This process becomes significantly more efficient if the developer responsible for a bug is identified at the outset. The tool presented here provides a practical solution: once feedback arrives from the quality assurance engineers, the tool labels the n developers and redirects each comment to the relevant bucket, so that developers can easily identify their own faults and fix the bugs. To assess the status of the code, the researchers collected several metrics related to defect prediction, such as Cyclomatic Complexity and Halstead Complexity values, and then used Principal Component Analysis to identify the software metrics most relevant to defect prediction and to build the dataset. Using this dataset, the researchers developed a machine learning model that predicts whether the code is at a Good, Moderate, or Weak level. Natural Language Processing is used to analyse the Git issues: the Latent Dirichlet Allocation algorithm clusters the issues and creates the required categories for a given input (n). Once the user supplies the link to the source code, the tool reports the defect rate, the developer responsible for each bug, the authors with the most commits, and the frequencies of the most used words. The results show that the tool solves this practical problem more accurately in the programming environment.
Keywords: Defect Prediction, Software Metrics, Git Issues, Defect rate, Machine
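As a rough illustration of two quantities the abstract mentions, the sketch below approximates a Cyclomatic Complexity value for a piece of Python source and counts the most used words in a set of issue texts. It is not the authors' implementation: the abstract does not say which language the tool analyses or how it computes its metrics, so the function names `cyclomatic_complexity` and `most_used_words`, the choice of decision nodes, and the sample inputs are all assumptions made for this example.

```python
import ast
import re
from collections import Counter

# Node types treated as decision points (an assumption of this sketch,
# not a rule taken from the abstract).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                   ast.IfExp, ast.comprehension)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' contributes two additional branches.
            complexity += len(node.values) - 1
    return complexity


def most_used_words(texts, top_n=3):
    """Return the top_n (word, count) pairs across all texts, lowercased."""
    words = []
    for text in texts:
        words.extend(re.findall(r"[a-z]+", text.lower()))
    return Counter(words).most_common(top_n)


# Hypothetical inputs, used only to exercise the two functions.
SAMPLE_SOURCE = '''
def classify(x):
    if x > 0 and x < 10:
        return "small"
    for i in range(x):
        if i % 2 == 0:
            continue
    return "other"
'''
SAMPLE_ISSUES = ["Login crash", "crash on login page", "crash"]

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE_SOURCE))
    print(most_used_words(SAMPLE_ISSUES, top_n=2))
```

Values like these would be only the raw inputs to the pipeline the abstract describes; the Principal Component Analysis step, the Good/Moderate/Weak model, and the Latent Dirichlet Allocation clustering are not reproduced here.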
Keywords
Computer Science, Information Science, Computing and Information Management, Software