Empirical Study on the Distribution of Bugs in Software Systems
C.K. Shriram, , N.L. Bhanu Murthy
Published in World Scientific Publishing Co. Pte Ltd
2018
Volume: 28
Issue: 1
Pages: 97 - 122
Abstract
Many past studies have shown that the distribution of bugs in software systems follows the Pareto principle. Some studies have also proposed the Pareto distribution (PD) to model bugs in software systems. However, several other probability distributions, such as the Weibull, Bounded Generalized Pareto, Double Pareto (DP), Log Normal and Yule-Simon distributions, have also been proposed, and each has been evaluated in different studies for how well it models bugs. We investigate this problem further by using information-theoretic (criterion-based) approaches to model selection, which elegantly handle issues such as overfitting that are prevalent in previous work. By strengthening the model selection procedure and studying a large collection of fault data, the results are made more accurate and stable. We conduct experiments on fault data from 74 releases of various open source and proprietary software systems and observe that the DP distribution outperforms all others with statistical significance in the case of proprietary projects. For open source software systems, the top three performing distributions are the DP, Bounded Generalized Pareto and Weibull distributions; they are significantly better than all others, though there is no significant difference among the three. © 2018 World Scientific Publishing Company.
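As a rough illustration of criterion-based model selection, the sketch below fits two of the distributions named in the abstract (Pareto and Log Normal) by maximum likelihood and ranks them by the Akaike Information Criterion (AIC). It is not the authors' procedure: the abstract does not say which criteria were used, the function names are hypothetical, the fault counts are made up, and the code assumes strictly positive values that are not all identical.

```python
import math


def lognormal_loglik(data):
    """Maximised log-likelihood of a log-normal fit (closed-form MLE)."""
    n = len(data)
    logs = [math.log(x) for x in data]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n  # MLE divides by n, not n - 1
    return -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n


def pareto_loglik(data):
    """Maximised log-likelihood of a Pareto (type I) fit (closed-form MLE)."""
    n = len(data)
    x_m = min(data)  # MLE of the scale parameter
    alpha = n / sum(math.log(x / x_m) for x in data)  # MLE of the shape
    sum_logs = sum(math.log(x) for x in data)
    return n * math.log(alpha) + n * alpha * math.log(x_m) - (alpha + 1) * sum_logs


def aic(loglik, k):
    """AIC = 2k - 2 ln L; the 2k term penalises extra parameters."""
    return 2 * k - 2 * loglik


def rank_models(data):
    """Return (name, AIC) pairs sorted from best (lowest AIC) to worst."""
    scores = {
        "log-normal": aic(lognormal_loglik(data), k=2),
        "pareto": aic(pareto_loglik(data), k=2),
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical per-module fault counts, for illustration only.
    faults = [1, 1, 1, 2, 2, 3, 5, 8, 13, 40]
    for name, score in rank_models(faults):
        print(f"{name}: AIC = {score:.2f}")
```

Lower AIC indicates a better trade-off between fit and complexity. Both candidates here have two parameters, so the penalty term matters only once models of different size, such as the Double Pareto, are added to the comparison.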
About the journal
Journal: International Journal of Software Engineering and Knowledge Engineering
Publisher: World Scientific Publishing Co. Pte Ltd
ISSN: 0218-1940