Dealing with noise problem in machine learning data-sets: A systematic review
, A. Gupta
Published in Elsevier B.V.
2019
Volume: 161
   
Pages: 466 - 474
Abstract
Noisy data in a data set can significantly impair the prediction of any meaningful information. Many empirical studies have shown that noise in a data set dramatically decreases classification accuracy and leads to poor prediction results. The problem of identifying and handling noise in prediction applications has therefore drawn considerable attention over many years. In this study, we performed a systematic literature review of noise identification and handling studies published in various conferences and journals between January 1993 and July 2018. We identified 79 primary studies of noise identification and noise handling techniques. After investigating these studies, we found that, among the noise identification schemes, ensemble-based techniques identify noisy instances more accurately than other techniques. In terms of efficiency, however, single-based techniques are usually better; they are more suitable for noisy data sets. Among noise handling techniques, polishing techniques generally improve classification accuracy more than filtering and robust techniques, but they introduce some errors into the data sets. © 2019 The Authors.
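To make the idea of ensemble-based noise identification followed by filtering concrete, here is a minimal sketch of a majority-vote filter in Python. It is an illustration under stated assumptions, not the procedure of any study covered by the review: the three learners (1-NN, 3-NN, nearest centroid), the leave-one-out scheme, the function names, and the toy data are all hypothetical choices made for this example.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, query):
    """Predict the label whose class centroid is closest to `query`."""
    sums, counts = {}, Counter()
    for x, y in train:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v

    def dist(label):
        return sum((s / counts[label] - q) ** 2 for s, q in zip(sums[label], query))

    return min(sums, key=dist)


def majority_vote_filter(data):
    """Return indices of instances that most ensemble members misclassify.

    Each instance is held out in turn (leave-one-out); it is flagged as
    suspected noise when more than half of the learners disagree with its label.
    """
    # Illustrative ensemble; any set of diverse classifiers could be used here.
    learners = [
        lambda tr, q: knn_predict(tr, q, 1),
        lambda tr, q: knn_predict(tr, q, 3),
        centroid_predict,
    ]
    suspects = []
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        wrong = sum(1 for learn in learners if learn(rest, x) != y)
        if wrong > len(learners) / 2:
            suspects.append(i)
    return suspects


# Toy data: two well-separated clusters; index 3 carries a deliberately wrong label.
data = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.15, 0.15), "b"),
    ((5.0, 5.0), "b"), ((5.1, 5.2), "b"), ((5.2, 5.1), "b"), ((4.9, 5.0), "b"),
]
suspects = majority_vote_filter(data)
# Filtering: drop the flagged instances instead of correcting their labels.
cleaned = [d for i, d in enumerate(data) if i not in suspects]
print(suspects)  # [3]
```

In this sketch, filtering simply discards the flagged instances. A polishing approach would instead replace the suspect label with a corrected value, which is where the additional errors mentioned in the abstract can enter.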
About the journal
Journal: Procedia Computer Science
Publisher: Elsevier B.V.
ISSN: 1877-0509