Automatic classification of customer reviews as either positive or negative has attracted great interest in the academic and business communities in recent times. In this paper, an attempt has been made to represent text documents using just eight representative terms (RT), viz. good, very good, excellent, recommended, bad, very bad, disgusting, and never recommended. This yields a new way of representing text documents as a structured data matrix. A consistent classification accuracy of about 80% or higher was achieved for datasets of various sizes, ranging from 403 to 25,000. The precision (P), recall (R), and F-measure were also very consistent and comparable to previously reported results. A comparative analysis of classification performance, carried out using machine learning algorithms such as Naïve Bayes (NB), Bayesian logistic regression (BLR), and multilayer perceptron (MLP), revealed that the proposed way of representing text documents results in consistently superior performance. © 2005 - 2012 JATIT & LLS. All rights reserved.
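As an illustration of the representation described in the abstract, the following is a minimal sketch of how reviews might be turned into a data matrix with one column per representative term. The abstract does not say how the cell values are computed, so the use of raw occurrence counts here is an assumption, as are the function names `review_to_row` and `reviews_to_matrix`, the longest-phrase-first matching rule, and the two sample reviews.

```python
import re
from collections import Counter

# The eight representative terms (RT) listed in the abstract.
REPRESENTATIVE_TERMS = [
    "good", "very good", "excellent", "recommended",
    "bad", "very bad", "disgusting", "never recommended",
]

# Assumption: longer phrases are matched first, so that "very good"
# is counted once as "very good" and not again as "good".
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(
        re.escape(term)
        for term in sorted(REPRESENTATIVE_TERMS, key=len, reverse=True)
    )
    + r")\b"
)


def review_to_row(text):
    """Return one matrix row: the count of each representative term."""
    # Lowercase and collapse whitespace so multi-word terms match reliably.
    normalized = " ".join(text.lower().split())
    counts = Counter(_PATTERN.findall(normalized))
    return [counts[term] for term in REPRESENTATIVE_TERMS]


def reviews_to_matrix(reviews):
    """Return a list of rows, one per review, one column per term."""
    return [review_to_row(review) for review in reviews]


# Hypothetical sample reviews, for illustration only.
reviews = [
    "Very good phone, excellent battery. Recommended!",
    "Bad screen and very bad support. Never recommended.",
]
matrix = reviews_to_matrix(reviews)
for row in matrix:
    print(row)
```

Each row has eight columns in the order the terms are listed, so the resulting matrix could be passed, together with positive or negative labels, to any standard classifier such as those named in the abstract.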
Journal | Journal of Theoretical and Applied Information Technology |
---|---|
Publisher | Asian Research Publishing Network (ARPN) |
ISSN | 1992-8645 |