Analyzing the hidden sentiments in user reviews now plays a prominent role in increasing the profitability of any organization. This work addresses the challenges of analyzing text and transforming it into polarity values, with the objective of saving time in understanding public opinion on a particular product or service. Traditionally, different approaches have been used to transform text data into values based on different features of the text. In our research we use Stanford CoreNLP, Alias-i's LingPipe (which uses logistic regression for document classification), SentiWordNet, and libraries synthesized from different sources that cover several other text mining techniques, to evaluate the impact of feature selection on overall sentiment analysis by scoring the sentences in a review with different scoring techniques. We also included NTU LIBLINEAR to use a linear SVM for document classification. The features considered in our experiments are term frequency and n-grams (1-gram and 2-gram), with a decision tree as the prediction model, evaluated by accuracy, area under the ROC curve, and kappa value. Finally, we compared the polarities of the reviews obtained using three different sentiment scoring approaches. Our findings are that term frequency has a good impact (0.932) on classifying the sentiment, whereas 2-gram has an impact of 0.8505. © 2017 Association for Computing Machinery.
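The two feature types named in the abstract, term frequency and 1-gram/2-gram counts, can be illustrated with a minimal sketch. It assumes lowercase whitespace tokenization and a made-up review string. The function names are hypothetical and this is not the authors' CoreNLP/LingPipe pipeline.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_frequencies(text, n=1):
    """Count n-grams in a text (assumption: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))


# Hypothetical review text, used only for illustration.
review = "good phone good battery"
unigram_tf = term_frequencies(review, n=1)  # 1-gram (term frequency) features
bigram_tf = term_frequencies(review, n=2)   # 2-gram features
```

Counts like these could then be fed as feature vectors to a classifier such as a decision tree. A real pipeline would typically also handle punctuation, stop words, and vocabulary alignment across reviews.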

Dharmendra Singh Rajput

Department of Software and Systems Engineering

School of Information Technology and Engineering

Vellore Campus

SM Basha

Vellore Institute of Technology (VIT) is a private university located in Tamil Nadu, India. Founded in 1984 as Vellore Engineering College, the institution offers 20 undergraduate, 34 postgraduate, four integrated and four research programs. It has campuses in Vellore, Amaravati, Bhopal and Chennai.

VIT is one of the top-ranked private universities in India according to the NIRF, THE and QS rankings. The Government of India has recognized VIT, Vellore as an Institution of Eminence. This has allowed VIT to take independent quality initiatives and move up in world rankings.

VIT University

Evaluating the Impact of Feature Selection on Overall Performance of Sentiment Analysis

Proceedings of the 2017 International Conference on Information Technology - ICIT 2017

In the digital era, extracting the hidden sentiments from user reviews plays a prominent role in increasing the profitability of an organization. Interest in Sentiment Analysis (SA) among the research community has grown exponentially. But enormous challenges are still faced in the field of SA, namely: identifying sarcasm, irony, conditional, and modifier statements present in a review; identifying aspects and sentiment words as pairs (Data Transformation); rating the recognized aspects to predict the overall aggregated sentiment; and analyzing design issues in implementing parallel aspect-level sentiment analysis. In the present work, we address each of these challenges using a serial hybridization model, in which the output of each step is the input to the following stage. First, to identify sarcasm, the dictionary is updated with a set of sentiment words using manually crafted rules. Next, to discover sentiment and aspect word pairs, Latent Dirichlet Allocation (LDA) and Gibbs sampling techniques are used. Next, to present the result of sentiment analysis as the overall rating of the data considered, a Latent Aspect Rating Regression (LARR) model is proposed (Data Presentation). Finally, we address the design issues (deciding the number of mappers and reducers needed) in implementing parallel aspect-level sentiment analysis, with the objective of improving resource utilization in Big Data clusters. This work can help researchers in the fields of speech recognition and the development of recommender systems. The evaluation metrics used to estimate the performance of each step are F-score, Rand Index, classification accuracy, Root Mean Absolute Error (RMAE), and throughput. The findings of our research help the customer directly use the result obtained from the proposed model in the form of aspect-level ratings.
© 2019, Springer Science+Business Media, LLC, part of Springer Nature.
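Of the evaluation metrics listed in the abstract above, the Rand Index is perhaps the least familiar. The sketch below computes the standard pairwise definition: the fraction of item pairs on which two labelings agree about "same group" versus "different group". The label lists are invented for illustration, and the abstract does not say how the authors computed the metric.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs that two labelings treat consistently."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical aspect-cluster assignments for four review sentences.
reference = [0, 0, 1, 1]
predicted = [0, 0, 1, 2]
score = rand_index(reference, predicted)
```

Because it compares pairs rather than label values, the index is unaffected by how the clusters happen to be numbered.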

Multimedia Tools and Applications

A roadmap towards implementing parallel aspect level sentiment analysis

In machine learning and statistics, attribute (feature) selection is used in predictive model construction. It helps the machine interpret features easily by discovering good insights, and it improves efficiency in predictive modeling. The objective of our research is to improve classification accuracy by identifying the most important features in any given dataset. In this research, we used two techniques, data partitioning and K-fold cross-validation, to evaluate the importance of each feature in a randomly generated dataset with 5399 instances and 20 attributes. In data partitioning, the attribute with the lowest accuracy is filtered out, whereas in K-fold cross-validation, the attribute with the biggest error is removed from the original dataset. In our experiments, the evaluation parameters considered are recall, precision, and F-measure. Finally, the accuracy rates of both techniques are compared. Our finding is that the K-fold approach achieves a better accuracy (97.03%) than data partitioning (96.11%) in estimating the importance of features in classification. © 2018 IEEE.
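The K-fold procedure mentioned in the abstract above can be sketched as follows. The sketch assumes contiguous, unshuffled folds and a caller-supplied error function. Both function names are hypothetical, and the abstract does not describe the authors' exact splitting or scoring.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index lists."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, test))
        start += size
    return splits


def mean_cv_error(n_samples, k, fold_error):
    """Average a caller-supplied fold_error(train, test) over the k folds."""
    splits = k_fold_indices(n_samples, k)
    return sum(fold_error(train, test) for train, test in splits) / len(splits)


splits = k_fold_indices(10, 3)
```

To rank features in the spirit of the abstract, one could call `mean_cv_error` once per attribute with a `fold_error` that trains and scores a model using only that attribute, then drop the attribute with the largest mean error. That loop is left out here because it depends on the chosen classifier.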

2018 8th International Conference on Communication Systems and Network Technologies (CSNT)

Evaluating the Importance of each Feature in Classification task

International Journal of Security and Its Applications

Analyzing the performance of Various Fraud Detection Techniques

International Journal of Advanced Science and Technology

A Soft Computing Approach to Provide Recommendation on PIMA Diabetes

Time series analysis comprises methods for analyzing data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. Regression analysis, by contrast, is often employed to test theories about the current values of one or more independent time series. Here, five time series datasets with different ranges of observations are considered to study the models' performance. In this paper, we calculate moving averages (MA) of the series over different averaging periods; plot the forecasted series against the original data; compare the performance of Holt-Winters with the Auto Regressive Integrated Moving Average (ARIMA) model with non-zero mean; and compute a test statistic to examine the null hypothesis for the considered time series datasets. © 2017 SERSC Australia.
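The moving-average step described in the abstract above can be illustrated with a short sketch. It assumes a simple (unweighted, trailing) moving average, and the input series is invented. The abstract does not state which variant the authors used.

```python
def moving_average(series, period):
    """Simple moving average: mean of each window of `period` consecutive values."""
    if period < 1 or period > len(series):
        raise ValueError("period must be between 1 and len(series)")
    window_sum = sum(series[:period])
    averages = [window_sum / period]
    for i in range(period, len(series)):
        # Slide the window: add the newest value, drop the oldest.
        window_sum += series[i] - series[i - period]
        averages.append(window_sum / period)
    return averages


# Hypothetical observations, averaged over two different periods.
observations = [1, 2, 3, 4, 5]
smoothed_3 = moving_average(observations, 3)
smoothed_2 = moving_average(observations, 2)
```

A longer period smooths more aggressively but yields fewer output points, which is why comparing several periods against the original series is informative.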

International Journal of Grid and Distributed Computing

Comparative Study on Performance Analysis of Time Series Predictive Models

American Economic Journal: Macroeconomics

Zipf's Law, Pareto's Law, and the Evolution of Top Incomes in the United States

The Prague Bulletin of Mathematical Linguistics

Optimizing Tokenization Choice for Machine Translation across Multiple Target Languages

Remote Sensing of Environment

Major advances in geostationary fire radiative power (FRP) retrieval over Asia and Australia stemming from use of Himawari-8 AHI

Journal	Proceedings of the 2017 International Conference on Information Technology - ICIT 2017
Publisher	ACM Press
Open Access	0