NSE: An effective model for investigating the role of pre-processing using ensembles in sentiment classification
Journal of Advances in Computer Research
Volume 12, Issue 3 - Serial Number 45, Aban 2021 (October/November 2021), Pages 27-41
Article Type: Original Manuscript
Authors
Razieh Asgarnezhad* 1; Amirhassan Monadjemi 2
1 Department of Computer Engineering, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
2 Senior Lecturer, School of Computing, National University of Singapore, 119613, Singapore
Abstract
With the growth of Internet applications, sentiment classification of reviews has attracted increasing interest among text mining researchers. Traditional bag-of-words approaches do not capture the relationships between words, while the pre-processing phase and data reduction techniques can make a large difference in classification performance. This study proposes an efficient model, NSE, for multi-class sentiment classification that combines sampling techniques, feature selection methods, and ensemble supervised classification to improve text classification performance. The feature selection phase of the model applies n-grams, which extract features based on the relationships between words and thereby improve the selection of candidate features. The proposed model classifies the sentiment of tweets and online reviews through ensemble methods, including boosting, bagging, stacking, and voting, in conjunction with supervised methods. In addition, two sampling techniques are applied in the pre-processing phase. In the experimental study, a comprehensive set of comparative experiments was conducted to assess the effectiveness of the model against the best existing works in the literature on well-known movie review and Twitter datasets. The model achieved its highest accuracy and F-measure of 92.95% and 92.65% on the movie dataset, and 90.61% and 87.73% on the Twitter dataset, respectively.
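The three ingredients named in the abstract (n-gram features, sampling in pre-processing, and ensemble voting) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which sampling techniques or base classifiers were used, so random oversampling is an assumption here, and the function names `word_ngrams`, `random_oversample`, and `majority_vote` are hypothetical.

```python
from collections import Counter
import random


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    is as large as the largest one (an assumed sampling technique)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def majority_vote(predictions):
    """Combine the labels that several classifiers predicted for one sample."""
    return Counter(predictions).most_common(1)[0][0]
```

Bigrams such as "not good" keep word-order information that single words lose, which is the kind of word relationship the abstract attributes to n-grams. In a full pipeline, the balanced n-gram features would be fed to several supervised classifiers whose per-sample predictions are then passed to `majority_vote`.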
Keywords
Data Mining; Sentiment classification; Feature selection; Pre-processing; Ensembles
Statistics: Article views: 127; full-text downloads: 70