Weight Calculation of Emotional Words Based on Feature Selection Techniques
Abstract: Building an emotion dictionary is essential to text sentiment analysis. However, most current research in this area focuses on simple word polarity discrimination, and the weight assignment of emotional words has received little attention. Existing weighting methods mostly require manual assistance to select benchmark words, which makes them difficult to apply in practice. To address this problem, an automatic method for calculating the weights of emotional words based on feature selection techniques is proposed. First, assumptions relating the emotional weight of words to the emotional tendency of texts are put forward. Then, for sentiment classification, information gain (IG) and the chi-square statistic (CHI) are improved by exploiting the properties of binary classification. Finally, the improved feature selection methods are used to calculate the weights of emotional words. Experimental results show that using the resulting weighted emotion dictionary for text sentiment classification improves classification accuracy.
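As background for the two feature selection measures named in the abstract, the following is a minimal sketch of the standard textbook forms of CHI and IG for a single word in a two-class (positive/negative) corpus. It does not reproduce the improved versions proposed in the paper, which the abstract does not specify. The function names and the document counts `a`, `b`, `c`, `d` are hypothetical, chosen only for illustration.

```python
import math


def chi_square(a, b, c, d):
    """Standard chi-square statistic for one word and two classes.

    a: positive documents containing the word
    b: negative documents containing the word
    c: positive documents without the word
    d: negative documents without the word
    (These count names are assumptions made for this sketch.)
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column of the contingency table is empty.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((x / total) * math.log2(x / total) for x in counts if x > 0)


def information_gain(a, b, c, d):
    """Standard information gain of the word's presence/absence w.r.t. the class."""
    n = a + b + c + d
    if n == 0:
        return 0.0
    h_class = entropy([a + c, b + d])
    h_given_word = ((a + b) / n) * entropy([a, b]) + ((c + d) / n) * entropy([c, d])
    return h_class - h_given_word


if __name__ == "__main__":
    # Hypothetical counts: the word appears in 40 of 50 positive documents
    # and in 5 of 50 negative documents.
    print(chi_square(40, 5, 10, 45))
    print(information_gain(40, 5, 10, 45))
```

Both scores are unsigned: they measure how strongly a word is associated with the class label but not with which class, so a polarity sign would have to come from elsewhere, for example from comparing the word's frequency in positive and negative documents.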