Method of Feature Extraction in Recognizing Textual Entailment
Abstract: To capture semantic inference between text fragments and decide whether one fragment can be inferred from another, a classifier-based method is adopted for entailment decisions. Feature selection is a key factor in classifier performance, so syntactic and semantic features are added on top of basic lexical features. Lexical chains are constructed to mine the semantic relation between the text T and the hypothesis H, and the resulting semantic features are applied to different classifiers to verify their effectiveness. System performance is evaluated on the public RTE-3 to RTE-5 data sets, where the AdaBoost and SVM classifiers achieve accuracies of 61.0% and 61.8%, respectively. The t-test results indicate that the semantic features based on lexical chains significantly improve system performance.
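As an illustration of what basic lexical features for a (T, H) pair might look like, the following minimal sketch computes three word-overlap measures. The abstract does not specify which lexical features the system uses, so the function names (`tokenize`, `lexical_features`) and the three features chosen here are assumptions made for illustration only, not the authors' feature set.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_features(t, h):
    """Return simple word-overlap features for a (T, H) pair.

    These features are hypothetical examples; the source does not
    list the exact lexical features used by the system.
    """
    t_tokens, h_tokens = tokenize(t), tokenize(h)
    t_set, h_set = set(t_tokens), set(h_tokens)
    shared = t_set & h_set
    union = t_set | h_set
    return {
        # Fraction of distinct H words that also occur in T.
        "h_coverage": len(shared) / len(h_set) if h_set else 0.0,
        # Jaccard similarity of the two vocabularies.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Length of H relative to T, in tokens.
        "length_ratio": len(h_tokens) / len(t_tokens) if t_tokens else 0.0,
    }


features = lexical_features(
    "The cat sat on the mat in the kitchen.",
    "The cat sat on the mat.",
)
```

A feature dictionary of this kind could be turned into a numeric vector and passed to a classifier such as AdaBoost or an SVM; syntactic and lexical-chain features would be appended to the same vector.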