A Semi-Markov CRF Model Approach to Encyclopedia Text Topic Segmentation
Abstract
This paper introduces a method based on semi-Markov conditional random fields (semi-CRFs) for topic segmentation of Chinese encyclopedia text. The method uses the HMM state posterior as the basic segmentation cue, adjusted to each text instance, to overcome the topic duplication problem of the fully connected-state HMM and of the CRF model. It also uses several segment-level word semantic features derived from a domain thesaurus, together with additional topic-specific cue phrases, to adapt the method to the target domain. Experimental results show that the method suits the topic structure of Chinese encyclopedia text and outperforms both the HMM and the CRF model.
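To illustrate what segment-level decoding over HMM state posteriors can look like, the following is a minimal sketch of semi-Markov Viterbi decoding. It is not the authors' model: it assumes each segment is scored only by the summed log posterior of one topic over its sentences, minus a hypothetical per-segment penalty, with no learned feature weights, thesaurus features, or cue phrases. The function name `semi_markov_viterbi` and the parameters `max_len` and `segment_penalty` are illustrative assumptions.

```python
import math


def semi_markov_viterbi(posteriors, max_len, segment_penalty=0.0):
    """Return the best segmentation as a list of (start, end, topic), end exclusive.

    posteriors[t][k] is an assumed per-sentence posterior of topic k (e.g. from an HMM).
    Adjacent segments are required to carry different topics.
    """
    n = len(posteriors)
    num_topics = len(posteriors[0])

    # prefix[k][t] = sum of log posteriors of topic k over sentences 0..t-1
    prefix = [[0.0] * (n + 1) for _ in range(num_topics)]
    for k in range(num_topics):
        for t in range(n):
            prefix[k][t + 1] = prefix[k][t] + math.log(max(posteriors[t][k], 1e-12))

    neg_inf = float("-inf")
    # best[t][k] = best score for sentences 0..t-1 when the last segment has topic k
    best = [[neg_inf] * num_topics for _ in range(n + 1)]
    back = [[None] * num_topics for _ in range(n + 1)]

    for t in range(1, n + 1):
        for k in range(num_topics):
            for d in range(1, min(max_len, t) + 1):  # d = length of the last segment
                s = t - d
                seg_score = prefix[k][t] - prefix[k][s] - segment_penalty
                if s == 0:
                    cand, prev = seg_score, None
                else:
                    # best preceding topic that differs from k
                    others = [j for j in range(num_topics) if j != k]
                    if not others:
                        continue
                    prev = max(others, key=lambda j: best[s][j])
                    if best[s][prev] == neg_inf:
                        continue
                    cand = best[s][prev] + seg_score
                if cand > best[t][k]:
                    best[t][k] = cand
                    back[t][k] = (s, prev)

    k = max(range(num_topics), key=lambda j: best[n][j])
    if best[n][k] == neg_inf:
        raise ValueError("no valid segmentation under the given max_len")

    # Follow back-pointers from the end of the text to the start.
    segments = []
    t = n
    while t > 0:
        s, prev = back[t][k]
        segments.append((s, t, k))
        t, k = s, prev
    segments.reverse()
    return segments
```

Unlike a per-sentence HMM or CRF decoder, the search here runs over whole segments, so segment-level features (such as the thesaurus-derived ones mentioned in the abstract) could in principle be added to `seg_score`; how the paper actually combines them is not specified here.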