基于粘连符号分割和多特征融合的手写公式识别

    Handwritten Formula Recognition Based on Segmentation of Adhesive Symbols and Multi-feature Fusion

    Abstract: Adhesive (touching) characters degrade the automatic recognition of offline handwritten mathematical formulas. To address this problem, a segmentation method for single-point adhesive symbols is proposed, based on character contour features. First, the direction codes of the upper and lower contours of the characters are used to determine the segmentation point and the segmentation direction. Then, geometric characteristics such as width, height, number of corner points, and projection profile are fused to recognize special symbols among the segmented character fragments, and the recognized special symbols are separated from the rest of the formula. Finally, the formula structure is parsed from the spatial relationships between each special symbol and its surrounding characters (above, below, left, right, overlapping, and semi-enclosing), and the ordinary characters recognized by a convolutional neural network are substituted into the parsed structure sequence to produce the recognition result for the whole formula. Experimental results show that the method effectively handles adhesion and special-symbol recognition in mathematical formulas: the segmentation accuracy for adhesive symbols reaches 87.25%, and the overall recognition rate of handwritten mathematical formulas is improved.
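The first step of the abstract, locating a segmentation point from the direction codes of the upper and lower contours, can be illustrated with a minimal sketch. It is not the authors' algorithm: the three-valued direction code, the valley/peak rule, the thickness tie-breaker, and the names `contour_profiles`, `turning_columns`, and `candidate_cut_column` are all assumptions made for illustration. It also assumes the input is a binary image of a single connected component cropped to its bounding box, and it returns only a vertical cut column rather than a segmentation direction.

```python
import numpy as np

def contour_profiles(img):
    """Per-column row index of the topmost and bottommost ink pixel.

    Assumes every column contains ink (a cropped connected component).
    """
    img = np.asarray(img, dtype=bool)
    upper = img.argmax(axis=0)                            # first ink row from the top
    lower = img.shape[0] - 1 - img[::-1].argmax(axis=0)   # first ink row from the bottom
    return upper, lower

def turning_columns(profile, into, out_of):
    """Columns where the contour direction code flips from `into` to `out_of`.

    The code here is just the sign of the row change between adjacent
    columns (+1 down, 0 flat, -1 up); flat runs are skipped over.
    """
    codes = np.sign(np.diff(profile))
    cols = []
    for x in range(1, len(profile) - 1):
        left = codes[:x][codes[:x] != 0]
        right = codes[x:][codes[x:] != 0]
        if left.size and right.size and left[-1] == into and right[0] == out_of:
            cols.append(x)
    return cols

def candidate_cut_column(img):
    """Return the thinnest column lying in an upper-contour valley or a
    lower-contour peak, or None when no such column exists."""
    upper, lower = contour_profiles(img)
    valleys = turning_columns(upper, into=1, out_of=-1)   # upper contour dips down
    peaks = turning_columns(lower, into=-1, out_of=1)     # lower contour bulges up
    candidates = sorted(set(valleys) | set(peaks))
    if not candidates:
        return None
    thickness = lower - upper + 1
    return int(min(candidates, key=lambda x: thickness[x]))
```

For two solid blobs joined by a one-pixel bridge, the sketch selects the bridge column; for a shape with flat contours it reports no candidate. A realistic implementation would also need to handle slanted cuts and multiple touching points, which the abstract does not specify.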

       
