3D Visible Speech Animation Driven by Prosody Text

Abstract: To generate 3D visible speech animation that is strongly prosodic, realistic, and easy to control, this paper proposes a synthesis method driven by prosody text. First, the plain text that drives the animation is converted, through a prosody markup language, into text rich in prosodic information. Next, characteristic articulation trajectories of typical speakers are extracted from video clips and analyzed to obtain a parametric curve function based on an exponential formula; this function extends each single-frame static viseme into a multi-frame dynamic viseme. Finally, the attribute values of the prosody markup are mapped to the parameter values of the curve function, which adds prosodic effects to the animation. Experimental results show that the animation changes markedly with the prosodic information supplied.
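The abstract names an exponential curve function but does not state it, so the following is only a minimal sketch of the general idea, not the paper's formula. It assumes a symmetric exponential envelope around the viseme peak, a static viseme stored as per-vertex offsets from the neutral face, and a single hypothetical prosody attribute `emphasis` in [0, 1]; all function names and the attribute-to-parameter mapping are illustrative assumptions.

```python
import math


def viseme_weight(t, peak_time, rate):
    """Assumed envelope: weight 1.0 at peak_time, exponential decay on both sides."""
    return math.exp(-rate * abs(t - peak_time))


def dynamic_viseme(static_viseme, n_frames, rate):
    """Expand one static viseme (offsets from the neutral face) into n_frames frames."""
    peak = (n_frames - 1) / 2.0  # place the full mouth shape at the middle frame
    frames = []
    for i in range(n_frames):
        w = viseme_weight(i, peak, rate)
        frames.append([w * offset for offset in static_viseme])
    return frames


def rate_from_emphasis(emphasis, base_rate=0.5):
    """Hypothetical mapping: stronger emphasis gives slower decay (shape held longer)."""
    return base_rate * (1.0 - 0.5 * emphasis)


if __name__ == "__main__":
    static = [1.0, 2.0]  # toy viseme with two vertex offsets
    plain = dynamic_viseme(static, 5, rate_from_emphasis(0.0))
    stressed = dynamic_viseme(static, 5, rate_from_emphasis(1.0))
    print(plain[0], stressed[0])
```

Under these assumptions, a higher `emphasis` value lowers the decay rate, so the mouth shape stays closer to the full viseme in the frames around the peak. This is one simple way a markup attribute could change the resulting animation.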