YU Jianjun, HAN Chunxiao, RUAN Xiaogang, LIU Tao, XU Congchi, MEN Yusen. Multitask Imitation Learning Algorithm Based on Composite Covariance Function[J]. Journal of Beijing University of Technology, 2016, 42(4): 499-507. DOI: 10.11936/bjutxb2015030055

    Multitask Imitation Learning Algorithm Based on Composite Covariance Function

    • Abstract: To obtain a control strategy for robot imitation learning under multiple tasks, a composite covariance function is constructed and a Gaussian process regression (GPR) model is fitted to sample points of the teaching robot's demonstrated behavior. The hyperparameters of the GPR model are optimized to yield the imitation learning control strategy, which the imitating robot then applies to complete the imitation task. Braitenberg vehicles serve as the simulation subject for studying imitation learning of a combined phototaxis and obstacle-avoidance task. Simulation results indicate that, compared with an imitation learning algorithm based on a single covariance function, the algorithm based on the composite covariance function achieves robot imitation learning in both single-task and multitask environments, and does so with higher precision. Experiments in which the task environment is changed indicate that the method adapts well.
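The pipeline summarized in the abstract (build a composite covariance function, fit a GPR model to demonstration samples, tune hyperparameters, then query the model as a control strategy) can be illustrated with a minimal sketch. The abstract does not say which covariance functions are combined or which optimizer is used, so everything specific here is an assumption: the composite kernel is taken to be the sum of a squared-exponential and a linear kernel, the noise level is fixed, a coarse grid search over the log marginal likelihood stands in for a gradient-based optimizer, and the two sensor inputs and the `target` demonstration function are invented for illustration.

```python
import itertools
import math

def se_kernel(x, y, length, sf):
    # Squared-exponential kernel (assumed component, not confirmed by the abstract).
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return sf ** 2 * math.exp(-0.5 * d2 / length ** 2)

def linear_kernel(x, y, sl):
    # Linear kernel (assumed component).
    return sl ** 2 * sum(a * b for a, b in zip(x, y))

def composite_kernel(x, y, theta):
    # Composite covariance: the sum of two valid kernels is itself a valid kernel.
    length, sf, sl = theta
    return se_kernel(x, y, length, sf) + linear_kernel(x, y, sl)

def cholesky(A):
    # Lower-triangular L with A = L L^T, for symmetric positive-definite A.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    # Solve (L L^T) x = b by forward then backward substitution.
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit(X, y, theta, noise=0.05):
    # Returns alpha = (K + noise^2 I)^{-1} y and the log marginal likelihood.
    n = len(X)
    K = [[composite_kernel(X[i], X[j], theta) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = chol_solve(L, y)
    lml = (-0.5 * sum(yi * ai for yi, ai in zip(y, alpha))
           - sum(math.log(L[i][i]) for i in range(n))
           - 0.5 * n * math.log(2.0 * math.pi))
    return alpha, lml

def predict(X, alpha, theta, x_new):
    # Posterior mean of a zero-mean GP at x_new.
    return sum(a * composite_kernel(x_new, xi, theta) for a, xi in zip(alpha, X))

def target(s):
    # Hypothetical teacher policy: (light reading, obstacle reading) -> motor command.
    return 0.8 * s[0] - math.tanh(2.0 * s[1])

# Demonstration samples on a 5 x 5 grid of made-up sensor readings in [0, 1].
grid = [i / 4 for i in range(5)]
X = [(a, b) for a in grid for b in grid]
y = [target(s) for s in X]

# Coarse grid search over (length, sf, sl), maximizing log marginal likelihood.
candidates = itertools.product([0.3, 1.0, 3.0], [0.5, 1.0], [0.1, 1.0])
best_lml, best_theta, best_alpha = -math.inf, None, None
for theta in candidates:
    alpha, lml = fit(X, y, theta)
    if lml > best_lml:
        best_lml, best_theta, best_alpha = lml, theta, alpha

# The learned policy is the GP posterior mean; query it at an unseen reading.
x_new = (0.4, 0.6)
pred = predict(X, best_alpha, best_theta, x_new)
print(best_theta, pred, target(x_new))
```

In practice a library such as scikit-learn, which supports adding kernels together and optimizes hyperparameters during fitting, would replace the hand-written linear algebra; the pure-Python version is used here only to keep the sketch dependency-free.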

       
