Dynamic Environment Path Planning Based on Q-Learning Algorithm and Genetic Algorithm

于乃功; 王琛; 默凡凡; 蔡建羡

doi:10.11936/bjutxb2016120005

Dynamic Environment Path Planning Based on Q-Learning Algorithm and Genetic Algorithm

Abstract: When the Q-learning algorithm is applied in a dynamic continuous environment, the states are continuous and too numerous, so the Q-value table runs short of storage space and suffers from the curse of dimensionality. To address this problem, a new Q-value table design method is proposed, together with reward values (R-values) and actions suited to continuous environments. Instead of indexing the table by state-action pairs, time is discretized into time steps and the Q-value table is indexed by time step-action pairs. The problem of selecting an action in a certain state is thereby transformed into the problem of selecting an action at a certain time step, which allows Q-learning to be applied in dynamic continuous environments. A genetic algorithm is first used for static global path planning, and Q-learning is then used for dynamic obstacle avoidance. The overall method is a hierarchical path planning method that runs "offline" first and "online" afterward, and it successfully achieves path planning for a mobile robot. Simulation results verify the effectiveness of the proposed method.
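The idea of indexing the Q-value table by time step and action rather than by state and action can be illustrated with a minimal sketch. The abstract does not give the paper's actual R-values, action set, or environment, so everything below is a hypothetical toy setup: the horizon `N_STEPS`, the lateral-offset actions, the obstacle schedule in `obstacle_offset`, the `reward` function, and the hyperparameters are all assumptions chosen only to make the example runnable. As a further simplification, the reward here depends only on the time step and the chosen action.

```python
import random

N_STEPS = 12            # assumed number of discrete time steps
ACTIONS = (-1, 0, 1)    # hypothetical lateral offsets from the global path
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.3  # assumed learning hyperparameters


def obstacle_offset(t):
    """Toy moving obstacle: sits on the global path (offset 0) at steps 4..6."""
    return {3: -1, 4: 0, 5: 0, 6: 0, 7: 1}.get(t)  # None: no obstacle nearby


def reward(t, action):
    """Assumed R-value: heavy penalty for collision, light one for deviation."""
    if action == obstacle_offset(t):
        return -10.0
    return 0.0 if action == 0 else -1.0


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    # Q-table indexed by (time step, action) instead of (state, action),
    # so its size is N_STEPS x len(ACTIONS) regardless of the state space.
    q = [[0.0] * len(ACTIONS) for _ in range(N_STEPS)]
    for _ in range(episodes):
        for t in range(N_STEPS):
            # Epsilon-greedy action selection at time step t.
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[t][i])
            # Standard Q-learning target, bootstrapping from the next time step.
            future = max(q[t + 1]) if t + 1 < N_STEPS else 0.0
            target = reward(t, ACTIONS[a]) + GAMMA * future
            q[t][a] += ALPHA * (target - q[t][a])
    return q


def greedy_plan(q):
    """Read off the best action for each time step."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    print(greedy_plan(train()))
```

In this toy setup the learned plan stays on the global path while it is free and steps aside during the time steps in which the obstacle occupies it. The table grows with the number of time steps and actions only, which is the storage advantage the abstract attributes to time-indexed Q-values.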
