TANG Hengliang, TANG Zifang, DONG Chengang, YIN Qizheng, HAI Qiuru. AGV Path Planning Based on Heuristic Reinforcement Learning[J]. Journal of Beijing University of Technology, 2021, 47(8): 895-903. DOI: 10.11936/bjutxb2020120013

    AGV Path Planning Based on Heuristic Reinforcement Learning

    • Abstract: To address the slow convergence and low learning efficiency of traditional algorithms, intelligent algorithms, and reinforcement learning algorithms in automated guided vehicle (AGV) path planning, a heuristic reinforcement learning algorithm is proposed. Building on the traditional Q(λ) algorithm, a heuristic reward function and a heuristic action selection strategy are designed to strengthen the agent's exploration of high-quality actions and thereby improve the learning efficiency of the algorithm. Simulation comparison experiments show that the improved Q(λ) heuristic reinforcement learning algorithm has certain advantages in the number of explorations, planning time, path length, and path turning angle.
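The abstract names the ingredients (Q(λ), a heuristic reward, a heuristic action selection strategy) but not their exact form. The following is only a minimal sketch of how such pieces might fit together on a small grid map, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the Watkins-style trace cut, the Manhattan-distance reward bonus, the 3:1 exploration weights, the 5x5 map, and all names and parameter values.

```python
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
SIZE, START, GOAL = 5, (0, 0), (4, 4)          # hypothetical map
OBSTACLES = frozenset({(1, 1), (2, 3), (3, 1)})


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(state, action):
    """Move one cell; stay in place if the move leaves the grid or hits an obstacle."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE) or nxt in OBSTACLES:
        return state
    return nxt


def heuristic_reward(state, nxt):
    """Assumed reward: goal bonus, small step cost, bonus for moving closer."""
    if nxt == GOAL:
        return 10.0
    return -0.1 + 0.5 * (manhattan(state, GOAL) - manhattan(nxt, GOAL))


def select_action(Q, state, epsilon, rng):
    """Epsilon-greedy, but exploration is biased toward the goal (assumed rule)."""
    if rng.random() >= epsilon:
        return max(range(4), key=lambda a: Q[(state, a)])
    here = manhattan(state, GOAL)
    weights = [3.0 if manhattan(step(state, ACTIONS[a]), GOAL) < here else 1.0
               for a in range(4)]
    return rng.choices(range(4), weights=weights)[0]


def train(episodes=500, alpha=0.1, gamma=0.9, lam=0.8, epsilon=0.2, seed=0):
    """Watkins-style Q(lambda): traces are reset after a non-greedy action."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        traces = defaultdict(float)
        state = START
        action = select_action(Q, state, epsilon, rng)
        for _ in range(200):  # cap on steps per episode
            nxt = step(state, ACTIONS[action])
            reward = heuristic_reward(state, nxt)
            next_action = select_action(Q, nxt, epsilon, rng)
            best_next = max(Q[(nxt, a)] for a in range(4))
            target = reward if nxt == GOAL else reward + gamma * best_next
            delta = target - Q[(state, action)]
            next_is_greedy = Q[(nxt, next_action)] == best_next
            traces[(state, action)] += 1.0
            for key in list(traces):
                Q[key] += alpha * delta * traces[key]
                traces[key] = gamma * lam * traces[key] if next_is_greedy else 0.0
            state, action = nxt, next_action
            if state == GOAL:
                break
    return Q


def greedy_path(Q, max_steps=50):
    """Follow the learned greedy policy from START, stopping at GOAL or the step cap."""
    path = [START]
    while path[-1] != GOAL and len(path) <= max_steps:
        s = path[-1]
        a = max(range(4), key=lambda i: Q[(s, i)])
        path.append(step(s, ACTIONS[a]))
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

In this sketch the distance bonus rewards progress at every step instead of only at the goal, and the weighted exploration samples goal-directed moves more often than uniform random exploration would. Both are one plausible reading of "strengthening the exploration of high-quality actions"; the paper itself should be consulted for the actual definitions.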

       
