CMAC-Based Sarsa(λ) Learning Algorithm for a RoboCup Soccer Goalkeeper
Abstract: In RoboCup simulated soccer, the state of play is complex and changes quickly, most of the information available for decision making is continuous, and an agent usually cannot determine the optimal action for the current state from that information alone. Taking the goalkeeper as a case study, this paper first uses a CMAC neural network to generalize over the continuous state space and then applies the Sarsa(λ) learning algorithm on the generalized states to obtain the goalkeeper's optimal policy. Goalkeepers using different strategies were evaluated and compared empirically on the RoboCup simulation platform. The results show that, after a period of learning, the goalkeeper using CMAC-based Sarsa(λ) defends for markedly longer and outperforms the other algorithms tested, which supports the effectiveness of the proposed approach.
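The two-stage scheme described in the abstract (CMAC generalization followed by Sarsa(λ) on the generalized states) can be illustrated with a minimal sketch. The abstract does not give the goalkeeper's state variables, action set, reward, or parameter values, so everything here is an assumption made for illustration: the class `CMAC`, the functions `epsilon_greedy` and `sarsa_lambda_episode`, the environment callbacks `env_reset` and `env_step`, the use of replacing eligibility traces, and all default parameters.

```python
import random


class CMAC:
    """Tile coding: several offset grids (tilings) over a continuous state."""

    def __init__(self, lows, highs, n_actions, n_tilings=8, n_tiles=8):
        self.lows, self.highs = lows, highs
        self.n_actions, self.n_tilings, self.n_tiles = n_actions, n_tilings, n_tiles
        # Offset grids need one extra cell per dimension to cover the range.
        self.tiles_per_tiling = (n_tiles + 1) ** len(lows)
        size = n_actions * n_tilings * self.tiles_per_tiling
        self.w = [0.0] * size  # one weight per (action, tiling, tile)
        self.e = [0.0] * size  # one eligibility trace per weight

    def active(self, state, action):
        """Return the index of the single active tile in each tiling."""
        indices = []
        for t in range(self.n_tilings):
            offset = t / self.n_tilings  # shift, as a fraction of a tile width
            flat = 0
            for x, lo, hi in zip(state, self.lows, self.highs):
                scaled = (min(max(x, lo), hi) - lo) / (hi - lo) * self.n_tiles
                cell = min(int(scaled + offset), self.n_tiles)
                flat = flat * (self.n_tiles + 1) + cell
            indices.append((action * self.n_tilings + t) * self.tiles_per_tiling + flat)
        return indices

    def q(self, state, action):
        """Approximate Q(s, a) as the sum of the active tiles' weights."""
        return sum(self.w[i] for i in self.active(state, action))


def epsilon_greedy(cmac, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(cmac.n_actions)
    values = [cmac.q(state, a) for a in range(cmac.n_actions)]
    return values.index(max(values))


def sarsa_lambda_episode(cmac, env_reset, env_step, alpha=0.1, gamma=1.0,
                         lam=0.9, epsilon=0.1, rng=random):
    """Run one episode of Sarsa(lambda) with replacing traces (an assumption).

    env_reset() -> state and env_step(state, action) -> (next_state, reward, done)
    are hypothetical callbacks standing in for the simulator.
    """
    cmac.e = [0.0] * len(cmac.e)
    state = env_reset()
    action = epsilon_greedy(cmac, state, epsilon, rng)
    total_reward, done = 0.0, False
    while not done:
        next_state, reward, done = env_step(state, action)
        total_reward += reward
        delta = reward - cmac.q(state, action)
        for i in cmac.active(state, action):
            cmac.e[i] = 1.0  # replacing trace for the tiles just visited
        if not done:
            next_action = epsilon_greedy(cmac, next_state, epsilon, rng)
            delta += gamma * cmac.q(next_state, next_action)
        step = alpha / cmac.n_tilings * delta
        for i, trace in enumerate(cmac.e):
            if trace:
                cmac.w[i] += step * trace
                cmac.e[i] = gamma * lam * trace
        if not done:
            state, action = next_state, next_action
    return total_reward
```

In this sketch, a reward of +1 per simulation step survived would make the return equal to the defending time, which is one plausible way to match the evaluation criterion mentioned in the abstract. Dividing `alpha` by the number of tilings keeps the effective step size independent of how many tiles are active at once.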