Electronics Optics & Control, 2019, 26(9): 29. Online publication date: 2020-12-20
Prior Knowledge Based Q-Learning Path Planning Algorithm
Keywords: reinforcement learning; path planning; prior knowledge; mobile robot; Q-Learning
Abstract
The standard Q-Learning algorithm, based on the Markov decision process in reinforcement learning, can obtain an optimal path, but it suffers from slow convergence and low planning efficiency, and thus cannot be applied directly in real environments. To address this problem, this paper proposes a Q-Learning path planning algorithm for mobile robots based on potential-field knowledge. By introducing the potential energy value of the environment as search heuristic information to initialize the Q values, the mobile robot is guided toward rapid convergence in the early stage of learning, avoiding the blindness of the traditional reinforcement learning process and making the method suitable for direct learning in real environments. Simulation results show that, compared with existing algorithms, the proposed algorithm not only improves convergence speed but also shortens learning time, enabling the mobile robot to quickly find a better collision-free path.
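The core idea of the abstract can be sketched as follows: instead of starting Q-Learning from an all-zero Q-table, each Q(s, a) is seeded with the potential of the state the action leads to, so early exploration is already biased toward the goal. This is an illustrative sketch only; the grid map, reward values, Manhattan-distance potential, and all hyperparameters below are assumptions, not taken from the paper.

```python
import random

# Illustrative 2D grid world: 0 = free cell, 1 = obstacle.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
ROWS, COLS = len(GRID), len(GRID[0])
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def potential(state):
    """Attractive potential: negative Manhattan distance to the goal,
    so states closer to the goal get a higher (less negative) value."""
    r, c = state
    return -(abs(r - GOAL[0]) + abs(c - GOAL[1]))

def init_q(use_prior):
    """Initialize Q(s, a) to zero (standard Q-Learning) or to the
    potential of the successor state (prior-knowledge heuristic)."""
    q = {}
    for r in range(ROWS):
        for c in range(COLS):
            for a, (dr, dc) in enumerate(ACTIONS):
                if not use_prior:
                    q[((r, c), a)] = 0.0
                    continue
                nr, nc = r + dr, c + dc
                if 0 <= nr < ROWS and 0 <= nc < COLS and GRID[nr][nc] == 0:
                    q[((r, c), a)] = float(potential((nr, nc)))
                else:  # blocked move leaves the robot in place
                    q[((r, c), a)] = float(potential((r, c)))
    return q

def step(state, action):
    """One step: -1 per move, -10 for hitting a wall or obstacle
    (robot stays in place), +100 for reaching the goal."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < ROWS and 0 <= nc < COLS) or GRID[nr][nc] == 1:
        return state, -10.0, False
    if (nr, nc) == GOAL:
        return (nr, nc), 100.0, True
    return (nr, nc), -1.0, False

def train(use_prior, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-Learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = init_q(use_prior)
    for _ in range(episodes):
        s, done, steps = START, False, 0
        while not done and steps < 200:
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda b: q[(s, b)])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(
                q[(s2, b)] for b in range(len(ACTIONS)))
            q[(s, a)] += alpha * (target - q[(s, a)])
            s, steps = s2, steps + 1
    return q

def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START; return the visited states."""
    s, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda b: q[(s, b)])
        s, _, done = step(s, a)
        path.append(s)
        if done:
            break
    return path

if __name__ == "__main__":
    print(greedy_path(train(use_prior=True)))
```

With `use_prior=True` the potential field already points every state toward the goal before the first episode, which is the sense in which the heuristic removes the "blindness" of uniform Q initialization; setting `use_prior=False` recovers standard Q-Learning for comparison.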
DUAN Jianmin, CHEN Qianglong. Prior Knowledge Based Q-Learning Path Planning Algorithm[J]. Electronics Optics & Control, 2019, 26(9): 29.