Journal of Northeastern University (Natural Science) ›› 2023, Vol. 44 ›› Issue (11): 1548-1555. DOI: 10.12068/j.issn.1005-3026.2023.11.005

• Information and Control •

Agent Path Planning Algorithm Based on Improved SNN-HRL

ZHAO Zhao, YUAN Pei-xin, TANG Jun-wen, CHEN Jin-lin   

  1. School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China.
  • Published:2023-12-05
  • Corresponding author: ZHAO Zhao
  • About the authors: ZHAO Zhao (1997-), male, born in Hengshui, Hebei Province, M.S. candidate at Northeastern University; YUAN Pei-xin (1953-), male, born in Yingkou, Liaoning Province, professor at Northeastern University.

Abstract: To address the exploration difficulties of traditional skill discovery algorithms such as SNN-HRL (stochastic neural networks for hierarchical reinforcement learning), this paper proposes MES-HRL, a hierarchical reinforcement learning algorithm based on SNN-HRL that integrates multiple exploration strategies (MES). The proposed algorithm improves the traditional hierarchical structure and consists of three layers: an exploration trajectory layer, a learning trajectory layer, and a path planning layer. In the exploration trajectory layer, the agent is trained to explore as much of the unknown environment as possible, providing sufficient environmental state information for subsequent training. In the learning trajectory layer, the training results of the exploration trajectory layer serve as prior knowledge, improving training efficiency. In the path planning layer, the skills the agent has already acquired are used to complete the path planning task. Simulations comparing the performance of MES-HRL and SNN-HRL in different environments show that MES-HRL solves the exploration problem of the traditional algorithm and has better path planning capability.
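The abstract specifies the algorithm only at the architectural level. The sketch below illustrates one way the three-layer pipeline it describes could be organized; it is a minimal illustration under stated assumptions: the toy grid world, the count-based novelty bonus standing in for the paper's mixed exploration strategies, and all names (step, explore, learn_skills, plan) are hypothetical, not the authors' implementation.

```python
"""Structural sketch of the three-layer hierarchy described in the abstract.

Minimal illustration only: the toy grid world, the count-based novelty
bonus, and all class/function names are assumptions, not the authors'
implementation of MES-HRL.
"""
from collections import defaultdict

GRID = 8                                   # toy 8x8 grid world
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, action):
    """Deterministic grid transition with wall clipping."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), GRID - 1), min(max(y + dy, 0), GRID - 1))

# Layer 1: exploration trajectory layer.
# Train the agent to cover as much of the unknown environment as possible;
# a count-based novelty bonus stands in for the mixed exploration strategies.
def explore(episodes=200, horizon=40):
    visits = defaultdict(int)
    trajectories = []
    for _ in range(episodes):
        state, traj = (0, 0), []
        for _ in range(horizon):
            # Greedy with respect to novelty: move to the least-visited successor.
            action = min(ACTIONS, key=lambda a: visits[step(state, a)])
            state = step(state, action)
            visits[state] += 1
            traj.append(state)
        trajectories.append(traj)
    return trajectories

# Layer 2: learning trajectory layer.
# The exploration results act as prior knowledge: skills are fit to the
# regions the agent has already discovered (here, one skill per quadrant).
def learn_skills(trajectories):
    skills = defaultdict(list)
    for traj in trajectories:
        for x, y in traj:
            skills[(x >= GRID // 2, y >= GRID // 2)].append((x, y))
    return skills

# Layer 3: path planning layer.
# A high-level policy composes the learned skills to reach a goal; here it
# simply selects the skill whose region contains the goal.
def plan(skills, goal):
    region = (goal[0] >= GRID // 2, goal[1] >= GRID // 2)
    return skills[region]

if __name__ == "__main__":
    trajs = explore()
    skills = learn_skills(trajs)
    print(len(plan(skills, goal=(7, 7))), "explored states available for planning")
```

The sketch mirrors the downward information flow the abstract describes: exploration trajectories from the first layer serve as prior knowledge for skill learning in the second, and the third layer plans only over skills the agent has already acquired.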

Key words: deep reinforcement learning; hierarchical reinforcement learning (HRL); path planning; exploration strategy; skill discovery method
