Journal of Northeastern University (Natural Science) ›› 2009, Vol. 30 ›› Issue (2): 279-282. DOI: -

• Original Articles •


Research on task allocation of process planning based on reinforcement learning and neural network

Su, Ying-Ying (1); Wang, Wan-Shan (1); Wang, Jian-Rong (1); Tang, Liang (1)   

  (1) School of Mechanical Engineering and Automation, Northeastern University, Shenyang 110004, China
  • Received: 2013-06-22 Revised: 2013-06-22 Online: 2009-02-15 Published: 2013-06-22
  • Contact: Su, Y.-Y.
  • Supported by:
    Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20060145017)

Abstract: In task allocation problems, the "curse of dimensionality" arises when the state-action space of the Markov decision process model is large. To deal with this problem, a Q-learning method based on a BP neural network was proposed. Exploiting the good generalization ability of the BP neural network, the Q values of state-action pairs in reinforcement learning are stored and approximated by the network, and an optimal action-selection strategy based on Q-learning as well as a BP neural network model and algorithm for Q-learning were designed. The proposed method was applied to task allocation in process planning and verified by simulation in Matlab. The results show that the method offers good performance and action-approximation capability, and it further enhances the applicability of reinforcement learning to task allocation problems.

Key words: task allocation, process planning, reinforcement learning, Q-learning, neural network
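The idea the abstract describes combines the standard Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ·max_a′ Q(s′,a′) − Q(s,a)], with a BP (backpropagation) neural network that replaces the Q table: the network is trained toward the bracketed TD target, so Q values over a large state-action space are generalized rather than enumerated. Below is a minimal sketch of that idea in Python (the paper itself reports a Matlab simulation, and its actual state encoding, reward, and network structure are not given on this page); the toy task-allocation environment, the load-balancing reward, and all parameter values are illustrative assumptions, not the authors' setup.

```python
# Sketch only: Q-learning with a one-hidden-layer BP network approximating
# Q(s, a). Toy problem (assumed, not from the paper): assign N_TASKS tasks
# to N_AGENTS agents, rewarding balanced workloads.
import numpy as np

rng = np.random.default_rng(0)

N_TASKS, N_AGENTS = 6, 3            # toy problem size (assumption)
ALPHA, GAMMA, EPS = 0.01, 0.9, 0.2  # learning rate, discount, epsilon-greedy rate
H = 16                              # hidden-layer width of the BP network
STATE_DIM = N_AGENTS + 1            # per-agent loads + index of task being assigned

# Network weights: state -> one Q-value per action (i.e., per agent).
W1 = rng.normal(0.0, 0.3, (H, STATE_DIM)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.3, (N_AGENTS, H)); b2 = np.zeros(N_AGENTS)

def forward(s):
    """Return Q-values for every action in state s, plus hidden activations."""
    h = np.tanh(W1 @ s + b1)
    return W2 @ h + b2, h

def train_step(s, a, target):
    """One backprop step pulling Q(s, a) toward the TD target."""
    global W1, b1, W2, b2
    q, h = forward(s)
    err = q[a] - target                # TD error on the chosen action only
    dpre = err * W2[a] * (1.0 - h**2)  # gradient at the hidden pre-activations
    W2[a] -= ALPHA * err * h; b2[a] -= ALPHA * err
    W1 -= ALPHA * np.outer(dpre, s); b1 -= ALPHA * dpre

def episode():
    """Assign N_TASKS tasks one by one, learning from each transition."""
    loads = np.zeros(N_AGENTS)
    for t in range(N_TASKS):
        s = np.append(loads / N_TASKS, t / N_TASKS)
        q, _ = forward(s)
        a = int(rng.integers(N_AGENTS)) if rng.random() < EPS else int(np.argmax(q))
        loads[a] += 1.0                # action: give task t to agent a
        r = -loads.std()               # toy reward: prefer balanced workloads
        s2 = np.append(loads / N_TASKS, (t + 1) / N_TASKS)
        target = r if t == N_TASKS - 1 else r + GAMMA * forward(s2)[0].max()
        train_step(s, a, target)
    return loads

for _ in range(2000):
    loads = episode()
print("final allocation:", loads)      # tends toward an even 2/2/2 split
```

Backpropagating the TD error only through the output row of the chosen action mirrors how a tabular update touches a single Q(s, a) entry; this is a common design choice for neural Q-learning, not necessarily the one used in the paper.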
