Journal of Northeastern University (Natural Science) ›› 2024, Vol. 45 ›› Issue (10): 1386-1393. DOI: 10.12068/j.issn.1005-3026.2024.10.003

• Information & Control •

Application of Reinforcement Learning Based on Hybrid Model in Optimal Control of Flotation Process

Run-da JIA¹, Dong-hao ZHANG¹, Jun ZHENG¹, Kang LI²

  1. School of Information Science & Engineering, Northeastern University, Shenyang 110819, China
  2. National (Beijing) Key Laboratory of Mining and Metallurgical Process Automatic Control Technology, Mining and Metallurgical Technology Group Co., Ltd., Beijing 100160, China
  • Received: 2023-05-29 Online: 2024-10-31 Published: 2024-12-31
  • Contact: Run-da JIA
  • About author: JIA Run-da (born 1981), male, from Shenyang, Liaoning; associate professor and doctoral supervisor at Northeastern University. E-mail: jiarunda@ise.neu.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2021YFC2902702)

Abstract:

Traditional optimal control methods struggle to make accurate and rapid decisions when the state of the flotation process changes, which leads to large fluctuations in the concentrate grade and tailings grade and to unstable product quality. In addition, the concentrate grade is difficult to measure online in the flotation process, which reduces the practicality of these methods. To address these problems, a hybrid model is used to model the flotation process, and a reinforcement learning algorithm based on safety augmented value estimation from demonstrations (SAVED) is used to control the size distribution of the bubbles in the flotation overflow, thereby indirectly controlling the concentrate grade and tailings grade. Simulation experiments verify the effectiveness of the proposed algorithm. Compared with manual operating experience and data-driven models, the hybrid-model-based SAVED algorithm achieves better control performance while satisfying the safety constraints.

Key words: flotation process, reinforcement learning, hybrid model, safety constraints, optimal control
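The abstract does not state how the hybrid model is structured or how the safety constraints enter the controller. As an illustration only, the sketch below assumes one common reading: a first-principles prediction corrected by a data-driven residual, with control actions rejected when the predicted bubble size leaves a safe range. It is not the SAVED algorithm and not the authors' model; every function name, coefficient, range, and unit here is hypothetical, and the "plant" data are synthetic.

```python
import numpy as np

def mechanistic_step(bubble_size, air_flow):
    # Hypothetical first-principles part: bubble size relaxes toward
    # a level set by the air flow (made-up coefficients).
    return bubble_size + 0.5 * (0.8 * air_flow - bubble_size)

def fit_residual(states, actions, next_states):
    # Data-driven part: least-squares fit of whatever the mechanistic
    # model fails to explain, as a linear function of state and action.
    X = np.column_stack([states, actions, np.ones_like(states)])
    y = next_states - mechanistic_step(states, actions)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def hybrid_step(bubble_size, air_flow, coef):
    # Hybrid prediction = mechanistic prediction + learned residual.
    residual = coef[0] * bubble_size + coef[1] * air_flow + coef[2]
    return mechanistic_step(bubble_size, air_flow) + residual

def choose_action(bubble_size, target, coef, safe_range, candidates):
    # Keep only candidates whose predicted next state is inside the
    # safe range, then pick the one closest to the target.
    pred = hybrid_step(bubble_size, candidates, coef)
    safe = (pred >= safe_range[0]) & (pred <= safe_range[1])
    if not safe.any():
        return None  # no safe candidate available
    cost = np.where(safe, (pred - target) ** 2, np.inf)
    return float(candidates[np.argmin(cost)])

# Synthetic "plant": the mechanistic model plus an unmodelled linear effect.
rng = np.random.default_rng(0)
states = rng.uniform(1.0, 4.0, 50)
actions = rng.uniform(0.0, 10.0, 50)
next_states = mechanistic_step(states, actions) + 0.1 * actions + 0.2

coef = fit_residual(states, actions, next_states)
candidates = np.linspace(0.0, 10.0, 101)

# Same state and target, with a tight and a loose safe range.
action_safe = choose_action(2.0, 3.0, coef, (1.0, 2.83), candidates)
action_free = choose_action(2.0, 3.0, coef, (0.0, 10.0), candidates)
print(action_safe, action_free)
```

With the tight range the selected air flow is smaller than with the loose one, because the action that would reach the target exactly is predicted to violate the upper bound. A full method of the kind the abstract describes would plan over multiple steps and learn from demonstrations; this single-step search only shows where a hybrid model and a constraint check could sit in the loop.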

CLC Number: