Journal of Northeastern University (Natural Science) ›› 2025, Vol. 46 ›› Issue (5): 1-9. DOI: 10.12068/j.issn.1005-3026.2025.20230183

• Information & Control •

Electric Vehicle Charging Scheduling Strategy Based on Safe Reinforcement Learning Algorithm

Heng-xin PAN1, Run-da JIA1,2, Shu-lei ZHANG1

  1. School of Information Science & Engineering, Northeastern University, Shenyang 110819, China
  2. State Key Laboratory of Synthetical Automation of Process Industries, Northeastern University, Shenyang 110819, China
  • Received: 2023-06-30 Online: 2025-05-15 Published: 2025-08-07
  • Contact: Run-da JIA
  • About the author: Heng-xin PAN (born 2001), male, from Yichun, Jiangxi; master's student at Northeastern University
  • Supported by:
    National Natural Science Foundation of China (61873049)

Abstract:

As the number of electric vehicles (EVs) grows, reinforcement learning (RL) for EV charging scheduling faces more challenges, in particular the uncertainty and the curse of dimensionality that come with large-scale application. To address these problems, a residential microgrid model is built that takes into account the vehicle-to-grid (V2G) mode and several nonlinear charging models. The charging scheduling problem is formulated as a constrained Markov decision process (CMDP), and a model-free RL framework is adopted to handle the uncertainty. To counter the curse of dimensionality, a charging and discharging strategy is designed in which EVs are partitioned into sets according to their states and the agent sends control signals to these sets, which reduces the dimension of the action space. The scheduling problem is then solved with a Lagrangian deep deterministic policy gradient (LDDPG) algorithm, and a safety filter is introduced to ensure that hard constraints are not violated. Numerical simulations verify the effectiveness of the strategy.

Key words: electric vehicle, charging scheduling, safe reinforcement learning, V2G mode, nonlinear charging
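The set-based control idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `EV` fields, the state-of-charge thresholds in `group_of`, the clipping rule in `safety_filter`, and the step size in `update_multiplier` are all assumptions made for illustration, and the battery model is linear, whereas the paper considers nonlinear charging models. The sketch only shows how one action per set, a hard-constraint filter, and a Lagrange multiplier update might fit together.

```python
from dataclasses import dataclass


@dataclass
class EV:
    soc: float           # state of charge in [0, 1]
    capacity_kwh: float  # battery capacity
    max_power_kw: float  # charger limit, same for charging and discharging


def group_of(ev, low=0.3, high=0.8):
    """Assign an EV to a set by state of charge (thresholds are hypothetical)."""
    if ev.soc < low:
        return "low"
    if ev.soc < high:
        return "mid"
    return "high"


def safety_filter(ev, power_kw, dt_h=1.0, soc_min=0.1, soc_max=1.0):
    """Clip a requested power so the hard limits still hold after one step.

    Assumes a linear battery model with no losses (a simplification).
    """
    # Respect the charger power limit in both directions.
    power_kw = max(-ev.max_power_kw, min(ev.max_power_kw, power_kw))
    # Turn the remaining state-of-charge headroom into power bounds.
    upper = (soc_max - ev.soc) * ev.capacity_kwh / dt_h
    lower = (soc_min - ev.soc) * ev.capacity_kwh / dt_h
    return max(lower, min(upper, power_kw))


def dispatch(evs, set_actions, dt_h=1.0):
    """Map one action per set (each in [-1, 1]) to a safe power for every EV."""
    powers = []
    for ev in evs:
        action = set_actions[group_of(ev)]
        powers.append(safety_filter(ev, action * ev.max_power_kw, dt_h))
    return powers


def update_multiplier(lam, cost, limit, lr=0.01):
    """Projected dual ascent: raise lam when the constraint cost exceeds its limit."""
    return max(0.0, lam + lr * (cost - limit))


evs = [EV(0.2, 50.0, 7.0), EV(0.5, 50.0, 7.0), EV(0.95, 50.0, 7.0)]
powers = dispatch(evs, {"low": 1.0, "mid": 0.0, "high": 1.0})
```

In this sketch the agent outputs three numbers regardless of how many vehicles are connected, so the action dimension depends on the number of sets rather than the number of EVs. The third vehicle asks for full charging power but is clipped by the filter because it is almost full.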

CLC Number: