Journal of Northeastern University (Natural Science) ›› 2020, Vol. 41 ›› Issue (9): 1274-1279. DOI: 10.12068/j.issn.1005-3026.2020.09.010

• Mechanical Engineering •

A Compound Gradient Acceleration Optimization Algorithm with Adaptive Step Size

YIN Ming-ang1, WANG Yu-shuo2, SUN Zhi-li1, YU Yun-fei3

  1. School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China; 2. CRRC Changchun Railway Vehicles Co., Ltd., Changchun 130062, China; 3. AVIC Shenyang Engine Design Institute, Shenyang 110015, China
  • Received: 2020-01-09 Revised: 2020-01-09 Online: 2020-09-15 Published: 2020-09-15
  • Corresponding author: YIN Ming-ang
  • About the authors: YIN Ming-ang (1985-), male, born in Shenyang, Liaoning; lecturer, Ph.D., Northeastern University. SUN Zhi-li (1957-), male, born in Juye, Shandong; professor and doctoral supervisor, Northeastern University.
  • Supported by: National Natural Science Foundation of China (51775097, 51875095).

Abstract: Adaptive-step-size accelerated (Adam-type) algorithms have recently become a research hotspot in related fields because of their high computational efficiency and good compatibility. To address Adam's slow convergence, this paper proposes a new Adam-type first-order optimization algorithm, the compound gradient descent method (C-Adam), built from the current gradient, the prediction gradient, and the historical momentum gradient, and proves its convergence theoretically. Unlike other acceleration algorithms, C-Adam keeps the prediction gradient separate from the historical momentum and finds a more accurate search direction for the next iteration through one real gradient update. The proposed algorithm is validated on two commonly used benchmark data sets and on data from a static tensile failure test of 45 steel. The results show that C-Adam converges faster and reaches a smaller training loss than other popular algorithms.

Key words: first-order optimization algorithm, compound gradient descent method, Logistic regression, pattern recognition
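The abstract describes the idea only in prose; the paper itself gives the exact C-Adam update rule and its convergence proof. Purely as an illustration of the mechanism named above, the sketch below shows how an Adam-style step might combine the current gradient, a look-ahead "prediction" gradient evaluated after one real trial update, and the historical momentum. The function name, the 0.5/0.5 weighting, and all hyperparameter values are assumptions for this sketch, not the authors' formulation.

```python
import numpy as np

def c_adam_like_step(params, grad_fn, state, lr=1e-3,
                     beta1=0.9, beta2=0.999, eps=1e-8):
    """One illustrative Adam-style step: the search direction mixes the
    current gradient with a look-ahead gradient taken after a trial
    update, while the historical momentum is kept as a separate term.
    The equal weighting below is an assumption, not the paper's rule."""
    m, v, t = state["m"], state["v"], state["t"] + 1

    g = grad_fn(params)              # current gradient
    trial = params - lr * g          # one real gradient update (trial point)
    g_pred = grad_fn(trial)          # "prediction" gradient at the trial point

    g_comp = 0.5 * (g + g_pred)      # compound gradient, kept apart from momentum

    m = beta1 * m + (1 - beta1) * g_comp        # historical momentum (1st moment)
    v = beta2 * v + (1 - beta2) * g_comp ** 2   # 2nd moment -> adaptive step size

    m_hat = m / (1 - beta1 ** t)     # bias correction, as in standard Adam
    v_hat = v / (1 - beta2 ** t)

    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, {"m": m, "v": v, "t": t}

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
x = np.array([3.0, -2.0])
state = {"m": np.zeros_like(x), "v": np.zeros_like(x), "t": 0}
for _ in range(200):
    x, state = c_adam_like_step(x, lambda p: 2.0 * p, state, lr=0.05)
```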
