Journal of Northeastern University (Natural Science), 2021, Vol. 42, Issue 11: 1540-1546. DOI: 10.12068/j.issn.1005-3026.2021.11.004

• Information & Control •

一种基于深度学习的实时视频图像背景替换方法

谢天植, 雷为民, 张伟, 李志远   

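  1. (School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110169)
  • Revised: 2021-01-27 Accepted: 2021-01-27 Published: 2021-11-19
  • Corresponding author: XIE Tian-zhi
  • About the authors: XIE Tian-zhi (1996-), male, native of Shijiazhuang, Hebei, master's student at Northeastern University; LEI Wei-min (1969-), male, native of Pingyao, Shanxi, professor and doctoral supervisor at Northeastern University.
  • Funding:
    National Key Research and Development Program of China (2018YFB1702000); Fundamental Research Funds for the Central Universities (N2016014).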

A Real-Time Video Image Background Replacement Method Based on Deep Learning

XIE Tian-zhi, LEI Wei-min, ZHANG Wei, LI Zhi-yuan   

  1. School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China.
  • Revised: 2021-01-27 Accepted: 2021-01-27 Published: 2021-11-19
  • Contact: ZHANG Wei

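Abstract: To meet the real-time requirements of video conversation services, a lightweight deep learning network model is proposed that replaces the background of video images in real time. The model comprises two modules, semantic segmentation and background replacement. The semantic segmentation module as a whole follows an encoder-decoder structure: the encoder side extracts features with an encoder module, a dilated convolution pyramid pooling module, an attention module, and a gain module; the decoder side restores the image with a decoder module, an adjustment module, and an encoder module, and the result is passed to the background replacement module, which completes the replacement. After training on the dataset defined in this paper, the model reaches a segmentation accuracy of 94.1% and a segmentation speed of 42.5 frames/s, striking a good balance between real-time performance and accuracy and showing good practical value.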

Keywords: real-time video image; background replacement; deep learning; semantic segmentation; encoder-decoder structure

Abstract: To meet the real-time requirements of video session services, a lightweight deep learning network model is proposed for real-time background replacement in video images. The model consists of two modules: semantic segmentation and background replacement. The semantic segmentation module adopts an encoder-decoder architecture. On the encoder side, an encoder module, a dilated convolution pyramid pooling module, an attention module, and a gain module extract features. On the decoder side, a decoder module, an adjustment module, and an encoder module restore the image, which is then passed to the background replacement module to complete the replacement. After training on the dataset defined in this paper, the model reaches a segmentation accuracy of 94.1% and a segmentation speed of 42.5 frames/s, striking a good balance between real-time performance and accuracy and making it well suited to practical use.
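The abstract does not spell out how the background replacement module combines the segmentation result with the new background. A minimal sketch of one common approach, per-pixel alpha compositing with the foreground mask, is given below. The function names `replace_background` and `frame_budget_ms`, the mask convention (1 = foreground), and the use of NumPy are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def replace_background(frame, mask, background):
    """Composite `frame` over `background` using a foreground mask.

    Illustrative only: the paper's actual replacement module may differ.
    frame, background: uint8 arrays of shape (H, W, 3).
    mask: array of shape (H, W) with values in [0, 1], 1 = foreground.
    """
    if frame.shape != background.shape:
        raise ValueError("frame and background must have the same shape")
    if mask.shape != frame.shape[:2]:
        raise ValueError("mask must match the frame's height and width")

    # Add a channel axis so the mask broadcasts over the colour channels.
    alpha = np.clip(mask.astype(np.float32), 0.0, 1.0)[..., None]
    blended = (alpha * frame.astype(np.float32)
               + (1.0 - alpha) * background.astype(np.float32))
    return np.rint(blended).astype(np.uint8)


def frame_budget_ms(fps):
    """Average time available per frame, in milliseconds, at a given rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    frame = np.full((2, 2, 3), 200, dtype=np.uint8)      # bright "person" frame
    background = np.zeros((2, 2, 3), dtype=np.uint8)     # black replacement
    mask = np.array([[1.0, 0.0], [0.5, 0.0]])            # soft edge at (1, 0)
    print(replace_background(frame, mask, background)[..., 0])
    print(round(frame_budget_ms(42.5), 1), "ms per frame at 42.5 frames/s")
```

A soft mask (values between 0 and 1) blends the boundary pixels instead of cutting them hard, which tends to hide small segmentation errors around hair and shoulders. The second helper only restates the reported speed as a time budget: at 42.5 frames/s, segmentation and compositing together have about 23.5 ms per frame on average.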

Key words: real-time video image; background replacement; deep learning; semantic segmentation; encoder-decoder structure

CLC number: