
基于动态采样对偶可变形网络的实时视频实例分割 (Real-time video instance segmentation based on a dynamic sampling dual deformable network)

  • Yiran Song
  • Qianyu Zhou
  • Zhiwen Shao
  • Ran Yi
  • Lizhuang Ma*
  • *Corresponding author for this work
  • Shanghai Jiao Tong University
  • China University of Mining and Technology

Research output: Contribution to journal, article, peer-reviewed

Abstract

The dynamic sampling dual deformable network (DSDDN) was proposed to speed up inference in video instance segmentation by making better use of the temporal information across video frames. A dynamic sampling strategy adjusted the sampling policy according to the similarity between consecutive frames. When a frame was highly similar to its predecessor, full inference on that frame was skipped and the segmentation result of the preceding frame was transferred to it with a lightweight computation. When similarity was low, frames spanning a larger temporal range were dynamically aggregated to enrich the information available for the current frame. Two deformable operations were incorporated into the Transformer structure to avoid the quadratic computational cost of attention-based methods. The complex network was optimized with carefully designed tracking heads and loss functions. The proposed method achieved an accuracy of 39.1% mAP at an inference speed of 40.2 frames per second on the YouTube-VIS dataset, which validates that the approach strikes a favorable balance between accuracy and speed for real-time video instance segmentation.
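
The sampling policy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how similarity is measured, what threshold is used, or how wide the aggregation window is, so the cosine similarity on flattened frames, the `threshold` value, the `max_span` window, and the function names below are all assumptions made for illustration. The sketch only decides, per frame, whether to propagate the previous result or to run full inference with a set of reference frames.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[float]  # assumption: a frame flattened to a vector of values


def cosine_similarity(a: Frame, b: Frame) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def plan_inference(
    frames: Sequence[Frame],
    threshold: float = 0.9,  # hypothetical similarity cut-off
    max_span: int = 3,       # hypothetical width of the aggregation window
) -> List[Tuple[str, List[int]]]:
    """Return one (action, reference_frame_indices) pair per frame.

    "propagate": skip full inference and reuse the previous frame's result.
    "full":      run full inference, aggregating the listed earlier frames.
    """
    plan: List[Tuple[str, List[int]]] = []
    for t, frame in enumerate(frames):
        if t == 0:
            # The first frame has no predecessor, so it always gets full inference.
            plan.append(("full", []))
            continue
        if cosine_similarity(frames[t - 1], frame) >= threshold:
            # High similarity: transfer the preceding frame's segmentation.
            plan.append(("propagate", [t - 1]))
        else:
            # Low similarity: gather frames over a wider temporal span.
            refs = list(range(max(0, t - max_span), t))
            plan.append(("full", refs))
    return plan


frames = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.0, 1.0]]
for index, decision in enumerate(plan_inference(frames, threshold=0.9, max_span=2)):
    print(index, decision)
```

In this toy input the second and fourth frames nearly repeat their predecessors and are marked for propagation, while the third frame differs sharply and is marked for full inference with the two preceding frames as references. A real system would compute similarity on learned features and would replace the plan entries with calls to the segmentation network.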

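Translated title of the contribution: Dynamic sampling dual deformable network for online video instance segmentation
Original language: Chinese (Traditional)
Pages (from-to): 247-256
Number of pages: 10
Journal: Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science)
Volume: 58
Issue number: 2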
Publication status: Published - Feb 2024
Externally published

Keywords

  • dual deformable network
  • dynamic network
  • instance segmentation
  • online inference
  • video
