Learning object deformation and motion adaption for semi-supervised video object segmentation

Xiaoyang Zheng, Xin Tan, Jianming Guo, Lizhuang Ma

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

We propose a novel method to solve the task of semi-supervised video object segmentation in this paper, where the mask annotation is only given at the first frame of the video sequence. A mask-propagation-based model is applied to learn the past and current information for segmentation. Besides, due to the scarcity of training data, image/mask pairs that model object deformation and shape variance are generated for the training phase. In addition, we generate the key flips between two adjacent frames for motion adaptation. The method works in an end-to-end way, without any online fine-tuning on test videos. Extensive experiments demonstrate that our method achieves competitive performance against state-of-the-art methods on benchmark datasets, covering cases with single object or multiple objects. We also conduct extensive ablation experiments to analyze the effectiveness of our proposed method.

Original languageEnglish
Title of host publicationProceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages8655-8662
Number of pages8
ISBN (Electronic)9781728188089
DOIs
StatePublished - 2020
Externally publishedYes
Event25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy
Duration: 10 Jan 202115 Jan 2021

Publication series

NameProceedings - International Conference on Pattern Recognition
ISSN (Print)1051-4651

Conference

Conference25th International Conference on Pattern Recognition, ICPR 2020
Country/TerritoryItaly
CityVirtual, Milan
Period10/01/2115/01/21

Keywords

  • Motion adaption
  • Object deformation
  • Semi-supervision
  • Video object segmentation

Fingerprint

Dive into the research topics of 'Learning object deformation and motion adaption for semi-supervised video object segmentation'. Together they form a unique fingerprint.

Cite this