FlowVOS: Weakly-Supervised Visual Warping for Detail-Preserving and Temporally Consistent Single-Shot Video Object Segmentation (BMVC 2021)

Abstract
We consider the task of semi-supervised video object segmentation (VOS). Our approach mitigates shortcomings in previous VOS work by addressing detail preservation and temporal consistency using visual warping. In contrast to prior work that uses full optical flow, we introduce a new foreground-targeted visual warping approach that learns flow fields from VOS data. We train a flow module to capture detailed motion between frames using two weakly-supervised losses. Our object-focused approach of warping previous foreground object masks to their positions in the target frame enables detailed mask refinement with fast runtimes without using extra flow supervision. It can also be integrated directly into state-of-the-art segmentation networks. On the DAVIS17 and YouTubeVOS benchmarks, we outperform state-of-the-art offline methods that do not use extra data, as well as many online methods that use extra data. Qualitatively, we also show our approach produces segmentations with high detail and temporal consistency.
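To make the mask-warping step concrete, here is a minimal NumPy sketch of backward-warping a previous-frame foreground mask into the target frame with a dense flow field. It is an illustration only, not the paper's implementation: in FlowVOS the flow field comes from a learned flow module, whereas here it is supplied by hand. The function name `warp_mask`, the flow convention (each target pixel stores the offset to its source location in the previous frame), the bilinear sampling, and the border clamping are all assumptions made for this example.

```python
import numpy as np


def warp_mask(prev_mask, flow):
    """Backward-warp a previous-frame mask into the target frame.

    prev_mask: (H, W) float array, foreground probability in the previous frame.
    flow:      (H, W, 2) float array; flow[y, x] = (dx, dy) is the assumed offset
               from target pixel (x, y) to its source location in the previous frame.
    Returns an (H, W) float array sampled bilinearly from prev_mask.
    """
    h, w = prev_mask.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates in the previous frame, clamped to the image border
    # (border clamping is an assumption of this sketch).
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source location.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Bilinear interpolation weights.
    wx = src_x - x0
    wy = src_y - y0

    top = prev_mask[y0, x0] * (1 - wx) + prev_mask[y0, x1] * wx
    bottom = prev_mask[y1, x0] * (1 - wx) + prev_mask[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


# Toy example: a 2x2 foreground square that moves one pixel to the right.
prev_mask = np.zeros((5, 5), dtype=float)
prev_mask[1:3, 1:3] = 1.0

# Each target pixel looks one pixel to the left in the previous frame.
flow = np.zeros((5, 5, 2), dtype=float)
flow[..., 0] = -1.0

warped = warp_mask(prev_mask, flow)
```

In this toy case the warped mask is the original square shifted one column to the right. A learned flow field would instead vary per pixel, which is what lets warping carry fine object detail from one frame to the next.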

Paper | Code (Coming Soon)
Supplementary Material
Citation (bibtex)
@InProceedings{gong2021flowvos,
author = {Gong, Julia and Holsinger, F. Christopher and Yeung, Serena},
title = {FlowVOS: Weakly-Supervised Visual Warping for Detail-Preserving and Temporally Consistent Single-Shot Video Object Segmentation},
booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
month = {November},
year = {2021}
}
Acknowledgements

This work was completed at Stanford MARVL. It was made possible by the Isackson Family Fund for Research in Head and Neck Surgery.