SipMaskv2: Enhanced Fast Image and Video Instance Segmentation
- Cao, Jiale (author)
- School of Electrical and Information Engineering, Tianjin University, Tianjin, China
- Pang, Yanwei (author)
- School of Electrical and Information Engineering, Tianjin University, Tianjin, China
- Anwer, Rao Muhammad (author)
- Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
- Cholakkal, Hisham (author)
- Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
- Khan, Fahad Shahbaz, 1983- (author)
- Linköping University, Computer Vision, Faculty of Science and Engineering; Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
- Shao, Ling (author)
- Terminus Group, Beijing, China
- IEEE, 2023
- 2023
- English.
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence. IEEE. ISSN 0162-8828, 1939-3539, 2160-9292. 45:3, pp. 3798-3812
- Related links:
- https://urn.kb.se/re...
- https://doi.org/10.1...
Abstract
- We propose a fast single-stage method for both image and video instance segmentation, called SipMask, that preserves instance spatial information by performing multiple sub-region mask predictions. The main module in our method is a lightweight spatial preservation (SP) module that generates a separate set of spatial coefficients for each sub-region within a bounding box, enabling better delineation of spatially adjacent instances. To better correlate mask prediction with object detection, we further propose a mask alignment weighting loss and a feature alignment scheme. In addition, we identify two issues that impede the performance of single-stage instance segmentation and introduce two modules, a sample selection scheme and an instance refinement module, to address them. Experiments are performed on the image instance segmentation dataset MS COCO and the video instance segmentation dataset YouTube-VIS. On the MS COCO test-dev set, our method achieves state-of-the-art performance. In terms of real-time capabilities, it outperforms YOLACT by 3.0% (mask AP) under similar settings, while operating at a comparable speed. On the YouTube-VIS validation set, our method also achieves promising results. The source code is available at https://github.com/JialeCao001/SipMask.
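The sub-region idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the repository linked above for that); it only assumes that a mask is formed as a linear combination of shared basis maps, and that each cell of a 2x2 grid inside the bounding box uses its own coefficient vector. The names `assemble_mask`, `basis`, `coeffs`, `grid`, and `threshold` are hypothetical, chosen for illustration.

```python
import math


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def assemble_mask(basis, coeffs, box, grid=(2, 2), threshold=0.5):
    """Build a binary mask from basis maps and per-sub-region coefficients.

    basis  : list of K maps, each an H x W nested list of floats (assumed input)
    coeffs : coeffs[r][c] is a list of K floats for grid cell (r, c) (assumed input)
    box    : (x0, y0, x1, y1) in pixels, end-exclusive
    """
    height, width = len(basis[0]), len(basis[0][0])
    x0, y0, x1, y1 = box
    rows, cols = grid
    mask = [[0] * width for _ in range(height)]
    for y in range(max(y0, 0), min(y1, height)):
        # Which grid row of the box this pixel falls into.
        r = min((y - y0) * rows // (y1 - y0), rows - 1)
        for x in range(max(x0, 0), min(x1, width)):
            # Which grid column of the box this pixel falls into.
            c = min((x - x0) * cols // (x1 - x0), cols - 1)
            # Combine the basis maps with this sub-region's own coefficients.
            logit = sum(w * b[y][x] for w, b in zip(coeffs[r][c], basis))
            mask[y][x] = 1 if sigmoid(logit) > threshold else 0
    return mask


# Toy usage: one constant basis map, opposite coefficients on the two diagonals.
basis = [[[1.0] * 4 for _ in range(4)]]
coeffs = [[[2.0], [-2.0]], [[-2.0], [2.0]]]
mask = assemble_mask(basis, coeffs, (0, 0, 4, 4))
# mask == [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
```

With a single coefficient vector per box, the toy input above could only yield an all-foreground or all-background mask; per-sub-region coefficients let the prediction differ between quadrants, which is the intuition behind separating spatially adjacent instances.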
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Vision and Robotics (hsv//eng)
Keywords
- Image instance segmentation; video instance segmentation; real-time; single-stage method; spatial information preservation
Publication and content type
- ref (subject category)
- art (subject category)