Investigating transformers in the decomposition of polygonal shapes as point collections

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Transformers can generate predictions in two ways: (1) auto-regressively, conditioning each sequence element on the previous ones, or (2) by producing the entire output sequence in parallel. While research has mostly explored this difference on sequential tasks in NLP, we study the difference between auto-regressive and parallel prediction on visual set prediction tasks, in particular on polygonal shapes in images, because polygons are representative of numerous object types, such as buildings or obstacles for aerial vehicles. This is challenging for deep learning architectures because a polygon can consist of a varying cardinality of points. We provide evidence of the importance of natural orders for Transformers, and show the benefit of decomposing complex polygons into collections of points in an auto-regressive manner.
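The two prediction modes contrasted in the abstract can be illustrated with a toy, non-learned decoder (this is a hypothetical sketch, not the paper's Transformer): auto-regressive decoding emits one polygon vertex at a time, each conditioned on the prefix, while parallel decoding produces all vertices in one shot.

```python
# Toy illustration of auto-regressive vs. parallel vertex prediction.
# The "model" here is a hypothetical stand-in: it predicts the next
# polygon vertex as the previous vertex shifted by a fixed offset.

def toy_next_point(prefix):
    # Stand-in for one learned decoder step: returns the next (x, y)
    # vertex given the vertices generated so far.
    x, y = prefix[-1]
    return (x + 1.0, y + 0.5)

def autoregressive_decode(start, n_points):
    # Mode 1: each vertex is conditioned on all previously emitted ones.
    points = [start]
    for _ in range(n_points - 1):
        points.append(toy_next_point(points))
    return points

def parallel_decode(start, n_points):
    # Mode 2: all vertices are produced at once; the toy model's closed
    # form lets every position be computed independently of the others.
    x0, y0 = start
    return [(x0 + i * 1.0, y0 + i * 0.5) for i in range(n_points)]

# For this deterministic toy model both modes yield the same polygon;
# with a real learned decoder, the two modes generally differ.
ar = autoregressive_decode((0.0, 0.0), 4)
par = parallel_decode((0.0, 0.0), 4)
assert ar == par
```

Note that the sequence length (the polygon's point cardinality) is an explicit argument here; handling a varying number of points per shape is exactly what makes the task challenging for fixed-output architectures.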
Original language: English
Title of host publication: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Subtitle of host publication: Proceedings
Editors: L. O'Conner
Place of publication: Piscataway
Publisher: IEEE
Pages: 2076-2085
Number of pages: 10
ISBN (Electronic): 978-1-6654-0191-3
ISBN (Print): 978-1-6654-0192-0
DOIs
Publication status: Published - 2021
Event: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) - Virtual at Montreal, Canada
Duration: 11 Oct 2021 - 17 Oct 2021

Conference

Conference: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Country: Canada
City: Virtual at Montreal
Period: 11/10/21 - 17/10/21
