Objects do not disappear: Video object detection by single-frame object location anticipation

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyframe. 2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames. Because neighboring video frames are often redundant, we only compute features for a single static keyframe and predict object locations in subsequent frames. 3) Reduced annotation cost, where we only annotate the keyframe and use smooth pseudo-motion between keyframes. We demonstrate computational efficiency, annotation efficiency, and improved mean average precision compared to the state-of-the-art on four datasets: ImageNet VID, EPIC KITCHENS-55, YouTube-BoundingBoxes and Waymo Open dataset. Our source code is available at https://github.com/L-KID/Video-object-detection-by-location-anticipation.
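The third point, pseudo-motion between annotated keyframes, can be illustrated with a minimal sketch. The abstract does not say how the smooth pseudo-motion is generated, so the linear interpolation below is an assumption, and `interpolate_boxes` and the `Box` layout are hypothetical names chosen for illustration, not part of the released code.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(start: Box, end: Box, num_intermediate: int) -> List[Box]:
    """Return boxes for the frames strictly between two annotated keyframes.

    Linear interpolation is an assumed stand-in for "smooth pseudo-motion";
    the paper may use a different scheme.
    """
    if num_intermediate < 0:
        raise ValueError("num_intermediate must be non-negative")
    boxes: List[Box] = []
    for i in range(1, num_intermediate + 1):
        # Fraction of the way from the first keyframe to the second.
        t = i / (num_intermediate + 1)
        boxes.append(tuple((1 - t) * s + t * e for s, e in zip(start, end)))
    return boxes


if __name__ == "__main__":
    # Two annotated keyframes with three unannotated frames between them.
    for box in interpolate_boxes((0, 0, 10, 10), (4, 8, 14, 18), 3):
        print(box)
```

Under this assumption, only the two keyframe boxes need manual annotation; the intermediate boxes serve as pseudo-labels for the frames in between.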
Original language: English
Title of host publication: Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Editors: Cristina Ceballos
Place of Publication: Piscataway
Number of pages: 12
ISBN (Electronic): 979-8-3503-0718-4
ISBN (Print): 979-8-3503-0719-1
Publication status: Published - 2023
Event: 2023 IEEE/CVF International Conference on Computer Vision (ICCV) - Paris, France
Duration: 1 Oct 2023 to 6 Oct 2023

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print): 1550-5499


Conference: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

