PUNet: Temporal Action Proposal Generation With Positive Unlabeled Learning Using Key Frame Annotations

Noor ul Sehr Zia; Osman Semih Kayhan; Jan van Gemert

doi:10.1109/ICIP42928.2021.9506012

Pattern Recognition and Bioinformatics

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Popular approaches to classifying action segments in long, realistic, untrimmed videos start with high-quality action proposals. Current action proposal methods based on deep learning are trained on labeled video segments. Obtaining annotated segments for untrimmed videos is time-consuming, expensive, and error-prone, as annotated temporal action boundaries are imprecise, subjective, and inconsistent. Embracing this uncertainty, we explore how to significantly speed up temporal annotation by using just a single key frame label for each action instance instead of the inherently imprecise start and end frames. To tackle the class imbalance caused by using only a single frame, we evaluate an extremely simple Positive-Unlabeled algorithm (PU-learning). We demonstrate on THUMOS’14 and ActivityNet that using a single key frame label gives good results while being significantly faster to annotate. In addition, we show that our simple method, PUNet, is data-efficient, which further reduces the need for expensive annotations.

Original language	English
Title of host publication	2021 IEEE International Conference on Image Processing (ICIP)
Subtitle of host publication	Proceedings
Place of Publication	Piscataway
Publisher	IEEE
Pages	2598-2602
Number of pages	5
ISBN (Electronic)	978-1-6654-4115-5
ISBN (Print)	978-1-6654-3102-6
DOIs	https://doi.org/10.1109/ICIP42928.2021.9506012
Publication status	Published - 2021
Event	2021 IEEE International Conference on Image Processing (ICIP) - Virtual at Anchorage, United States
Duration	19 Sept 2021 → 22 Sept 2021

Conference

Conference	2021 IEEE International Conference on Image Processing (ICIP)
Country/Territory	United States
City	Virtual at Anchorage
Period	19/09/21 → 22/09/21

Keywords

Proposal Generation
Action Localization
Positive-Unlabeled Learning

Cite this

@inproceedings{65c551db46c842a68227d299f9697f30,

title = "PUNet: Temporal Action Proposal Generation With Positive Unlabeled Learning Using Key Frame Annotations",

abstract = "Popular approaches to classifying action segments in long, realistic, untrimmed videos start with high-quality action proposals. Current action proposal methods based on deep learning are trained on labeled video segments. Obtaining annotated segments for untrimmed videos is time-consuming, expensive, and error-prone, as annotated temporal action boundaries are imprecise, subjective, and inconsistent. Embracing this uncertainty, we explore how to significantly speed up temporal annotation by using just a single key frame label for each action instance instead of the inherently imprecise start and end frames. To tackle the class imbalance caused by using only a single frame, we evaluate an extremely simple Positive-Unlabeled algorithm (PU-learning). We demonstrate on THUMOS{\textquoteright}14 and ActivityNet that using a single key frame label gives good results while being significantly faster to annotate. In addition, we show that our simple method, PUNet, is data-efficient, which further reduces the need for expensive annotations.",

keywords = "Proposal Generation, Action Localization, Positive-Unlabeled Learning",

author = "Zia, {Noor ul Sehr} and Kayhan, {Osman Semih} and {van Gemert}, Jan",

year = "2021",

doi = "10.1109/ICIP42928.2021.9506012",

language = "English",

isbn = "978-1-6654-3102-6",

pages = "2598--2602",

booktitle = "2021 IEEE International Conference on Image Processing (ICIP)",

publisher = "IEEE",

address = "Piscataway",

note = "2021 IEEE International Conference on Image Processing (ICIP) ; Conference date: 19-09-2021 Through 22-09-2021",

}

Zia, NUS, Kayhan, OS & van Gemert, J 2021, PUNet: Temporal Action Proposal Generation With Positive Unlabeled Learning Using Key Frame Annotations. in 2021 IEEE International Conference on Image Processing (ICIP): Proceedings., 9506012, IEEE, Piscataway, pp. 2598-2602, 2021 IEEE International Conference on Image Processing (ICIP), Virtual at Anchorage, United States, 19/09/21. https://doi.org/10.1109/ICIP42928.2021.9506012

PUNet: Temporal Action Proposal Generation With Positive Unlabeled Learning Using Key Frame Annotations. / Zia, Noor ul Sehr ; Kayhan, Osman Semih ; van Gemert, Jan.
2021 IEEE International Conference on Image Processing (ICIP): Proceedings. Piscataway: IEEE, 2021. p. 2598-2602 9506012.

TY - GEN

T1 - PUNet: Temporal Action Proposal Generation With Positive Unlabeled Learning Using Key Frame Annotations

T2 - 2021 IEEE International Conference on Image Processing (ICIP)

AU - Zia, Noor ul Sehr

AU - Kayhan, Osman Semih

AU - van Gemert, Jan

PY - 2021

Y1 - 2021

N2 - Popular approaches to classifying action segments in long, realistic, untrimmed videos start with high-quality action proposals. Current action proposal methods based on deep learning are trained on labeled video segments. Obtaining annotated segments for untrimmed videos is time-consuming, expensive, and error-prone, as annotated temporal action boundaries are imprecise, subjective, and inconsistent. Embracing this uncertainty, we explore how to significantly speed up temporal annotation by using just a single key frame label for each action instance instead of the inherently imprecise start and end frames. To tackle the class imbalance caused by using only a single frame, we evaluate an extremely simple Positive-Unlabeled algorithm (PU-learning). We demonstrate on THUMOS’14 and ActivityNet that using a single key frame label gives good results while being significantly faster to annotate. In addition, we show that our simple method, PUNet, is data-efficient, which further reduces the need for expensive annotations.

AB - Popular approaches to classifying action segments in long, realistic, untrimmed videos start with high-quality action proposals. Current action proposal methods based on deep learning are trained on labeled video segments. Obtaining annotated segments for untrimmed videos is time-consuming, expensive, and error-prone, as annotated temporal action boundaries are imprecise, subjective, and inconsistent. Embracing this uncertainty, we explore how to significantly speed up temporal annotation by using just a single key frame label for each action instance instead of the inherently imprecise start and end frames. To tackle the class imbalance caused by using only a single frame, we evaluate an extremely simple Positive-Unlabeled algorithm (PU-learning). We demonstrate on THUMOS’14 and ActivityNet that using a single key frame label gives good results while being significantly faster to annotate. In addition, we show that our simple method, PUNet, is data-efficient, which further reduces the need for expensive annotations.

KW - Proposal Generation

KW - Action Localization

KW - Positive-Unlabeled Learning

UR - http://www.scopus.com/inward/record.url?scp=85125582227&partnerID=8YFLogxK

U2 - 10.1109/ICIP42928.2021.9506012

DO - 10.1109/ICIP42928.2021.9506012

M3 - Conference contribution

SN - 978-1-6654-3102-6

SP - 2598

EP - 2602

BT - 2021 IEEE International Conference on Image Processing (ICIP)

PB - IEEE

CY - Piscataway

Y2 - 19 September 2021 through 22 September 2021

ER -
