Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models

J. Warchocki, T. Oprescu, Y. Wang, A. Dămăcuș, P.M. Misterka, R. Bruintjes, A. Lengyel, O. Strafforello, J.C. van Gemert

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin, and where they end. Training and testing current state-of-the-art deep learning models requires access to large amounts of data and computational power. However, gathering such data is challenging and computational resources might be limited. This work explores and measures how current deep temporal action localization models perform in settings constrained by the amount of data or computational power. We measure data efficiency by training each model on a subset of the training set. We find that TemporalMaxer outperforms other models in data-limited settings. Furthermore, we recommend TriDet when training time is limited. To test the efficiency of the models during inference, we pass videos of different lengths through each model. We find that TemporalMaxer requires the least computational resources, likely due to its simple architecture.
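The abstract describes measuring data efficiency by training each model on subsets of the training set. A minimal sketch of that subsampling step might look like the following; the fraction values, seed, and `subsample` helper are illustrative assumptions, not the paper's actual protocol.

```python
import random

def subsample(train_set, fraction, seed=0):
    """Return a reproducible random subset containing `fraction` of the items.

    `train_set` is any sequence of training examples (e.g. video IDs);
    the fixed seed keeps subsets comparable across models.
    """
    rng = random.Random(seed)
    k = max(1, int(len(train_set) * fraction))
    return rng.sample(list(train_set), k)

# Hypothetical sweep over training-set fractions (values assumed for illustration).
train_videos = [f"video_{i:03d}" for i in range(100)]
for fraction in (0.1, 0.25, 0.5, 1.0):
    subset = subsample(train_videos, fraction)
    # A model would be trained on `subset` here; only the subsampling is shown.
    print(f"fraction={fraction}: {len(subset)} videos")
```

Each model would then be trained once per fraction and evaluated on the full test set, so that performance can be plotted against the amount of training data.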
Original language: English
Title of host publication: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops
Pages: 3008-3016
Number of pages: 9
Publication status: Published - 2023
Event: ICCV 2023: International Conference on Computer Vision - Paris, France
Duration: 2 Oct 2023 – 6 Oct 2023

Conference

Conference: ICCV 2023: International Conference on Computer Vision
Country/Territory: France
City: Paris
Period: 2/10/23 – 6/10/23

