Towards theoretical performance limits of video parsing

A Hanjalic

Multimedia Computing

Research output: Contribution to journal › Article › Scientific › peer-review

13 Citations (Scopus)

Abstract

This paper unravels the problem of temporal video segmentation, or video parsing, and explores the possibilities for defining theoretical limits for the expected performance of a general parsing algorithm. In particular, we address the challenge of computing the coherence of video content, which is critical to the ability of an algorithm to parse a video automatically. If this coherence is difficult to extract from video data, it is unrealistic to expect that any parsing algorithm applied to that data will perform optimally with respect to the ground truth, independent of the features and approach used. The measure of coherence computability (CC) we introduce in this paper is derived from the average uncertainty in extracting the content-related information from data, which translates into the uncertainty for making a decision about boundary presence at a given time stamp of a video. We argue that the introduced CC measure is more powerful in revealing the true quality of a video parsing algorithm than the classical comparison of parsing results with the ground truth. We also discuss how this measure can be employed to characterize and compare video sequences in terms of the expected parsing performance, and to interpret and evaluate the obtained parsing results accordingly.

Original language	Undefined/Unknown
Pages (from-to)	261-272
Number of pages	12
Journal	IEEE Transactions on Circuits and Systems for Video Technology
Volume	17
Issue number	3
Publication status	Published - 2007

Keywords

Mathematics and Computer Science
Engineering
Applied Mathematics and Computer Science
academic journal papers
CWTS 0.75 <= JFIS < 2.00

Cite this

@article{54373b5c0a204b96ac4561507f34f300,
  title = "Towards theoretical performance limits of video parsing",
  abstract = "This paper unravels the problem of temporal video segmentation, or video parsing, and explores the possibilities for defining theoretical limits for the expected performance of a general parsing algorithm. In particular, we address the challenge of computing the coherence of video content, which is critical to the ability of an algorithm to parse a video automatically. If this coherence is difficult to extract from video data, it is unrealistic to expect that any parsing algorithm applied to that data will perform optimally with respect to the ground truth, independent of the features and approach used. The measure of coherence computability (CC) we introduce in this paper is derived from the average uncertainty in extracting the content-related information from data, which translates into the uncertainty for making a decision about boundary presence at a given time stamp of a video. We argue that the introduced CC measure is more powerful in revealing the true quality of a video parsing algorithm than the classical comparison of parsing results with the ground truth. We also discuss how this measure can be employed to characterize and compare video sequences in terms of the expected parsing performance, and to interpret and evaluate the obtained parsing results accordingly.",
  keywords = "Mathematics and Computer Science, Engineering, Applied Mathematics and Computer Science, academic journal papers, CWTS 0.75 <= JFIS < 2.00",
  author = "A Hanjalic",
  year = "2007",
  language = "Undefined/Unknown",
  volume = "17",
  pages = "261--272",
  journal = "IEEE Transactions on Circuits and Systems for Video Technology",
  issn = "1051-8215",
  publisher = "Institute of Electrical and Electronics Engineers (IEEE)",
  number = "3",
}

TY - JOUR

T1 - Towards theoretical performance limits of video parsing

AU - Hanjalic, A

PY - 2007

Y1 - 2007

AB - This paper unravels the problem of temporal video segmentation, or video parsing, and explores the possibilities for defining theoretical limits for the expected performance of a general parsing algorithm. In particular, we address the challenge of computing the coherence of video content, which is critical to the ability of an algorithm to parse a video automatically. If this coherence is difficult to extract from video data, it is unrealistic to expect that any parsing algorithm applied to that data will perform optimally with respect to the ground truth, independent of the features and approach used. The measure of coherence computability (CC) we introduce in this paper is derived from the average uncertainty in extracting the content-related information from data, which translates into the uncertainty for making a decision about boundary presence at a given time stamp of a video. We argue that the introduced CC measure is more powerful in revealing the true quality of a video parsing algorithm than the classical comparison of parsing results with the ground truth. We also discuss how this measure can be employed to characterize and compare video sequences in terms of the expected parsing performance, and to interpret and evaluate the obtained parsing results accordingly.

KW - Mathematics and Computer Science

KW - Engineering

KW - Applied Mathematics and Computer Science

KW - academic journal papers

KW - CWTS 0.75 <= JFIS < 2.00

UR - http://ieeexplore.ieee.org/iel5/76/4118229/04118239.pdf?isnumber=4118229&prod=JNL&arnumber=4118239&arSt=261&ared=272&arAuthor=Hanjalic%2C+A.

M3 - Article

SN - 1051-8215

VL - 17

SP - 261

EP - 272

JO - IEEE Transactions on Circuits and Systems for Video Technology

JF - IEEE Transactions on Circuits and Systems for Video Technology

IS - 3

ER -
