Exploring Retrospective Annotation in Long-videos for Emotion Recognition

Patricia Bota; Pablo Cesar; Ana Fred; Hugo Placido da Silva

doi:10.1109/TAFFC.2024.3359706

Multimedia Computing

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Emotion Recognition systems are typically trained to classify a given psychophysiological state into emotion categories. Current platforms for emotion ground-truth collection show limitations for real-world scenarios of long-duration content (e.g., > 10m), namely: 1) Real-time annotation tools are distracting and become exhausting in a longer video; 2) Perform retrospective annotation of the whole content in bulk (providing highly coarse annotations); or 3) Are performed by external experts (depending on the number of annotators and their subjective experience). We explore a novel approach, the EmotiphAI Annotator, that allows undisturbed content visualisation and simplifies the annotation process by using segmentation algorithms that select brief clips for emotional annotation retrospectively. We compare three methods for content segmentation based on physiological data (Electrodermal Activity (EDA), emotion-based), scene (time-based), and random (control) selection. The EmotiphAI Annotator attained a B+ System Usability Scale score and low-average mental workload as per the NASA Task Load Index (40%). The reliability of the self-report was analysed by the inter-rater agreement (STD < 0.75), coherence across time segmentation methods (STD < 0.17), comparison against the SoA ground-truth (STD < 0.7), and correlation to EDA (> 0.3 to 0.8), where the method based on EDA obtained the overall best performance.

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: IEEE Transactions on Affective Computing
DOI: https://doi.org/10.1109/TAFFC.2024.3359706
Publication status: Accepted/In press - 2024

Keywords

Emotion recognition
Annotation
Physiological signals
Retrospective

Cite this

@article{cd631445c913465790db35f374aa771b,

title = "Exploring Retrospective Annotation in Long-videos for Emotion Recognition",

abstract = "Emotion Recognition systems are typically trained to classify a given psychophysiological state into emotion categories. Current platforms for emotion ground-truth collection show limitations for real-world scenarios of long-duration content (e.g., > 10m), namely: 1) Real-time annotation tools are distracting and become exhausting in a longer video; 2) Perform retrospective annotation of the whole content in bulk (providing highly coarse annotations); or 3) Are performed by external experts (depending on the number of annotators and their subjective experience). We explore a novel approach, the EmotiphAI Annotator, that allows undisturbed content visualisation and simplifies the annotation process by using segmentation algorithms that select brief clips for emotional annotation retrospectively. We compare three methods for content segmentation based on physiological data (Electrodermal Activity (EDA), emotion-based), scene (time-based), and random (control) selection. The EmotiphAI Annotator attained a B+ System Usability Scale score and low-average mental workload as per the NASA Task Load Index (40%). The reliability of the self-report was analysed by the inter-rater agreement (STD < 0.75), coherence across time segmentation methods (STD < 0.17), comparison against the SoA ground-truth (STD < 0.7), and correlation to EDA (> 0.3 to 0.8), where the method based on EDA obtained the overall best performance.",

keywords = "Emotion recognition, Annotation, Physiological signals, Retrospective",

author = "Patricia Bota and Pablo Cesar and Ana Fred and {da Silva}, {Hugo Placido}",

year = "2024",

doi = "10.1109/TAFFC.2024.3359706",

language = "English",

pages = "1--12",

journal = "IEEE Transactions on Affective Computing",

issn = "1949-3045",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

}

TY - JOUR

T1 - Exploring Retrospective Annotation in Long-videos for Emotion Recognition

AU - Bota, Patricia

AU - Cesar, Pablo

AU - Fred, Ana

AU - da Silva, Hugo Placido

PY - 2024

Y1 - 2024

N2 - Emotion Recognition systems are typically trained to classify a given psychophysiological state into emotion categories. Current platforms for emotion ground-truth collection show limitations for real-world scenarios of long-duration content (e.g., > 10m), namely: 1) Real-time annotation tools are distracting and become exhausting in a longer video; 2) Perform retrospective annotation of the whole content in bulk (providing highly coarse annotations); or 3) Are performed by external experts (depending on the number of annotators and their subjective experience). We explore a novel approach, the EmotiphAI Annotator, that allows undisturbed content visualisation and simplifies the annotation process by using segmentation algorithms that select brief clips for emotional annotation retrospectively. We compare three methods for content segmentation based on physiological data (Electrodermal Activity (EDA), emotion-based), scene (time-based), and random (control) selection. The EmotiphAI Annotator attained a B+ System Usability Scale score and low-average mental workload as per the NASA Task Load Index (40%). The reliability of the self-report was analysed by the inter-rater agreement (STD < 0.75), coherence across time segmentation methods (STD < 0.17), comparison against the SoA ground-truth (STD < 0.7), and correlation to EDA (> 0.3 to 0.8), where the method based on EDA obtained the overall best performance.

AB - Emotion Recognition systems are typically trained to classify a given psychophysiological state into emotion categories. Current platforms for emotion ground-truth collection show limitations for real-world scenarios of long-duration content (e.g., > 10m), namely: 1) Real-time annotation tools are distracting and become exhausting in a longer video; 2) Perform retrospective annotation of the whole content in bulk (providing highly coarse annotations); or 3) Are performed by external experts (depending on the number of annotators and their subjective experience). We explore a novel approach, the EmotiphAI Annotator, that allows undisturbed content visualisation and simplifies the annotation process by using segmentation algorithms that select brief clips for emotional annotation retrospectively. We compare three methods for content segmentation based on physiological data (Electrodermal Activity (EDA), emotion-based), scene (time-based), and random (control) selection. The EmotiphAI Annotator attained a B+ System Usability Scale score and low-average mental workload as per the NASA Task Load Index (40%). The reliability of the self-report was analysed by the inter-rater agreement (STD < 0.75), coherence across time segmentation methods (STD < 0.17), comparison against the SoA ground-truth (STD < 0.7), and correlation to EDA (> 0.3 to 0.8), where the method based on EDA obtained the overall best performance.

KW - Emotion recognition

KW - Annotation

KW - Physiological signals

KW - Retrospective

UR - http://www.scopus.com/inward/record.url?scp=85184323608&partnerID=8YFLogxK

U2 - 10.1109/TAFFC.2024.3359706

DO - 10.1109/TAFFC.2024.3359706

M3 - Article

AN - SCOPUS:85184323608

SN - 1949-3045

SP - 1

EP - 12

JO - IEEE Transactions on Affective Computing

JF - IEEE Transactions on Affective Computing

ER -
