Multimodal Joint Head Orientation Estimation in Interacting Groups via Proxemics and Interaction Dynamics

Stephanie Tan; David M.J. Tax; Hayley Hung

doi:10.1145/3448122

Pattern Recognition and Bioinformatics

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Human head orientation estimation has been of interest because head orientation serves as a cue to directed social attention. Most existing approaches rely on visual and high-fidelity sensor inputs and deep learning strategies that do not consider the social context of unstructured and crowded mingling scenarios. We show that alternative inputs, like speaking status, body location, orientation, and acceleration contribute towards head orientation estimation. These are especially useful in crowded and in-the-wild settings where visual features are either uninformative due to occlusions or prohibitive to acquire due to physical space limitations and concerns of ecological validity. We argue that head orientation estimation in such social settings needs to account for the physically evolving interaction space formed by all the individuals in the group. To this end, we propose an LSTM-based head orientation estimation method that combines the hidden representations of the group members. Our framework jointly predicts head orientations of all group members and is applicable to groups of different sizes. We explain the contribution of different modalities to model performance in head orientation estimation. The proposed model outperforms baseline methods that do not explicitly consider the group context, and generalizes to an unseen dataset from a different social event.
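The abstract says the method combines the hidden representations of the group members and applies to groups of different sizes. One way this could work is sketched below, purely as an illustration: each member's hidden vector is concatenated with the mean of the other members' vectors. The mean pooling, the function name `pool_others`, and the plain-list vectors are assumptions made for this sketch and are not taken from the paper, which may combine representations differently.

```python
from typing import List

Vector = List[float]


def pool_others(hidden: List[Vector]) -> List[Vector]:
    """Concatenate each member's hidden vector with the mean of the others'.

    Hypothetical helper: mean pooling is an assumed combination rule,
    chosen because it is independent of group size and member order.
    """
    n = len(hidden)
    if n == 0:
        return []
    dim = len(hidden[0])
    # Per-dimension sum over all members, computed once.
    totals = [sum(h[d] for h in hidden) for d in range(dim)]
    combined = []
    for h in hidden:
        if n > 1:
            # Mean over the other n - 1 members (own vector excluded).
            context = [(totals[d] - h[d]) / (n - 1) for d in range(dim)]
        else:
            # A lone person has no group context; use zeros (assumption).
            context = [0.0] * dim
        combined.append(h + context)
    return combined


# Works unchanged for a group of three and a group of two.
trio = pool_others([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
pair = pool_others([[1.0, 2.0], [3.0, 4.0]])
```

Because the context is an average rather than a fixed-length concatenation of all members, the output dimension per person stays the same whatever the group size, so a single downstream predictor could be shared across groups.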

Original language	English
Article number	35
Number of pages	22
Journal	Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume	5
Issue number	1
DOIs	https://doi.org/10.1145/3448122
Publication status	Published - 2021

Keywords

head orientation estimation
interaction dynamics
scene understanding

Access to Document

10.1145/3448122

3448122: Final published version, 7.14 MB, Licence: CC BY

Cite this

@article{7ee7c53f648b4ef3baaa46010a567f4c,

title = "Multimodal Joint Head Orientation Estimation in Interacting Groups via Proxemics and Interaction Dynamics",

abstract = "Human head orientation estimation has been of interest because head orientation serves as a cue to directed social attention. Most existing approaches rely on visual and high-fidelity sensor inputs and deep learning strategies that do not consider the social context of unstructured and crowded mingling scenarios. We show that alternative inputs, like speaking status, body location, orientation, and acceleration contribute towards head orientation estimation. These are especially useful in crowded and in-the-wild settings where visual features are either uninformative due to occlusions or prohibitive to acquire due to physical space limitations and concerns of ecological validity. We argue that head orientation estimation in such social settings needs to account for the physically evolving interaction space formed by all the individuals in the group. To this end, we propose an LSTM-based head orientation estimation method that combines the hidden representations of the group members. Our framework jointly predicts head orientations of all group members and is applicable to groups of different sizes. We explain the contribution of different modalities to model performance in head orientation estimation. The proposed model outperforms baseline methods that do not explicitly consider the group context, and generalizes to an unseen dataset from a different social event.",

keywords = "head orientation estimation, interaction dynamics, scene understanding",

author = "Stephanie Tan and Tax, {David M.J.} and Hayley Hung",

year = "2021",

doi = "10.1145/3448122",

language = "English",

volume = "5",

journal = "Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies",

issn = "2474-9567",

publisher = "Association for Computing Machinery (ACM)",

number = "1",

}

TY - JOUR

T1 - Multimodal Joint Head Orientation Estimation in Interacting Groups via Proxemics and Interaction Dynamics

AU - Tan, Stephanie

AU - Tax, David M.J.

AU - Hung, Hayley

PY - 2021

Y1 - 2021

N2 - Human head orientation estimation has been of interest because head orientation serves as a cue to directed social attention. Most existing approaches rely on visual and high-fidelity sensor inputs and deep learning strategies that do not consider the social context of unstructured and crowded mingling scenarios. We show that alternative inputs, like speaking status, body location, orientation, and acceleration contribute towards head orientation estimation. These are especially useful in crowded and in-the-wild settings where visual features are either uninformative due to occlusions or prohibitive to acquire due to physical space limitations and concerns of ecological validity. We argue that head orientation estimation in such social settings needs to account for the physically evolving interaction space formed by all the individuals in the group. To this end, we propose an LSTM-based head orientation estimation method that combines the hidden representations of the group members. Our framework jointly predicts head orientations of all group members and is applicable to groups of different sizes. We explain the contribution of different modalities to model performance in head orientation estimation. The proposed model outperforms baseline methods that do not explicitly consider the group context, and generalizes to an unseen dataset from a different social event.

KW - head orientation estimation

KW - interaction dynamics

KW - scene understanding

UR - http://www.scopus.com/inward/record.url?scp=85103675109&partnerID=8YFLogxK

U2 - 10.1145/3448122

DO - 10.1145/3448122

M3 - Article

AN - SCOPUS:85103675109

SN - 2474-9567

VL - 5

JO - Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies

JF - Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies

IS - 1

M1 - 35

ER -
