Detecting conversational groups in images and sequences: A robust game-theoretic approach

Sebastiano Vascon; Eyasu Z. Mequanint; Marco Cristani; HS Hung; Marcello Pelillo; Vittorio Murino

doi:10.1016/j.cviu.2015.09.012

Pattern Recognition and Bioinformatics

Research output: Contribution to journal › Article › Scientific › peer-review

47 Citations (Scopus)

Abstract

Detecting groups is attracting growing interest as an important step toward scene (and especially activity) understanding. Contrary to what is commonly assumed in the computer vision community, different types of groups exist, and among these, standing conversational groups (a.k.a. F-formations) play an important role. An F-formation is a common type of gathering that occurs when two or more persons sustain a social interaction, such as a chat at a cocktail party. Detecting and subsequently classifying such an interaction in images or videos is of considerable importance in many application contexts, such as surveillance, social signal processing, social robotics, or activity classification. This paper presents a principled approach to this problem, grounded in the socio-psychological concept of an F-formation. More specifically, a game-theoretic framework is proposed that models the spatial structure characterizing F-formations. Because F-formations impose geometric constraints on how people must be mutually located and oriented, the proposed solution accounts for these constraints while also statistically modeling the uncertainty associated with the position and orientation of the engaged persons. Moreover, when video data are available, it can integrate temporal information over multiple frames using recent notions from multi-payoff evolutionary game theory. Experiments on several benchmark datasets consistently show the superiority of the proposed approach over the state of the art, as well as its robustness under severe noise conditions.
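
The abstract names a game-theoretic, evolutionary framework but does not spell out the algorithm. As a rough illustration only, the sketch below runs discrete-time replicator dynamics, a standard tool from evolutionary game theory, on a pairwise affinity matrix to pull out one tightly connected group. It is not the authors' implementation: the function name `replicator_dynamics`, the threshold, and the toy affinity values are all assumptions made for this example, and the way affinities would be derived from positions and orientations is not shown.

```python
import numpy as np

def replicator_dynamics(payoff, max_iter=1000, tol=1e-8):
    """Discrete-time replicator dynamics on a non-negative payoff matrix.

    Generic evolutionary-game-theory update, used here as an illustration;
    the paper's actual procedure is not described in the abstract.
    """
    n = payoff.shape[0]
    x = np.full(n, 1.0 / n)              # start from the uniform mixed strategy
    for _ in range(max_iter):
        fitness = payoff @ x             # expected payoff of each pure strategy
        average = x @ fitness            # population-average payoff
        if average <= 0:                 # degenerate case: no positive affinity
            break
        new_x = x * fitness / average    # strategies above average grow
        converged = np.abs(new_x - x).sum() < tol
        x = new_x
        if converged:
            break
    return x

# Hypothetical affinities for five people: 0, 1, 2 are strongly related,
# 3 and 4 are moderately related, cross-group affinities are weak.
payoff = np.array([
    [0.00, 0.90, 0.80, 0.10, 0.00],
    [0.90, 0.00, 0.85, 0.05, 0.10],
    [0.80, 0.85, 0.00, 0.10, 0.05],
    [0.10, 0.05, 0.10, 0.00, 0.60],
    [0.00, 0.10, 0.05, 0.60, 0.00],
])

x = replicator_dynamics(payoff)
group = np.flatnonzero(x > 1e-4)         # members with non-negligible weight
print(group)
```

On this toy matrix the weight concentrates on persons 0, 1, and 2. Further groups could be found by removing the extracted members and repeating, which is one common design choice rather than something the abstract states.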

Original language	English
Pages (from-to)	11-24
Number of pages	14
Journal	Computer Vision and Image Understanding
Volume	143
DOIs	https://doi.org/10.1016/j.cviu.2015.09.012
Publication status	Published - 2016

Cite this

@article{35150f50c8db48e89cde302f3352eb7a,
title = "Detecting conversational groups in images and sequences: A robust game-theoretic approach",
abstract = "Detecting groups is becoming of relevant interest as an important step for scene (and especially activity) understanding. Differently from what is commonly assumed in the computer vision community, different types of groups do exist, and among these, standing conversational groups (a.k.a. F-formations) play an important role. An F-formation is a common type of people aggregation occurring when two or more persons sustain a social interaction, such as a chat at a cocktail party. Indeed, detecting and subsequently classifying such an interaction in images or videos is of considerable importance in many applicative contexts, like surveillance, social signal processing, social robotics or activity classification, to name a few. This paper presents a principled method to approach to this problem grounded upon the socio-psychological concept of an F-formation. More specifically, a game-theoretic framework is proposed, aimed at modeling the spatial structure characterizing F-formations. In other words, since F-formations are subject to geometrical configurations on how humans have to be mutually located and oriented, the proposed solution is able to account for these constraints while also statistically modeling the uncertainty associated with the position and orientation of the engaged persons. Moreover, taking advantage of video data, it is also able to integrate temporal information over multiple frames utilizing the recent notions from multi-payoff evolutionary game theory. The experiments have been performed on several benchmark datasets, consistently showing the superiority of the proposed approach over the state of the art, and its robustness under severe noise conditions.",
author = "Sebastiano Vascon and Mequanint, {Eyasu Z.} and Marco Cristani and HS Hung and Marcello Pelillo and Vittorio Murino",
year = "2016",
doi = "10.1016/j.cviu.2015.09.012",
language = "English",
volume = "143",
pages = "11--24",
journal = "Computer Vision and Image Understanding",
issn = "1077-3142",
publisher = "Academic Press",
}

TY  - JOUR
T1  - Detecting conversational groups in images and sequences
T2  - A robust game-theoretic approach
AU  - Vascon, Sebastiano
AU  - Mequanint, Eyasu Z.
AU  - Cristani, Marco
AU  - Hung, HS
AU  - Pelillo, Marcello
AU  - Murino, Vittorio
PY  - 2016
Y1  - 2016
N2  - Detecting groups is becoming of relevant interest as an important step for scene (and especially activity) understanding. Differently from what is commonly assumed in the computer vision community, different types of groups do exist, and among these, standing conversational groups (a.k.a. F-formations) play an important role. An F-formation is a common type of people aggregation occurring when two or more persons sustain a social interaction, such as a chat at a cocktail party. Indeed, detecting and subsequently classifying such an interaction in images or videos is of considerable importance in many applicative contexts, like surveillance, social signal processing, social robotics or activity classification, to name a few. This paper presents a principled method to approach to this problem grounded upon the socio-psychological concept of an F-formation. More specifically, a game-theoretic framework is proposed, aimed at modeling the spatial structure characterizing F-formations. In other words, since F-formations are subject to geometrical configurations on how humans have to be mutually located and oriented, the proposed solution is able to account for these constraints while also statistically modeling the uncertainty associated with the position and orientation of the engaged persons. Moreover, taking advantage of video data, it is also able to integrate temporal information over multiple frames utilizing the recent notions from multi-payoff evolutionary game theory. The experiments have been performed on several benchmark datasets, consistently showing the superiority of the proposed approach over the state of the art, and its robustness under severe noise conditions.
AB  - Detecting groups is becoming of relevant interest as an important step for scene (and especially activity) understanding. Differently from what is commonly assumed in the computer vision community, different types of groups do exist, and among these, standing conversational groups (a.k.a. F-formations) play an important role. An F-formation is a common type of people aggregation occurring when two or more persons sustain a social interaction, such as a chat at a cocktail party. Indeed, detecting and subsequently classifying such an interaction in images or videos is of considerable importance in many applicative contexts, like surveillance, social signal processing, social robotics or activity classification, to name a few. This paper presents a principled method to approach to this problem grounded upon the socio-psychological concept of an F-formation. More specifically, a game-theoretic framework is proposed, aimed at modeling the spatial structure characterizing F-formations. In other words, since F-formations are subject to geometrical configurations on how humans have to be mutually located and oriented, the proposed solution is able to account for these constraints while also statistically modeling the uncertainty associated with the position and orientation of the engaged persons. Moreover, taking advantage of video data, it is also able to integrate temporal information over multiple frames utilizing the recent notions from multi-payoff evolutionary game theory. The experiments have been performed on several benchmark datasets, consistently showing the superiority of the proposed approach over the state of the art, and its robustness under severe noise conditions.
U2  - 10.1016/j.cviu.2015.09.012
DO  - 10.1016/j.cviu.2015.09.012
M3  - Article
SN  - 1077-3142
VL  - 143
SP  - 11
EP  - 24
JO  - Computer Vision and Image Understanding
JF  - Computer Vision and Image Understanding
ER  - 
