TY - GEN
T1 - Modeling naturalistic affective states via facial, vocal, and bodily expressions recognition
AU - Karpouzis, Kostas
AU - Caridakis, George
AU - Kessous, Loic
AU - Amir, Noam
AU - Raouzaiou, Amaryllis
AU - Malatesta, Lori
AU - Kollias, Stefanos
PY - 2007
Y1 - 2007
N2 - Affective and human-centered computing have attracted considerable attention in recent years, mainly due to the abundance of devices and environments able to exploit multimodal input from users and adapt their functionality to users' preferences or individual habits. In the quest to receive feedback from users in an unobtrusive manner, combining facial and hand gestures with prosody information allows us to infer a user's emotional state, relying on the best-performing modality when another modality suffers from noise or poor sensing conditions. In this paper, we describe a multi-cue, dynamic approach to detecting emotion in naturalistic video sequences. In contrast to strictly controlled recordings of audiovisual material, the proposed approach focuses on sequences taken from nearly real-world situations. Recognition is performed via a 'Simple Recurrent Network', which lends itself well to modeling dynamic events in both the user's facial expressions and speech. Moreover, this approach differs from existing work in that it models user expressivity using a dimensional representation of activation and valence, instead of detecting discrete 'universal emotions', which are scarce in everyday human-machine interaction. The algorithm is applied to an audiovisual database recorded to simulate human-human discourse, which therefore contains less extreme expressivity and subtle variations across a number of emotion labels.
KW - Affective interaction
KW - Facial expressions
KW - Hand gestures
KW - Multimodal analysis
KW - Neural networks
KW - Prosody
UR - http://www.scopus.com/inward/record.url?scp=49949087271&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-72348-6_5
DO - 10.1007/978-3-540-72348-6_5
M3 - Conference contribution
AN - SCOPUS:49949087271
SN - 3540723463
SN - 9783540723462
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 91
EP - 112
BT - Artificial Intelligence for Human Computing, ICMI 2006 and IJCAI 2007 International Workshops, Banff, Canada, November 3, 2006 and Hyderabad, India, January 6, 2007, Revised Selected and Invited Papers
T2 - 20th International Joint Conference on Artificial Intelligence, IJCAI 2007 - Workshop on Artificial Intelligence for Human Computing
Y2 - 6 January 2007 through 6 January 2007
ER -