Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations: Exploring Models, Features and Their Interaction

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations : Exploring Models, Features and Their Interaction. / Agirrezabal, Manex; Paggio, Patrizia; Navarretta, Costanza; Jongejan, Bart.

Gesture and Speech in Interaction (GESPIN 2023). Nijmegen : Max Planck Institut for Psycholinguistics, 2023.

Harvard

Agirrezabal, M, Paggio, P, Navarretta, C & Jongejan, B 2023, Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations: Exploring Models, Features and Their Interaction. in Gesture and Speech in Interaction (GESPIN 2023). Max Planck Institut for Psycholinguistics, Nijmegen.

APA

Agirrezabal, M., Paggio, P., Navarretta, C., & Jongejan, B. (2023). Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations: Exploring Models, Features and Their Interaction. In Gesture and Speech in Interaction (GESPIN 2023). Max Planck Institut for Psycholinguistics.

Vancouver

Agirrezabal M, Paggio P, Navarretta C, Jongejan B. Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations: Exploring Models, Features and Their Interaction. In Gesture and Speech in Interaction (GESPIN 2023). Nijmegen: Max Planck Institut for Psycholinguistics. 2023

Author

Agirrezabal, Manex ; Paggio, Patrizia ; Navarretta, Costanza ; Jongejan, Bart. / Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations : Exploring Models, Features and Their Interaction. Gesture and Speech in Interaction (GESPIN 2023). Nijmegen : Max Planck Institut for Psycholinguistics, 2023.

Bibtex

@inproceedings{8a9377ce052142f7abe337dd791ea7e2,
title = "Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations: Exploring Models, Features and Their Interaction",
abstract = "In this work we perform multimodal detection and classification of head movements from face-to-face video conversation data. We have experimented with different models and feature sets and provided some insight on the effect of independent features, but also how their interaction can enhance a head movement classifier. Used features include nose, neck and mid hip position coordinates and their derivatives together with acoustic features, namely, intensity and pitch of the speaker on focus. Results show that when input features are sufficiently processed by interacting with each other, a linear classifier can reach a similar performance to a more complex non-linear neural model with several hidden layers. Our best models achieve state-of-the-art performance in the detection task, measured by macro-averaged F1 score.",
author = "Manex Agirrezabal and Patrizia Paggio and Costanza Navarretta and Bart Jongejan",
year = "2023",
language = "English",
booktitle = "Gesture and Speech in Interaction (GESPIN 2023)",
publisher = "Max Planck Institut for Psycholinguistics",

}

RIS

TY - GEN

T1 - Multimodal Detection and Classification of Head Movements in Face-to-Face Conversations

T2 - Exploring Models, Features and Their Interaction

AU - Agirrezabal, Manex

AU - Paggio, Patrizia

AU - Navarretta, Costanza

AU - Jongejan, Bart

PY - 2023

Y1 - 2023

N2 - In this work we perform multimodal detection and classification of head movements from face-to-face video conversation data. We have experimented with different models and feature sets and provided some insight on the effect of independent features, but also how their interaction can enhance a head movement classifier. Used features include nose, neck and mid hip position coordinates and their derivatives together with acoustic features, namely, intensity and pitch of the speaker on focus. Results show that when input features are sufficiently processed by interacting with each other, a linear classifier can reach a similar performance to a more complex non-linear neural model with several hidden layers. Our best models achieve state-of-the-art performance in the detection task, measured by macro-averaged F1 score.

AB - In this work we perform multimodal detection and classification of head movements from face-to-face video conversation data. We have experimented with different models and feature sets and provided some insight on the effect of independent features, but also how their interaction can enhance a head movement classifier. Used features include nose, neck and mid hip position coordinates and their derivatives together with acoustic features, namely, intensity and pitch of the speaker on focus. Results show that when input features are sufficiently processed by interacting with each other, a linear classifier can reach a similar performance to a more complex non-linear neural model with several hidden layers. Our best models achieve state-of-the-art performance in the detection task, measured by macro-averaged F1 score.

M3 - Article in proceedings

BT - Gesture and Speech in Interaction (GESPIN 2023)

PB - Max Planck Institut for Psycholinguistics

CY - Nijmegen

ER -

ID: 374969032
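The abstract's central observation, that a linear classifier can match a deeper non-linear model once input features are allowed to interact, can be illustrated with a minimal sketch. This is not the authors' code: the toy data, the label rule, and the scikit-learn pipeline are all assumptions standing in for the paper's actual motion and acoustic features.

```python
# Illustrative sketch: a linear classifier over explicit feature
# interactions vs. the same classifier on raw features alone.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-ins for the paper's inputs (e.g. nose/neck position deltas
# and an acoustic feature). The label depends on a feature *interaction*.
X = rng.normal(size=(500, 3))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # separable only via the x0*x1 product

# Linear model on raw features: no single linear boundary captures x0*x1.
raw_acc = LogisticRegression(max_iter=1000).fit(X, y).score(X, y)

# Same linear model after expanding pairwise interaction terms (x_i * x_j).
expand = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
Xi = expand.fit_transform(X)
int_acc = LogisticRegression(max_iter=1000).fit(Xi, y).score(Xi, y)

print(f"raw features: {raw_acc:.2f}  with interactions: {int_acc:.2f}")
```

The expanded model recovers the interaction-dependent labels with a purely linear decision rule, which is the mechanism the abstract appeals to when comparing the linear classifier against the multi-layer neural model.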