GEstures and Head Movements in language (GEHM)
The GEHM network will support cooperation among eight leading research groups working in the area of gesture and language, and thereby foster new theoretical insights into the way hand gestures and head movements interact with speech in face-to-face multimodal communication.
The network focuses on three research strands:
- language-specific characteristics of gesture-speech interaction
- multimodal prominence
- multimodal behaviour modelling
The first research strand, on language-specific characteristics of gesture-speech interaction, will work towards a theory that can account for how speakers’ ability to process and produce gesture and speech is shaped by their language profile. Speech-gesture profiles of monolingual and bilingual speakers’ production will be established by combining audio, video and sensor output from motion capture. These rich multimodal data will provide fine-grained information about cross-linguistic differences in native and non-native speech-gesture coordination.
The second research strand, on multimodal prominence, investigates the theoretical question of how linguistic prominence is expressed through combinations of kinematic and prosodic features. It is not yet well understood how gestures and pitch accents combine to create different types of multimodal prominence, or how visual prominence cues specifically are used in spoken communication. Datasets will be created and analysed in this strand to arrive at a fine-grained and well-documented theory of multimodal prominence.
The third research strand, on modelling multimodal behaviour, aims at conceptual and statistical modelling of multimodal contributions, with particular regard to head movements and the use of gaze. This strand will develop models of multimodal behaviour based on the datasets developed in the previous two strands, and will also take advantage of existing corpora, including interaction data in which eye gaze has been tracked.
A combination of different methods and empirical materials will be used.
In strand 1, rich multimodal corpora will be produced by combining audio, video and motion capture, and these data will be used to create experimental platforms for testing multimodal comprehension by native, non-native and bilingual users.
In strands 2 and 3, different types of multimodal speech data will be analysed: read television news, unrestricted spontaneous speech, and controlled experimental recordings in which motion capture and electromagnetic articulography are used to record movements of the hands, head, eyebrows and articulators (lips, tongue, jaw). Gaze patterns will be studied by means of eye-tracking. Experiments will also be conducted on extracting visual and acoustic features from video-recorded data in order to develop automatic models of multimodal behaviour using machine learning techniques.
Ambrazaitis, G., Frid, J. and House, D. (2020). Word prominence ratings in Swedish television news readings: effects of pitch accents and head movements. In Proceedings of Speech Prosody 2020, online https://sp2020.jpn.org/
McLaren, L., Koutsombogera, M. and Vogel, C. (2020) A Heuristic Method for Automatic Gaze Detection in Constrained Multi-Modal Dialogue Corpora. In Proceedings of the 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), Mariehamn, Finland, 2020, pp. 55-60, doi: 10.1109/CogInfoCom50765.2020.9237883.
McLaren, L., Koutsombogera, M. and Vogel, C. (2020) Gaze, Dominance and Dialogue Role in the MULTISIMO Corpus. In Proceedings of the 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), Mariehamn, Finland, 2020, pp. 83-88, doi: 10.1109/CogInfoCom50765.2020.9237833.
Paggio, P., Agirrezabal, M., Jongejan, B. and Navarretta, C. (2020). Automatic Detection and Classification of Head Movements in Face-to-Face Conversations. In Proceedings of ONION 2020: Workshop on peOple in laNguage, vIsiOn and the miNd, pages 15–21, Language Resources and Evaluation Conference (LREC 2020), Marseille, 11–16 May 2020. https://www.aclweb.org/anthology/2020.onion-1.3.pdf
Prieto, P., and Espinal, M.T. (2020). "Prosody, Gesture, and Negation". The Oxford Handbook of Negation, ed. by V. Deprez and M. Teresa Espinal. Oxford: Oxford University Press, pp. 667-693.
Vilà-Giménez, I., and Prieto, P. (in press, 2020). "Encouraging kids to beat: Children's beat gesture production boosts their narrative performance." Developmental Science. First online: https://doi.org/10.1111/desc.12967
Vogel, Carl and Anna Esposito, "Interaction Analysis and Cognitive Infocommunications", Infocommunications Journal, Vol. XII, No 1, March 2020, pp. 2-9. DOI: 10.36244/ICJ.2020.1.1
Zhang, Y., Baills, F., and Prieto, P. (in press). "Hand-clapping to the rhythm of newly learned words improves L2 pronunciation: Evidence from training Chinese adolescents with French words". Language Teaching Research. First online: https://doi.org/10.1177/1362168818806531
Cravotta, A., Busà, M. G., and Prieto, P. (2019). "Effects of Encouraging the Use of Gestures on Speech". Journal of Speech, Language, and Hearing Research, 62, 3204-3219.
Hübscher, I., and Prieto, P. (2019). "Gestural and prosodic development act as sister systems and jointly pave the way for children’s sociopragmatic development". Frontiers in Psychology, 10:1259.
Vilà-Giménez, I., Igualada, A., and Prieto, P. (2019). "Observing storytellers who use rhythmic beat gestures improves children’s narrative discourse performance". Developmental Psychology, 55(2), 250-262. A video abstract of this article can be viewed at: https://www.youtube.com/watch?v=t-EKJIQt20g.
Vogel, Carl and Anna Esposito, "Linguistic and Behaviour Interaction Analysis within Cognitive Infocommunications", 10th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), Naples, Italy, 23-25 Oct. 2019, pp. 47-52. DOI: https://doi.org/10.1109/CogInfoCom47531.2019.9089904
Yuan, C., González-Fuente, S., Baills, F., and Prieto, P. (2019). "Observing pitch gestures favors the learning of Spanish intonation by Mandarin speakers". Studies in Second Language Acquisition, 41(1), 5-32.
PI: Patrizia Paggio
The MultiModal MultiDimensional (M3D) labeling system
The M3D labeling system represents a joint effort between three gesture labs, one of which is a GEHM lab (the Prosodic Studies Group at Universitat Pompeu Fabra).
2nd International Workshop on Language Acquisition
The Second International Workshop on Language Acquisition, University of Copenhagen, Denmark.
Other network members
Department of Linguistics and Phonetics at Kiel University:
Division of Speech, Music and Hearing at KTH Royal Institute of Technology:
MIDI group at KU Leuven:
Centre for IMS at Linnaeus University:
Centre for Languages and Literature and Lund University Humanities Lab at Lund University:
Computational Linguistics Group at Trinity College Dublin:
GrEP at Universitat Pompeu Fabra:
University of Malta, Institute of Linguistics and Language Technology: