Automatic recognition of the function of singular neuter pronouns in texts and spoken data

Publication: Contribution to journal, Conference article, Research, peer-reviewed

We describe the results of unsupervised (clustering) and supervised (classification) learning experiments aimed at recognising the function of singular neuter pronouns in Danish corpora of written and spoken language. Danish singular neuter pronouns comprise personal and demonstrative pronouns. They are very frequent and have many functions, including non-referential, cataphoric, deictic and anaphoric uses. The antecedents of discourse-anaphoric singular neuter pronouns can be nominal phrases of different gender and number, verbal phrases, adjectival phrases, clauses or discourse segments of different sizes, and the pronouns can refer to individual and abstract entities. Danish neuter pronouns occur in more constructions than the corresponding English pronouns it, this and that, and have different distributions. The results of the classification experiments show a significant improvement in performance over the baseline for all types of data. The best results were obtained on text data, while the worst were obtained on free-conversational, multi-party dialogues.

Original language: English
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages (from-to): 15-28
Number of pages: 14
ISSN: 0302-9743
DOI
Status: Published - 2009
Event: 7th Discourse Anaphora and Anaphor Resolution Colloquium, DAARC 2009 - Goa, India
Duration: 5 Nov 2009 to 6 Nov 2009

Conference

Conference: 7th Discourse Anaphora and Anaphor Resolution Colloquium, DAARC 2009
Country: India
City: Goa
Period: 05/11/2009 to 06/11/2009

ID: 273031295