Building Sense Representations in Danish by Combining Word Embeddings with Lexical Resources

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › Peer-reviewed

Our aim is to identify suitable sense representations for NLP in Danish. We investigate sense inventories that correlate with human interpretations of word meaning and ambiguity, as typically described in dictionaries and wordnets, and that are well reflected distributionally, as expressed in word embeddings. To this end, we study a number of highly ambiguous Danish nouns and examine the effectiveness of sense representations constructed by combining vectors from a distributional model with information from a wordnet. We establish representations based on centroids obtained from wordnet synsets and example sentences, as well as representations established via a clustering approach; these representations are tested in a word sense disambiguation task. We conclude that the more information is extracted from the wordnet entries (example sentence, definition, semantic relations), the more successful the sense representation vector is.
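The centroid-based representations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the sense labels `blad_leaf` and `blad_magazine`, the word bags, and the function names are all hypothetical, chosen only to show the general idea of averaging word vectors per sense and picking the sense whose centroid is closest (by cosine similarity) to the context.

```python
import numpy as np

# Hypothetical toy embeddings; a real setup would load vectors from a
# trained Danish distributional model instead.
EMBEDDINGS = {
    "træ":        np.array([0.9, 0.1, 0.0]),  # tree
    "grøn":       np.array([0.8, 0.2, 0.1]),  # green
    "gren":       np.array([1.0, 0.0, 0.1]),  # branch
    "efterår":    np.array([0.7, 0.1, 0.3]),  # autumn
    "avis":       np.array([0.1, 0.9, 0.0]),  # newspaper
    "læse":       np.array([0.0, 1.0, 0.2]),  # read
    "artikel":    np.array([0.2, 0.8, 0.1]),  # article
    "journalist": np.array([0.1, 0.7, 0.3]),  # journalist
}

# Hypothetical word bags per sense of the noun "blad", standing in for
# words drawn from synset members, definitions, or example sentences.
SENSE_BAGS = {
    "blad_leaf":     ["træ", "grøn", "gren"],
    "blad_magazine": ["avis", "læse", "artikel"],
}


def centroid(words):
    """Average the vectors of all words that have an embedding."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words to average")
    return np.mean(vectors, axis=0)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def disambiguate(context_words, sense_centroids):
    """Return the sense whose centroid is most similar to the context."""
    ctx = centroid(context_words)
    return max(sense_centroids, key=lambda s: cosine(ctx, sense_centroids[s]))


# One centroid vector per sense.
SENSE_CENTROIDS = {s: centroid(ws) for s, ws in SENSE_BAGS.items()}

print(disambiguate(["efterår", "gren"], SENSE_CENTROIDS))
print(disambiguate(["journalist", "læse"], SENSE_CENTROIDS))
```

Under these assumptions, adding more words to a sense's bag (for instance from definitions or related synsets) changes the centroid, which is the kind of variation the abstract reports evaluating.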
Original language: English
Title: Globalex Workshop on Linked Lexicography : LREC 2020 Workshop Language Resources and Evaluation Conference
Number of pages: 7
Place of publication: Marseille, France
Publisher: European Language Resources Association
Publication date: 2020
Pages: 45-52
ISBN (Electronic): 979-10-95546-46-7
Status: Published - 2020

ID: 241359613