Real-valued syntactic word vectors

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

We introduce a word embedding method that generates a set of real-valued word vectors from a distributional semantic space. The semantic space is built from a set of context units (words) that are selected by an entropy-based feature selection approach with respect to the certainty involved in their contextual environments. We show that the most predictive context of a target word is its preceding word. An adaptive transformation function is also introduced that reshapes the data distribution to make it suitable for dimensionality reduction techniques. The final low-dimensional word vectors are formed by the singular vectors of a matrix of transformed data. We show that the resulting word vectors perform on par with word vectors generated by popular word embedding methods.
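The sketch below illustrates the general shape of the pipeline described above, not the paper's actual method: the toy corpus, the median-entropy cutoff and the log(1 + x) transform are assumptions standing in for the paper's entropy-based selection criterion and adaptive transformation function, which the abstract does not specify.

import numpy as np

# Toy corpus; the real method is trained on much larger text collections.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Co-occurrence counts using the preceding word as the context unit,
# following the abstract's observation that the preceding word is the
# most predictive context of a target word.
C = np.zeros((V, V))
for prev, cur in zip(corpus, corpus[1:]):
    C[idx[cur], idx[prev]] += 1

# Entropy of each context column: lower entropy suggests a more
# "certain" contextual environment for that context word.
probs = C / np.maximum(C.sum(axis=0, keepdims=True), 1)
with np.errstate(divide="ignore", invalid="ignore"):
    ent = np.where(probs > 0, -probs * np.log2(probs), 0.0).sum(axis=0)

# Assumed stand-in for the entropy-based feature selection:
# keep only the contexts whose entropy is at or below the median.
keep = ent <= np.median(ent)
X = C[:, keep]

# Assumed stand-in for the adaptive transformation function:
# a log(1 + x) reshaping of the counts before dimensionality reduction.
X = np.log1p(X)

# Low-dimensional word vectors from the left singular vectors of the
# transformed matrix.
k = 3
U, S, _ = np.linalg.svd(X, full_matrices=False)
word_vectors = U[:, :k] * S[:k]
print({w: word_vectors[idx[w]].round(2) for w in vocab})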

Original language: English
Journal: Journal of Experimental and Theoretical Artificial Intelligence
Volume: 32
Issue number: 4
Pages (from-to): 557-579
Number of pages: 23
ISSN: 0952-813X
DOI
Status: Published - 3 Jul 2020
Externally published: Yes

Bibliographic note

Publisher Copyright:
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
