Real-valued syntactic word vectors
Publication: Contribution to journal › Journal article › Research › peer-reviewed
We introduce a word embedding method that generates a set of real-valued word vectors from a distributional semantic space. The semantic space is built with a set of context units (words) which are selected by an entropy-based feature selection approach with respect to the certainty involved in their contextual environments. We show that the most predictive context of a target word is its preceding word. An adaptive transformation function is also introduced that reshapes the data distribution to make it suitable for dimensionality reduction techniques. The final low-dimensional word vectors are formed by the singular vectors of a matrix of transformed data. We show that the resulting word vectors are as good as other sets of word vectors generated with popular word embedding methods.
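The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the function name `build_vectors` and its parameters are hypothetical, the entropy criterion is a simple entropy of the words that follow each candidate context word, and a fixed power transform stands in for the adaptive transformation function, whose form the abstract does not specify.

```python
import numpy as np

def build_vectors(tokens, n_contexts=3, dim=2, power=0.5):
    """Toy pipeline: preceding-word counts -> entropy-based context
    selection -> power transform -> SVD. All choices are illustrative."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)

    # counts[t, c]: how often context word c immediately precedes target t.
    counts = np.zeros((size, size))
    for prev, cur in zip(tokens, tokens[1:]):
        counts[index[cur], index[prev]] += 1

    # Assumed selection criterion: entropy of the distribution of targets
    # that follow each context word; lower entropy = more certain context.
    totals = counts.sum(axis=0)
    entropy = np.full(size, np.inf)  # unseen contexts are ranked last
    for c in range(size):
        if totals[c] > 0:
            p = counts[:, c] / totals[c]
            p = p[p > 0]
            entropy[c] = -(p * np.log2(p)).sum()
    selected = np.argsort(entropy, kind="stable")[:n_contexts]

    # Fixed power transform as a stand-in for the adaptive transformation.
    transformed = counts[:, selected] ** power

    # Low-dimensional word vectors taken from the left singular vectors.
    u, s, _ = np.linalg.svd(transformed, full_matrices=False)
    k = min(dim, len(s))
    return vocab, u[:, :k], [vocab[c] for c in selected]

tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, vectors, contexts = build_vectors(tokens)
print(contexts)
print(vectors.shape)
```

On a realistic corpus the count matrix would be sparse, and a truncated SVD routine would be a more practical choice than the dense `np.linalg.svd` used here.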
Original language | English |
---|---|
Journal | Journal of Experimental and Theoretical Artificial Intelligence |
Volume | 32 |
Issue number | 4 |
Pages (from-to) | 557-579 |
Number of pages | 23 |
ISSN | 0952-813X |
DOI | |
Status | Published - 3 Jul 2020 |
Externally published | Yes |
Bibliographical note
Publisher Copyright:
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
ID: 366046134