Real-valued syntactic word vectors

Institut for Nordiske Studier og Sprogvidenskab

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Basirat, Ali
J. Nivre

We introduce a word embedding method that generates a set of real-valued word vectors from a distributional semantic space. The semantic space is built with a set of context units (words) which are selected by an entropy-based feature selection approach with respect to the certainty involved in their contextual environments. We show that the most predictive context of a target word is its preceding word. An adaptive transformation function is also introduced that reshapes the data distribution to make it suitable for dimensionality reduction techniques. The final low-dimensional word vectors are formed by the singular vectors of a matrix of transformed data. We show that the resulting word vectors are as good as other sets of word vectors generated with popular word embedding methods.
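The abstract's claim about predictive contexts can be illustrated with a small calculation. The following is a minimal sketch, not the authors' implementation: it assumes that "certainty" can be approximated by the conditional entropy of a target word given a single context word, and it compares the preceding-word and following-word contexts on a toy sentence. The helper names `conditional_entropy` and `context_pairs` and the example corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(pairs):
    """Return H(target | context) in bits for a list of (context, target) pairs."""
    pairs = list(pairs)
    total = len(pairs)
    by_context = defaultdict(Counter)
    for context, target in pairs:
        by_context[context][target] += 1
    entropy = 0.0
    for targets in by_context.values():
        n = sum(targets.values())  # how often this context word occurs
        for count in targets.values():
            p = count / n  # P(target | context)
            entropy -= (n / total) * p * math.log2(p)
    return entropy


def context_pairs(tokens, offset):
    """Pair each target word with the word at relative position `offset`."""
    return [
        (tokens[i + offset], tokens[i])
        for i in range(len(tokens))
        if 0 <= i + offset < len(tokens)
    ]


# Toy corpus (an assumption for illustration only).
tokens = "the cat sat on the mat and the dog sat on the rug".split()

# Lower conditional entropy means the context is more predictive of the target.
h_prev = conditional_entropy(context_pairs(tokens, -1))  # preceding word
h_next = conditional_entropy(context_pairs(tokens, +1))  # following word
print(f"H(target | previous word) = {h_prev:.3f} bits")
print(f"H(target | next word)     = {h_next:.3f} bits")
```

On a corpus this small the two values say little; the comparison only becomes meaningful on a large corpus, and the paper's actual selection criterion may differ from this approximation.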

Original language	English
Journal	Journal of Experimental and Theoretical Artificial Intelligence
Volume	32
Issue number	4
Pages (from-to)	557-579
Number of pages	23
ISSN	0952-813X
DOI	https://doi.org/10.1080/0952813X.2019.1653385
Status	Published - 3 Jul 2020
Externally published	Yes

ID: 366046134

Center for Sprogteknologi
