How Well can We Learn Interpretable Entity Types from Text?

Publication: Contribution to book/anthology/report > Conference contribution in proceedings > Research > Peer-reviewed

  • Dirk Hovy
We investigate a largely unsupervised approach to learning interpretable, domain-specific entity types from unlabeled text. The approach assumes that any common noun in a domain can function as a potential entity type, and uses those nouns as hidden variables in an HMM. To constrain training, it extracts co-occurrence dictionaries of entities and common nouns from the data. We evaluate the learned types by measuring their prediction accuracy for verb arguments in several domains. The results suggest that it is possible to learn domain-specific entity types from unlabeled data. We show significant improvements over an informed baseline, reducing the error rate by 56%.
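As a rough illustration of the idea in the abstract, and not the paper's actual model or code, the sketch below shows how a co-occurrence dictionary can restrict which hidden states (candidate types) an HMM considers for each entity. The abstract says the dictionaries constrain training; for brevity this sketch applies the same kind of restriction to Viterbi decoding instead. The function name `constrained_viterbi`, the toy dictionary `allowed`, and all probabilities are hypothetical values chosen for the example.

```python
import math


def constrained_viterbi(obs, allowed, start_p, trans_p, emit_p):
    """Viterbi decoding in which each observation may only take the
    hidden states listed for it in `allowed` (a hypothetical dictionary)."""
    all_states = list(start_p)
    # Candidate states per position; tokens missing from the dictionary
    # fall back to the full state set.
    cands = [allowed.get(o, all_states) for o in obs]

    def lp(p):
        # Log-probability that tolerates zeros.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s] = (log-score of the best path ending in s at t, backpointer)
    best = [{
        s: (lp(start_p[s]) + lp(emit_p[s].get(obs[0], 0.0)), None)
        for s in cands[0]
    }]
    for t in range(1, len(obs)):
        layer = {}
        for s in cands[t]:
            prev, score = max(
                ((r, best[t - 1][r][0] + lp(trans_p[r][s])) for r in cands[t - 1]),
                key=lambda pair: pair[1],
            )
            layer[s] = (score + lp(emit_p[s].get(obs[t], 0.0)), prev)
        best.append(layer)

    # Follow backpointers from the best final state.
    state = max(best[-1], key=lambda s: best[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = best[t][state][1]
        path.append(state)
    return path[::-1]


# Toy, made-up parameters: three candidate types and two entities.
start_p = {"team": 0.3, "player": 0.3, "city": 0.4}
trans_p = {
    "team": {"team": 0.2, "player": 0.6, "city": 0.2},
    "player": {"team": 0.5, "player": 0.3, "city": 0.2},
    "city": {"team": 0.3, "player": 0.2, "city": 0.5},
}
emit_p = {
    "team": {"Dallas": 0.8, "Romo": 0.2},
    "player": {"Dallas": 0.1, "Romo": 0.9},
    "city": {"Dallas": 0.9, "Romo": 0.1},
}
# Hypothetical co-occurrence dictionary: nouns seen with each entity.
allowed = {"Dallas": ["team", "city"], "Romo": ["player"]}

path = constrained_viterbi(["Dallas", "Romo"], allowed, start_p, trans_p, emit_p)
print(path)  # ['team', 'player']
```

In this toy run, "city" is the more likely type for "Dallas" in isolation, but the transition into "player" favors "team". The dictionary keeps the search to the few types each entity has actually co-occurred with.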
Original language: English
Title: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Place of publication: Baltimore, Maryland
Publisher: Association for Computational Linguistics
Publication date: 2014
Pages: 482-487
Status: Published - 2014

ID: 107672668