UW Interactive Data Lab
Yea-Seul Kim, Jessica Hullman, Matthew Burgess, Eytan Adar
Abstract
Lexical simplification of scientific terms represents a unique challenge due to the lack of a standard parallel corpus and the fast rate at which vocabulary shifts along with research. We introduce SimpleScience, a lexical simplification approach for scientific terminology. We use word embeddings to extract simplification rules from a parallel corpus containing scientific publications and Wikipedia. To evaluate our system, we construct SimpleSciGold, a novel gold standard set for science-related simplifications. We find that our approach outperforms prior context-aware approaches at generating simplifications for scientific terms.
Citation
Yea-Seul Kim, Jessica Hullman, Matthew Burgess, Eytan Adar
Empirical Methods in Natural Language Processing, 2016