Word Re-Embedding via Manifold Dimensionality Retention
by Souleiman Hasan, Edward Curry
Abstract:
Word embeddings seek to recover a Euclidean metric space by mapping words into vectors, starting from word co-occurrences in a corpus. Word embeddings may underestimate the similarity between nearby words and overestimate it between distant words in the Euclidean metric space. In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning that retains dimensionality. We show that this approach is theoretically founded in the metric recovery paradigm, and show empirically that it can improve on state-of-the-art embeddings in word similarity tasks by 0.5 to 5.0 percentage points, depending on the original space.
Reference:
Souleiman Hasan, Edward Curry, "Word Re-Embedding via Manifold Dimensionality Retention", In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, 2017.
Bibtex Entry:
@inproceedings{Hasan2017,
abstract = {Word embeddings seek to recover a Euclidean metric space by mapping words into vectors, starting from word co-occurrences in a corpus. Word embeddings may underestimate the similarity between nearby words and overestimate it between distant words in the Euclidean metric space. In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning that retains dimensionality. We show that this approach is theoretically founded in the metric recovery paradigm, and show empirically that it can improve on state-of-the-art embeddings in word similarity tasks by 0.5 to 5.0 percentage points, depending on the original space.},
address = {Copenhagen, Denmark},
author = {Hasan, Souleiman and Curry, Edward},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017},
title = {{Word Re-Embedding via Manifold Dimensionality Retention}},
url = {http://www.edwardcurry.org/publications/EMNLP_17.pdf},
year = {2017}
}