Paper Reading: Embedding Words in Non-Vector Space with Unsupervised Graph Learning

venue: EMNLP 2020

This paper proposes a graph-based method for training word embeddings. The graph method is PRODIGE, which learns a representation of the data in the form of a weighted graph G(V, E, w, p). Each edge carries a weight and a Bernoulli random variable indicating whether the edge is present. The distance between two nodes is formulated as the expected shortest path distance: …
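To make the "expected shortest path over a stochastic graph" idea concrete, here is a minimal Monte Carlo sketch, not the paper's actual (differentiable) training procedure: each edge is kept with its Bernoulli probability, a shortest path is computed on the sampled subgraph, and the distance is averaged over samples. The toy graph, probabilities, and the choice to average only over samples where the target is reachable are all illustrative assumptions.

```python
import heapq
import random

# Hypothetical toy graph: (u, v) -> (weight w, presence probability p).
# Values are illustrative, not from the paper.
edges = {
    (0, 1): (1.0, 0.9),
    (1, 2): (2.0, 0.8),
    (0, 2): (4.0, 0.5),
}

def sample_graph(edges, rng):
    """Keep each edge independently with its Bernoulli probability p."""
    adj = {}
    for (u, v), (w, p) in edges.items():
        if rng.random() < p:
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
    return adj

def shortest_path(adj, src, dst):
    """Dijkstra; returns inf if dst is unreachable in this sampled subgraph."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def expected_distance(edges, src, dst, n_samples=1000, seed=0):
    """Monte Carlo estimate of E_G[shortest path](src, dst)."""
    rng = random.Random(seed)
    total, reachable = 0.0, 0
    for _ in range(n_samples):
        d = shortest_path(sample_graph(edges, rng), src, dst)
        if d < float("inf"):
            total += d
            reachable += 1
    return total / max(reachable, 1)

print(expected_distance(edges, 0, 2))
```

In this toy graph the distance from 0 to 2 lands between 3.0 (the two-hop path 0-1-2 is present) and 4.0 (only the direct edge survives), so the estimate interpolates between the two depending on the edge probabilities.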

Paper Reading: Out-of-Vocabulary Embedding Imputation with Grounded Language Information by Graph Convolutional Networks

venue: ACL 2019

The paper proposes a GCN-based method to produce word embeddings for out-of-vocabulary (OOV) words.

1. Graph Construction

To construct the knowledge graph, a vocabulary is built from the English Wikipedia dataset (3B tokens). Note that this vocabulary includes OOV words that are not in the vocabulary of pre-trained embeddings such as GloVe. For each node/word, they use the concatenation of the Wikipedia page summary and …
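To illustrate how graph convolution can impute an embedding for a node that has no pre-trained vector, here is a minimal sketch of one symmetrically normalized GCN propagation step (the standard ReLU(D^{-1/2} (A+I) D^{-1/2} X W) layer), not the paper's exact architecture: the toy graph, the random weight matrix, and the edge list are all assumptions for illustration. In the paper's setting the weights would be trained so that propagated representations match known pre-trained embeddings.

```python
import numpy as np

# Illustrative only: 4 in-vocabulary nodes with known embeddings plus 1 OOV node.
rng = np.random.default_rng(0)
n_nodes, dim = 5, 8
X = rng.normal(size=(n_nodes, dim))
X[4] = 0.0  # OOV node: no pre-trained embedding available

# Adjacency with self-loops, A_hat = A + I. Edges are hypothetical.
A = np.zeros((n_nodes, n_nodes))
for u, v in [(0, 1), (1, 2), (2, 4), (3, 4)]:
    A[u, v] = A[v, u] = 1.0
A_hat = A + np.eye(n_nodes)

# Symmetric normalization: D^{-1/2} A_hat D^{-1/2}
d = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

# One propagation step with an untrained (random) weight matrix W.
W = rng.normal(size=(dim, dim)) / np.sqrt(dim)
H = np.maximum(A_norm @ X @ W, 0.0)  # ReLU(A_norm X W)

print(H[4])  # representation for the OOV node, aggregated from its neighbors
```

Even though the OOV node starts from a zero vector, its output row is filled in from the normalized embeddings of its neighbors, which is the basic mechanism behind GCN-based embedding imputation.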