Paper Reading: To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

Venue: ACL 2020

This paper presents an empirical study of how the performance gap between a pretrained model (RoBERTa) and a vanilla LSTM changes with the number of training samples for text classification. The authors experiment on three text classification datasets with three models: RoBERTa, an LSTM trained from scratch, and an LSTM initialized with pretrained RoBERTa embeddings. They train on different portions of the training data (1%, 10%, 30%, 50%, 70%, 90%) and compare the models' performance as the amount of data grows.
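To make the setup concrete, here is a minimal sketch (my own illustration, not the authors' code) of such a data-ablation protocol: each model is retrained from scratch on a growing fraction of the training set and evaluated on a fixed test set, so the gap between pretrained and non-pretrained models can be plotted against sample size. The `subsample` and `accuracy` helpers and the `trainers` dict are hypothetical names.

```python
import random

# The training fractions used in the paper's ablation.
FRACTIONS = [0.01, 0.10, 0.30, 0.50, 0.70, 0.90]

def subsample(train_set, fraction, seed=0):
    """Draw a random fraction of the training examples (same seed across models)."""
    rng = random.Random(seed)
    k = max(1, int(len(train_set) * fraction))
    return rng.sample(train_set, k)

def accuracy(model, test_set):
    """test_set: list of (text, label) pairs; model: callable text -> label."""
    correct = sum(model(text) == label for text, label in test_set)
    return correct / len(test_set)

def run_ablation(train_set, test_set, trainers):
    """trainers: dict mapping a model name to a fit function returning a predictor,
    e.g. {"roberta": ..., "lstm": ..., "lstm+roberta_emb": ...}."""
    results = {}
    for fraction in FRACTIONS:
        subset = subsample(train_set, fraction)
        for name, fit in trainers.items():
            model = fit(subset)  # retrain from scratch on this subset
            results[(name, fraction)] = accuracy(model, test_set)
    return results
```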

Paper Reading: Tensor Graph Convolutional Networks for Text Classification

The basic notation for GCNs is the same as in this post.

1. Graph Tensor Definition

We first give the formal definition of a graph tensor, which consists of a series of graphs. $\mathcal{G} = \{G_1, G_2, \ldots, G_r\}$ is a graph tensor, where $G_i = (V_i, E_i)$ and $i \in \{1, \ldots, r\}$, if $V_i = V_j$ and $E_i \neq E_j$ (when $i \neq j$). Here $G_i$ is the i-th graph in the graph tensor, $V_i$ is the node set of the i-th graph, and $E_i$ is the edge set of the i-th graph.
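As a concrete illustration (my own sketch, not the paper's code), a graph tensor over a shared node set can be stored as an r × n × n stack of adjacency matrices; in TensorGCN, r = 3 for the semantic, syntactic, and sequential word graphs. `build_graph_tensor` and the toy edge lists below are hypothetical.

```python
import numpy as np

def build_graph_tensor(num_nodes, edge_sets):
    """edge_sets: a list of r edge lists, each a list of (u, v) pairs over the
    shared node set {0, ..., num_nodes - 1}; the r graphs differ only in edges."""
    tensor = np.zeros((len(edge_sets), num_nodes, num_nodes))
    for i, edges in enumerate(edge_sets):
        for u, v in edges:
            tensor[i, u, v] = 1.0
            tensor[i, v, u] = 1.0  # assume undirected graphs
    return tensor

# Example: a tensor of r = 2 graphs on the same 4 nodes with distinct edge sets.
G = build_graph_tensor(4, [[(0, 1), (2, 3)], [(0, 2), (1, 3)]])
print(G.shape)  # (2, 4, 4)
```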