Paper Reading: Beyond [CLS] through Ranking by Generation
Venue: EMNLP 2020

Previous work that uses a pretrained language model (PLM) such as BERT for information retrieval takes the [CLS] embedding of the concatenation of the query and document as the feature for discriminative learning. In other words, the relevance label for a given (query, document) pair is modeled as:

$p(\text{relevant} \mid q, d) = f(h_{[\text{CLS}]})$

where $h_{[\text{CLS}]}$ is the [CLS] embedding from the last layer of BERT and $f$ is usually a classification head.
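The discriminative scoring above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the [CLS] vector is faked with random numbers (in practice it would come from running BERT on "[CLS] query [SEP] document [SEP]"), and `W`, `b`, and `relevance_probability` are hypothetical names standing in for the classification head.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 768  # BERT-base hidden size

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def relevance_probability(h_cls, W, b):
    """p(relevant | q, d) = softmax(W h_cls + b)[1] for a 2-class head."""
    logits = W @ h_cls + b
    return softmax(logits)[1]

# Stand-in for the [CLS] embedding from BERT's last layer.
h_cls = rng.standard_normal(hidden_size)
# Stand-in parameters of the classification head.
W = rng.standard_normal((2, hidden_size)) * 0.01
b = np.zeros(2)

p = relevance_probability(h_cls, W, b)
print(p)  # a probability in (0, 1)
```

The point is only the shape of the model: the query-document interaction is summarized into a single vector, and relevance is read off by a small head on top of it.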