Topic | Source Type | Description | Link |
recommendation; IR; | personal website | Many datasets for point-of-interest (POI) recommendation, spatial search, and related tasks. | here |
NLP | website | A collection of NLP datasets. | here |
IR; | Microsoft | LETOR: Learning to Rank for Information Retrieval | here |
IR;QA; | Microsoft | Microsoft Machine Reading Comprehension (MS MARCO) is a collection of large-scale datasets for deep learning related to search. | here |
network; | personal website | Network data sets the author has compiled over the years. | here |
QA; | ACL 2017 | TriviaQA: A Large Scale Dataset for Reading Comprehension and Question Answering | here |
opinion mining; | personal website | Full reviews of hotels and cars, collected from Tripadvisor (~259,000 reviews) and Edmunds (~42,230 reviews) respectively. | here |
NLP; | UCSB course | From Spring 2017 CS292F Deep Learning for NLP. | here |
event; | GDELT database | Events found in the world’s news media and a knowledge graph. | here |
academic graph | Microsoft | Microsoft Academic Graph (MAG) and AMiner. | here |
music search; | personal website | Music data. | here |
IR; click analysis; | TREC Deep Learning Track 2020 | ORCAS: Open Resource for Click Analysis in Search | here |
news recommendation; | competition | MIND News Recommendation Competition | here |