This corpus is a first attempt to create a representative sample of the contemporary Slovak language from various domains that is easy to search and to process automatically.
It contains a selection of news articles, processed by our NLP tools.
The second part of the effort is the information retrieval evaluation set for the corpus.
This is the first Slovak information retrieval evaluation set. It contains a set of queries (information needs), each paired with the corresponding relevant documents from the Slovak Categorized News Corpus.
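As an illustration of how a set of queries with relevant documents is typically used, the sketch below scores a ranked result list against relevance judgments using precision at k and mean average precision. The file format of the evaluation set is not described here, so the in-memory layout (a dictionary from query ID to a set of relevant document IDs) and all names such as `qrels`, `run`, `q1` and `d1` are assumptions made for the example, not the actual structure of the distributed data.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank where a relevant document appears.

    Assumes the ranked list contains no duplicate document IDs.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of average precision over all judged queries.

    A query with no results in `run` contributes an average precision of 0.
    """
    if not qrels:
        return 0.0
    scores = [average_precision(run.get(query_id, []), relevant)
              for query_id, relevant in qrels.items()]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments: query ID -> relevant document IDs.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}

# Hypothetical output of a retrieval system: query ID -> ranked document IDs.
run = {"q1": ["d1", "d2", "d3"], "q2": ["d3", "d2"]}

if __name__ == "__main__":
    print("P@2 for q1:", precision_at_k(run["q1"], qrels["q1"], 2))
    print("MAP:", mean_average_precision(run, qrels))
```

Mean average precision is only one possible choice of measure. It is used here because it rewards systems that place relevant documents near the top of the ranking.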
To obtain a download link, please send a request to email@example.com.
Hládek, Daniel, Ján Staš, and Jozef Juhár. "Evaluation Set for Slovak News Information Retrieval." LREC, 2016.