Slovak Tokenizer
This tool normalizes arbitrary text into a form suitable for natural language processing.
Requirements
Getting started
- Get and extract the sources
- Make a build directory and switch to it
- Run CMake to generate the build scripts
- Run your compiler to build the project
- The program accepts text on standard input and prints the result on standard output
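
The steps above can be sketched as a small shell helper. This is only an illustration under assumptions: `build_tokenizer` is a hypothetical function that is not part of the project, and it assumes an out-of-source build in a `build` subdirectory of the extracted sources, with `cmake --build` standing in for calling your compiler or build tool directly.

```shell
# build_tokenizer is a hypothetical helper, not part of the project.
# Usage: build_tokenizer <path-to-extracted-sources>
build_tokenizer() {
    (
        # Work in a subshell so the caller's working directory is unchanged.
        mkdir -p "$1/build" && cd "$1/build" || exit 1
        # Generate the build scripts, then invoke the native build tool.
        cmake .. && cmake --build .
    )
}
```

Once the build succeeds, the resulting executable can be used as a filter. Its name depends on the project's CMake configuration; with `tokenizer` as a placeholder name, a typical call would be `./tokenizer < input.txt > output.txt`.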
Source Code
Bibliography
If you use this tool, please cite our paper on the Slovak Categorized News Corpus.