german-reddit
October 6, 2016
Extraction of a German Reddit Corpus
References
Barbaresi, Adrien (2015). Collection, Description, and Visualization of the German Reddit Corpus, in Proceedings of the 2nd Workshop on Natural Language Processing for Computer-Mediated Communication, pp. 7-11, German Society for Computational Linguistics & Language Technology.
Tools released for the NLP 4 CMC workshop.
Requirements
The whole Reddit corpus is available from archive.org.
Requirements: