German Preprocessing

November 16, 2019

Preprocess German texts to do some serious natural-language processing.

  • clean texts
  • remove stopwords (as defined by spaCy)
  • lemmatize
  • lower-case, remove all punctuation, and replace digits with "0"
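
To make the last step concrete, here is a minimal sketch of that kind of normalization using only the Python standard library. It is not the package's actual implementation: the helper name `normalize_text` is hypothetical, and it assumes that each digit becomes its own "0" and that only ASCII punctuation is stripped.

```python
import re
import string


def normalize_text(text: str) -> str:
    """Hypothetical helper (not part of the `german` package)."""
    # Lower-case the whole string; umlauts are handled by str.lower().
    text = text.lower()
    # Assumption: every single digit is replaced by "0", so "25" becomes "00".
    text = re.sub(r"\d", "0", text)
    # Assumption: only ASCII punctuation is removed.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse any leftover runs of whitespace.
    return " ".join(text.split())


print(normalize_text("Julia trinkt 25 Tassen Tee!"))
```

Under these assumptions the call prints `julia trinkt 00 tassen tee`. Stopword removal and lemmatization are not covered here, since they depend on spaCy's German language data.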

Installation

pip install german

Usage

from german import preprocess

preprocess(['Johannes war einer von vielen guten Schülern.', 'Julia trinkt gern Tee.'], remove_stop=True)
# ['johannes gut schüler', 'julia trinken tee']

License

MIT.

Sponsoring

This work was created as part of a project that was funded by the German Federal Ministry of Education and Research.