README.md

July 9, 2021

Kashgari


Overview | Performance | Installation | Documentation | Contributing

🎉🎉🎉 We released the 2.0.0 version with TF2 Support. 🎉🎉🎉

If you use this project for your research, please cite:

@misc{Kashgari,
  author = {Eliyar Eziz},
  title = {Kashgari},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/BrikerMan/Kashgari}}
}

Overview

Kashgari is a simple and powerful NLP transfer learning framework that lets you build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.

  • Human-friendly. Kashgari's code is straightforward, well documented and tested, which makes it very easy to understand and modify.
  • Powerful and simple. Kashgari allows you to apply state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS) and classification.
  • Built-in transfer learning. Kashgari ships with support for pre-trained BERT and Word2vec embedding models, which makes it simple to apply transfer learning when training your model.
  • Fully scalable. Kashgari provides a simple, fast, and scalable environment for experimentation: train your models and try new approaches using different embeddings and model structures.
  • Production ready. Kashgari can export models in the SavedModel format for TensorFlow Serving, so you can deploy them directly in the cloud.
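
As a rough illustration of the workflow the features above describe, here is a minimal sketch of training a named entity recognition model. It assumes the Kashgari 2.x API as the author understands it (`ChineseDailyNerCorpus`, `BiLSTM_Model`, `fit`, `evaluate`); check these names against the documentation for your installed version. The wrapper function `train_ner_model` is hypothetical and not part of the library.

```python
def train_ner_model(epochs: int = 5):
    """Train a BiLSTM labeling model on a bundled Chinese NER corpus.

    Assumption: the imports below match the Kashgari 2.x package layout.
    """
    # Imported lazily so this module loads even without kashgari installed.
    from kashgari.corpus import ChineseDailyNerCorpus
    from kashgari.tasks.labeling import BiLSTM_Model

    # Each split is a pair: tokenized sentences and their per-token labels.
    train_x, train_y = ChineseDailyNerCorpus.load_data("train")
    valid_x, valid_y = ChineseDailyNerCorpus.load_data("validate")
    test_x, test_y = ChineseDailyNerCorpus.load_data("test")

    model = BiLSTM_Model()
    model.fit(train_x, train_y, valid_x, valid_y, epochs=epochs)

    # Report labeling metrics on the held-out test split.
    model.evaluate(test_x, test_y)
    return model
```

Calling `train_ner_model()` requires kashgari and a matching TensorFlow installation, and it may need to fetch the corpus on first use.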

Our Goal

  • Academic users: Easier experimentation to prove their hypotheses without coding from scratch.
  • NLP beginners: Learn how to build an NLP project with production-level code quality.
  • NLP developers: Build a production-level classification/labeling model within minutes.

Performance

Contributions of performance reports are welcome.

| Task | Language | Dataset | Score |
| --- | --- | --- | --- |
| Named Entity Recognition | Chinese | People's Daily Ner Corpus | 95.57 |
| Text Classification | Chinese | SMP2018ECDTCorpus | 94.57 |

Installation

The project is based on Python 3.6+, because it is 2019 and type hinting is cool.

| Backend | kashgari version | Description |
| --- | --- | --- |
| TensorFlow 2.2+ | `pip install 'kashgari>=2.0.2'` | TF2.10+ with tf.keras |
| TensorFlow 1.14+ | `pip install 'kashgari>=1.0.0,<2.0.0'` | TF1.14+ with tf.keras |
| Keras | `pip install 'kashgari<1.0.0'` | keras version |
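
The backend table above can be read as a simple lookup. The sketch below encodes it in a hypothetical helper, `kashgari_requirement`, which is not part of Kashgari. It assumes that TensorFlow versions outside the listed ranges (for example 2.0 or 1.13) are unsupported and returns `None` for them.

```python
from typing import Optional


def kashgari_requirement(tf_version: Optional[str]) -> Optional[str]:
    """Return the pip requirement for kashgari given a TensorFlow version.

    Pass None to select the standalone Keras backend.
    """
    if tf_version is None:
        return "kashgari<1.0.0"

    # Compare on (major, minor) only, e.g. "2.3.1" -> (2, 3).
    major, minor = (int(part) for part in tf_version.split(".")[:2])

    if (major, minor) >= (2, 2):
        return "kashgari>=2.0.2"
    if (1, 14) <= (major, minor) < (2, 0):
        return "kashgari>=1.0.0,<2.0.0"
    # Assumption: anything not listed in the table is unsupported.
    return None
```

For example, `kashgari_requirement("1.15.0")` selects the 1.x release line, which you would then install with `pip install 'kashgari>=1.0.0,<2.0.0'`.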

You also need to install the tensorflow_addons release that matches your TensorFlow version.

| TensorFlow version | tensorflow_addons version |
| --- | --- |
| TensorFlow 2.1 | `pip install tensorflow_addons==0.9.1` |
| TensorFlow 2.2 | `pip install tensorflow_addons==0.11.2` |
| TensorFlow 2.3, 2.4, 2.5 | `pip install tensorflow_addons==0.13.0` |
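
To avoid picking the wrong pin by hand, the tensorflow_addons table above can be expressed as a small mapping. This is only a sketch: `ADDONS_PINS` and `addons_requirement` are hypothetical names, and the pins are copied from the table rather than checked against the tensorflow_addons compatibility matrix.

```python
from typing import Optional, Tuple

# (major, minor) TensorFlow version -> tensorflow_addons pin, from the table.
ADDONS_PINS = {
    (2, 1): "0.9.1",
    (2, 2): "0.11.2",
    (2, 3): "0.13.0",
    (2, 4): "0.13.0",
    (2, 5): "0.13.0",
}


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Reduce a version string such as '2.4.1' to (2, 4)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def addons_requirement(tf_version: str) -> Optional[str]:
    """Return the pinned requirement, or None if the version is not listed."""
    pin = ADDONS_PINS.get(parse_major_minor(tf_version))
    return f"tensorflow_addons=={pin}" if pin else None
```

With TensorFlow installed, you could pass `tf.__version__` to `addons_requirement` and feed the result to `pip install`.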

Tutorials

Here is a set of quick tutorials to get you started with the library:

There are also articles and posts that illustrate how to use Kashgari:

Examples:

Contributors ✨

Thanks go to these wonderful people. There are many ways to get involved: start with the contributor guidelines, then check the open issues for specific tasks.