🪐 Project Templates
August 12, 2021
spaCy projects let you manage and share end-to-end spaCy workflows for different use cases and domains, and orchestrate training, packaging and serving your custom pipelines. You can start off by cloning a pre-defined project template, adjust it to fit your needs, load in your data, train a pipeline, export it as a Python package, upload your outputs to remote storage and share your results with your team.
⚠️ spaCy project templates require spaCy v3. You can install it from pip with `pip install spacy` or conda with `conda install spacy -c conda-forge`. Make sure to use a fresh virtual environment. See the `master` branch for the previous version of this repo.
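Before working with a template, it can help to confirm that the active environment really has spaCy v3. The following is a minimal sketch, not part of this repo: the helper names are hypothetical, and treating any major version of 3 or higher as acceptable is an assumption based on the requirement stated above.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '3.1.1'."""
    return int(version.split(".")[0])


def installed_spacy_version():
    """Return the installed spaCy version string, or None if spaCy is missing."""
    try:
        return metadata.version("spacy")
    except metadata.PackageNotFoundError:
        return None


def supports_project_templates(version) -> bool:
    """Assumption: any spaCy release with major version >= 3 is acceptable."""
    return version is not None and major_version(version) >= 3
```

Calling `supports_project_templates(installed_spacy_version())` in a fresh virtual environment returns `False` until spaCy has been installed there.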
🗃 Categories
| Name | Description |
|---|---|
| pipelines | Templates for training NLP pipelines with different components on different corpora. |
| tutorials | Templates that work through a specific NLP use case end-to-end. |
| integrations | Templates showing integrations with third-party libraries and tools for managing your data and experiments, iterating on demos and prototypes and shipping your models into production. |
| benchmarks | Templates to reproduce our benchmarks and produce quantifiable results that are easy to compare against other systems or versions of spaCy. |
| experimental | Experimental workflows and other cutting-edge stuff to use at your own risk. |
🚀 Quickstart
Projects can be used via the new `spacy project` CLI. To find out more about a command, add `--help`. For detailed instructions, see the usage guide.
- Clone the project template you want to use: `python -m spacy project clone tutorials/ner_fashion_brands`
- Fetch assets (data, weights) defined in the `project.yml`: `cd ner_fashion_brands`, then `python -m spacy project assets`
- Run a command defined in the `project.yml`: `python -m spacy project run preprocess`
- Run a workflow of multiple steps in order: `python -m spacy project run all`
- Adjust the template for your specific use case, load in your own data, adjust the settings and model and share the result with your team.
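The quickstart steps above can also be scripted. The sketch below only assembles the same four CLI invoications as argument lists and runs them in order; the function names are hypothetical, and it assumes the cloned directory is named after the last path component of the template (as in `cd ner_fashion_brands`) and that the template's `project.yml` defines a `preprocess` command.

```python
import subprocess
import sys
from pathlib import Path


def quickstart_steps(template: str, workflow: str = "all"):
    """Return (argv, working_directory) pairs mirroring the quickstart steps."""
    # Assumption: the clone lands in a directory named after the template.
    project_dir = Path(template).name
    spacy_project = [sys.executable, "-m", "spacy", "project"]
    return [
        (spacy_project + ["clone", template], None),           # clone the template
        (spacy_project + ["assets"], project_dir),              # fetch assets
        (spacy_project + ["run", "preprocess"], project_dir),   # run one command
        (spacy_project + ["run", workflow], project_dir),       # run a workflow
    ]


def run_steps(steps):
    """Run each step in order and stop at the first failing command."""
    for argv, cwd in steps:
        subprocess.run(argv, cwd=cwd, check=True)
```

With spaCy v3 installed, `run_steps(quickstart_steps("tutorials/ner_fashion_brands"))` would execute the same sequence as the manual steps. Building the commands separately from running them makes it easy to print or inspect them first.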
👷‍♀️ Repository maintenance
To keep the project templates and their documentation up to date, this repo contains several scripts:
| Script | Description |
|---|---|
| update_docs.py | Update all auto-generated docs in the given root. Calls into `spacy project document` and only replaces the auto-generated sections, not any custom content before or after. |
| update_category_docs.py | Update the auto-generated `README.md` in the category directories listing the available project templates. |
| update_configs.py | Update and auto-fill all `config.cfg` files included in the repo, similar to `spacy init fill-config`. Can be used to keep the configs up to date with changes in spaCy. |
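To illustrate what "only replaces the auto-generated sections" can mean in practice, here is a minimal sketch of marker-based replacement. It is not the code used by `update_docs.py` or `spacy project document`: the marker strings and the function name are hypothetical placeholders chosen for this example.

```python
# Hypothetical markers; the real tooling may use different ones.
START = "<!-- AUTO-GENERATED START -->"
END = "<!-- AUTO-GENERATED END -->"


def replace_generated_section(text: str, new_body: str) -> str:
    """Swap the content between the markers, keeping everything around it."""
    start = text.find(START)
    end = text.find(END, start + len(START)) if start != -1 else -1
    if start == -1 or end == -1:
        # No complete marker pair found: leave the document untouched.
        return text
    before = text[:start]
    after = text[end + len(END):]
    return f"{before}{START}\n{new_body}\n{END}{after}"
```

Custom content before the start marker and after the end marker is carried over unchanged, so hand-written notes in a README would survive regeneration.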