bulk-chain 1.2.2

May 21, 2026

Third-party providers hosting ↗️
👉demo👈

A no-strings-attached framework for your LLM that applies Chain-of-Thought-like prompt schemas to massive textual collections using custom third-party providers ↗️.

Main Features

  • ✅ No-strings: free of LLM dependencies, with flexible venv customization.
  • ✅ Supports schema descriptions for the Chain-of-Thought concept.
  • ✅ Provides an iterator over an unbounded number of input contexts.
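
The iterator-based design can be illustrated with a short sketch. This is not bulk-chain's implementation; `iter_batches` and `endless_contexts` are hypothetical names, used only to show how lazy batching lets a consumer draw from an unbounded stream of input contexts without loading it into memory.

```python
from itertools import count, islice


def iter_batches(items, batch_size):
    """Lazily group any iterable (even an endless one) into lists of up to batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical endless generator of input contexts.
endless_contexts = ({"text": f"document #{i}"} for i in count())

# Only the batches that are actually requested get materialized.
first_two = list(islice(iter_batches(endless_contexts, batch_size=3), 2))
for batch in first_two:
    print(batch)
```

Because each batch is pulled on demand, the same loop works for a three-item list and for a stream that never ends.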

Installation

From PyPI:

pip install --no-deps bulk-chain

or the latest version from GitHub:

pip install git+https://github.com/nicolay-r/bulk-chain@master

Chain-of-Thought Schema

To declare a Chain-of-Thought (CoT) schema we use the JSON format. The schema field is a list of CoT instructions for the Large Language Model. Each item of the list is a dictionary with prompt and out keys, which correspond to the input prompt and the output variable name respectively. Every variable name referenced in a prompt must be wrapped in {}.

Example:

[
  {"prompt": "extract topic: {text}", "out": "topic"},
  {"prompt": "extract subject: {text}", "out": "subject"}
]
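
To make the role of prompt, out, and {} placeholders concrete, here is a minimal sketch of how such a schema could be applied to one record. It is an illustration rather than the library's own code: `apply_schema` and `fake_llm` are hypothetical names, and the second prompt assumes that a later step may reference the `out` variable of an earlier step.

```python
import json

# A two-step schema in the JSON form described above.
SCHEMA_JSON = """
[
  {"prompt": "extract topic: {text}", "out": "topic"},
  {"prompt": "extract subject of the topic '{topic}' in: {text}", "out": "subject"}
]
"""


def fake_llm(prompt):
    # Hypothetical stand-in for a real model call.
    return f"<answer to: {prompt}>"


def apply_schema(schema, record, llm):
    """Run each step in order, storing every answer under its `out` name."""
    result = dict(record)
    for step in schema:
        # Fill the {} placeholders from the values collected so far.
        prompt = step["prompt"].format(**result)
        result[step["out"]] = llm(prompt)
    return result


schema = json.loads(SCHEMA_JSON)
entry = apply_schema(schema, {"text": "Rocks are hard"}, fake_llm)
print(entry)
```

The result keeps the original `text` key and gains one key per schema step, which matches the shape of the output entries shown in the Usage section.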

Usage

🤖 Prepare

  1. A schema.
  2. An LLM from the Third-party providers hosting ↗️.
  3. Data (an iterator of dictionaries).
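
For the third item, any iterable of dictionaries should do. As one possible way to build it, the sketch below turns CSV rows into dictionaries lazily using only the standard library. `iter_dicts_from_csv` and `CSV_DATA` are hypothetical names, and the in-memory string stands in for a real file handle.

```python
import csv
import io

# Hypothetical CSV content; in practice this could be an open file handle.
CSV_DATA = "text\nRocks are hard\nWater is wet\nFire is hot\n"


def iter_dicts_from_csv(handle):
    """Yield one dictionary per CSV row, keyed by the header names."""
    for row in csv.DictReader(handle):
        yield dict(row)


# A lazy iterator of dictionaries such as {"text": "Rocks are hard"}.
input_dicts_it = iter_dicts_from_csv(io.StringIO(CSV_DATA))
```

The column names become the keys available to the {} placeholders in the schema prompts, so a `text` column pairs with prompts that reference {text}.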

🚀 Launch

API: For more details see the related Wiki page

from bulk_chain.core.utils import dynamic_init
from bulk_chain.api import iter_content

content_it = iter_content(
    # 1. Your schema.
    schema=[
        {"prompt": "extract topic: {text}", "out": "topic"},
        {"prompt": "extract subject: {text}", "out": "subject"},
    ],
    # 2. Your third-party model implementation.
    llm=dynamic_init(class_filepath="replicate_104.py")(
        api_token="<API-KEY>",
        model_name="meta/meta-llama-3-70b-instruct"),
    # 3. Toggle streaming if needed.
    stream=False,
    # 4. Toggle Async API mode usage.
    async_mode=True,
    async_policy='prompt',
    # 5. Batch size.
    batch_size=10,
    # 6. Your iterator of dictionaries.
    input_dicts_it=[
        # Example of data ...
        {"text": "Rocks are hard"},
        {"text": "Water is wet"},
        {"text": "Fire is hot"},
    ],
)

for batch in content_it:
    for entry in batch:
        print(entry)

Each output entry is the input text augmented with topic and subject:

{'text': 'Rocks are hard', 'topic': 'The topic is: Geology/Rocks', 'subject': 'The subject is: "Rocks"'}
{'text': 'Water is wet', 'topic': 'The topic is: Properties of Water', 'subject': 'The subject is: Water'}
{'text': 'Fire is hot', 'topic': 'The topic is: Temperature/Properties of Fire', 'subject': 'The subject is: "Fire"'}

API

Methods that accept single prompt:

| Method | Mode | Description |
|---|---|---|
| ask(prompt) | Sync | Runs inference on a single prompt. |
| ask_stream(prompt) | Sync | Returns a generator that yields chunks of the inferred result. |
| ask_async(prompt) | Async | Asynchronously runs inference on a single prompt. |
| ask_stream_async(prompt) | Async | Asynchronously returns a generator that yields chunks of the inferred result. |

Methods that accept batch:

| Method | Mode | Description |
|---|---|---|
| ask_batch(batch) | Sync | Runs inference on a batch of prompts. |
| ask_async_batch(batch) | Async | Asynchronously runs inference on a batch of prompts. |
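
To show how the sync, async, streaming, and batch variants relate to one another, here is a self-contained sketch of a class that exposes the method names listed above. `EchoProvider` is hypothetical: it does not inherit from any bulk-chain class, it simply echoes its prompt, and the return types are assumptions rather than the library's documented contract.

```python
import asyncio


class EchoProvider:
    """Hypothetical stand-in provider that echoes each prompt back."""

    def ask(self, prompt):
        return f"echo: {prompt}"

    def ask_stream(self, prompt):
        # Yield the answer word by word to mimic streamed chunks.
        for chunk in self.ask(prompt).split(" "):
            yield chunk

    async def ask_async(self, prompt):
        await asyncio.sleep(0)  # placeholder for a non-blocking network call
        return self.ask(prompt)

    async def ask_stream_async(self, prompt):
        for chunk in self.ask(prompt).split(" "):
            await asyncio.sleep(0)
            yield chunk

    def ask_batch(self, batch):
        return [self.ask(prompt) for prompt in batch]

    async def ask_async_batch(self, batch):
        # Run all prompts concurrently; results keep the input order.
        return await asyncio.gather(*(self.ask_async(p) for p in batch))


provider = EchoProvider()
sync_answers = provider.ask_batch(["Rocks are hard", "Water is wet"])
async_answers = asyncio.run(provider.ask_async_batch(["Fire is hot"]))
```

In this sketch the batch methods are thin wrappers over the single-prompt ones; a real provider may instead send the whole batch in one request.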

See examples with models at nlp-thirdgate 🌌.

Similar Concepts

Primary similarity:

  • LangGraph workflow state
  • ETL/dataflow systems

Minor similarity: