Ocr
June 23, 2026
Overview
OCR API
Available Operations
- process - OCR
process
OCR
Example Usage
from mistralai.client import Mistral
import os

with Mistral(
    api_key=os.getenv("MISTRAL_API_KEY", ""),
) as mistral:
    res = mistral.ocr.process(model="CX-9", document={
        "type": "document_url",
        "document_url": "https://upset-labourer.net/",
    }, bbox_annotation_format={
        "type": "text",
    }, document_annotation_format={
        "type": "text",
    }, include_blocks=False)

    # Handle response
    print(res)
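The example above passes `{ "type": "text" }` for both annotation formats, while the Parameters table states that only `json_schema` is valid for those fields. As a minimal sketch, the payload shape from the table's "Example 3" could be assembled as follows. The helper name `json_schema_format` and the `BOOK_SCHEMA` constant are hypothetical and are not part of the SDK; only the dictionary layout is taken from the table.

```python
from typing import Any, Dict


def json_schema_format(name: str, schema: Dict[str, Any], strict: bool = True) -> Dict[str, Any]:
    """Build a response-format dict with the layout shown in the table's Example 3.

    Hypothetical helper for illustration; not provided by the SDK.
    """
    return {
        "type": "json_schema",
        "json_schema": {
            "schema": schema,
            "name": name,
            "strict": strict,
        },
    }


# The "Book" schema copied from the Parameters table.
BOOK_SCHEMA: Dict[str, Any] = {
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "authors": {
            "items": {"type": "string"},
            "title": "Authors",
            "type": "array",
        },
    },
    "required": ["name", "authors"],
    "title": "Book",
    "type": "object",
    "additionalProperties": False,
}

book_format = json_schema_format("book", BOOK_SCHEMA)
```

The resulting dictionary would presumably be passed as `document_annotation_format=book_format` (or `bbox_annotation_format=book_format`) in the `mistral.ocr.process(...)` call shown earlier.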
Parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| model | Nullable[str] | :heavy_check_mark: | N/A | |
| document | models.DocumentUnion | :heavy_check_mark: | Document to run OCR on | |
| pages | OptionalNullable[models.Pages] | :heavy_minus_sign: | Specific pages to process. Accepts a list of integers or a string of comma-separated numbers and ranges (e.g. '0,1,2' or '0-5' or '0,2-4'). Page numbers start from 0. | |
| include_image_base64 | OptionalNullable[bool] | :heavy_minus_sign: | Include image URLs in response | |
| image_limit | OptionalNullable[int] | :heavy_minus_sign: | Max images to extract | |
| image_min_size | OptionalNullable[int] | :heavy_minus_sign: | Minimum height and width of image to extract | |
| bbox_annotation_format | OptionalNullable[models.ResponseFormat] | :heavy_minus_sign: | Structured output class for extracting useful information from each extracted bounding box / image from document. Only json_schema is valid for this field | Example 1: { "type": "text" } Example 2: { "type": "json_object" } Example 3: { "type": "json_schema", "json_schema": { "schema": { "properties": { "name": { "title": "Name", "type": "string" }, "authors": { "items": { "type": "string" }, "title": "Authors", "type": "array" } }, "required": [ "name", "authors" ], "title": "Book", "type": "object", "additionalProperties": false }, "name": "book", "strict": true } } |
| document_annotation_format | OptionalNullable[models.ResponseFormat] | :heavy_minus_sign: | Structured output class for extracting useful information from the entire document. Only json_schema is valid for this field | Example 1: { "type": "text" } Example 2: { "type": "json_object" } Example 3: { "type": "json_schema", "json_schema": { "schema": { "properties": { "name": { "title": "Name", "type": "string" }, "authors": { "items": { "type": "string" }, "title": "Authors", "type": "array" } }, "required": [ "name", "authors" ], "title": "Book", "type": "object", "additionalProperties": false }, "name": "book", "strict": true } } |
| document_annotation_prompt | OptionalNullable[str] | :heavy_minus_sign: | Optional prompt to guide the model in extracting structured output from the entire document. A document_annotation_format must be provided. | |
| table_format | OptionalNullable[models.TableFormat] | :heavy_minus_sign: | N/A | |
| extract_header | Optional[bool] | :heavy_minus_sign: | N/A | |
| extract_footer | Optional[bool] | :heavy_minus_sign: | N/A | |
| include_blocks | Optional[bool] | :heavy_minus_sign: | Return paragraph-level bounding boxes for all content blocks in the response | |
| confidence_scores_granularity | OptionalNullable[models.ConfidenceScoresGranularity] | :heavy_minus_sign: | Granularity for confidence scores: 'word' (per-word scores) or 'page' (aggregate only). Defaults to None (no confidence scores) to keep response payload small. | |
| retries | Optional[utils.RetryConfig] | :heavy_minus_sign: | Configuration to override the default retry behavior of the client. | |
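To make the `pages` string format concrete, here is a small sketch that expands a specification such as `'0,2-4'` into the zero-based page indices it names. The function `expand_pages` is a hypothetical client-side helper, not part of the SDK, and it assumes ranges are inclusive at both ends; the table does not say so explicitly. The API accepts the string or list directly, so a helper like this would only be useful for previewing or validating a selection before sending it.

```python
from typing import List, Union


def expand_pages(spec: Union[str, List[int]]) -> List[int]:
    """Expand a pages spec into a list of zero-based page indices.

    Hypothetical helper. Assumes 'a-b' ranges include both a and b.
    """
    if isinstance(spec, list):
        # A list of integers is already in expanded form.
        return list(spec)

    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


selected = expand_pages("0,2-4")
```

Under the inclusive-range assumption, `'0-5'` names six pages (0 through 5).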
Response
Errors
| Error Type | Status Code | Content Type |
|---|---|---|
| errors.HTTPValidationError | 422 | application/json |
| errors.SDKError | 4XX, 5XX | */* |
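The table above implies an ordering when handling failures: status 422 also falls inside the 4XX range, so the more specific case should be checked first. The sketch below only illustrates that mapping with plain status codes. `error_type_for_status` is a hypothetical function that returns the error type names from the table as strings; it does not import or raise the SDK's actual exception classes.

```python
from typing import Optional


def error_type_for_status(status_code: int) -> Optional[str]:
    """Return the documented error type name for an HTTP status code.

    Hypothetical helper mirroring the Errors table; returns None for
    status codes outside the 4XX and 5XX ranges.
    """
    if status_code == 422:
        # Most specific row first: validation errors carry a JSON body.
        return "errors.HTTPValidationError"
    if 400 <= status_code <= 599:
        return "errors.SDKError"
    return None
```

In real client code the same ordering would presumably apply to `except` clauses: catch the validation error before the general SDK error.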