OCRPageObject

June 23, 2026

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `index` | *int* | :heavy_check_mark: | The page index in a PDF document, starting from 0 |
| `markdown` | *str* | :heavy_check_mark: | The markdown string response of the page |
| `images` | List[models.OCRImageObject] | :heavy_check_mark: | List of all extracted images in the page |
| `tables` | List[models.OCRTableObject] | :heavy_minus_sign: | List of all extracted tables in the page |
| `hyperlinks` | List[*str*] | :heavy_minus_sign: | List of all hyperlinks in the page |
| `header` | OptionalNullable[*str*] | :heavy_minus_sign: | Header of the page |
| `footer` | OptionalNullable[*str*] | :heavy_minus_sign: | Footer of the page |
| `dimensions` | Nullable[models.OCRPageDimensions] | :heavy_check_mark: | The dimensions of the PDF page's screenshot image |
| `confidence_scores` | OptionalNullable[models.OCRPageConfidenceScores] | :heavy_minus_sign: | Confidence scores for the OCR page (populated when `confidence_scores_granularity` is set) |
| `blocks` | List[models.Block] | :heavy_minus_sign: | Paragraph-level bounding boxes for all content blocks in reading order (populated when `include_blocks` is `True`) |
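
Because several of the fields above are optional or nullable, code that consumes a page usually has to guard against absent values. The following is a minimal sketch of that pattern, not part of the SDK: `summarize_page` is a hypothetical helper, and the demo page is a `SimpleNamespace` stand-in that only mimics the attribute names from the field table, so the snippet runs without the client library installed.

```python
from types import SimpleNamespace
from typing import Any, Dict


def summarize_page(page: Any) -> Dict[str, Any]:
    """Hypothetical helper: collect a few facts about one OCR page.

    Works on any object exposing the attribute names from the field table.
    """
    # Required fields: read them directly.
    summary: Dict[str, Any] = {
        "page_number": page.index + 1,  # index is 0-based
        "markdown_chars": len(page.markdown),
        "image_count": len(page.images),
    }
    # Optional fields may be missing, None, or otherwise falsy,
    # so fall back to an empty value before measuring them.
    summary["table_count"] = len(getattr(page, "tables", None) or [])
    summary["hyperlink_count"] = len(getattr(page, "hyperlinks", None) or [])
    summary["has_header"] = bool(getattr(page, "header", None))
    summary["has_footer"] = bool(getattr(page, "footer", None))
    # dimensions is required but nullable, so only check for None.
    summary["has_dimensions"] = getattr(page, "dimensions", None) is not None
    return summary


# Stand-in object for illustration only; a real page would come from an OCR response.
demo_page = SimpleNamespace(
    index=0,
    markdown="# Title\n\nBody text.",
    images=[],
    hyperlinks=["https://example.com"],
    header=None,
    dimensions=None,
)

print(summarize_page(demo_page))
```

Reading the optional attributes through `getattr` with a default keeps the helper usable whether or not the request enabled extras such as tables or blocks.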