OCRPageConfidenceScores

April 7, 2026

Confidence scores for an OCR page at various granularities.

Note on page-level stats:

  • For 'page' granularity: average/minimum are computed from per-token exp(logprob).
  • For 'word' granularity: average/minimum are computed from per-word confidence, where each word's confidence is exp(mean(token_logprobs)), i.e. the geometric mean of the probabilities of the word's subword tokens.
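The two aggregation rules in the note can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the service's implementation; the function names and the sample log-probabilities are hypothetical.

```python
import math
from typing import List, Tuple


def token_level_stats(token_logprobs: List[float]) -> Tuple[float, float]:
    """'page' granularity: average and minimum of per-token exp(logprob)."""
    probs = [math.exp(lp) for lp in token_logprobs]
    return sum(probs) / len(probs), min(probs)


def word_confidence(token_logprobs: List[float]) -> float:
    """exp(mean(token_logprobs)): geometric mean of the token probabilities."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def word_level_stats(words: List[List[float]]) -> Tuple[float, float]:
    """'word' granularity: average and minimum of per-word confidence."""
    confs = [word_confidence(w) for w in words]
    return sum(confs) / len(confs), min(confs)


# Hypothetical page: two words, the second split into two subword tokens.
words = [[math.log(0.9)], [math.log(0.8), math.log(0.5)]]
tokens = [lp for w in words for lp in w]

page_avg, page_min = token_level_stats(tokens)
word_avg, word_min = word_level_stats(words)
```

On this sample the token-level minimum is 0.5, while the word-level minimum is sqrt(0.8 × 0.5) ≈ 0.632. The geometric mean softens a single weak subword token, so the same page can report different statistics under the two granularities.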

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `word_confidence_scores` | `List[models.OCRConfidenceScore]` | :heavy_minus_sign: | Word-level confidence scores (populated only for 'word' granularity) |
| `average_page_confidence_score` | `float` | :heavy_check_mark: | Average confidence score for the page |
| `minimum_page_confidence_score` | `float` | :heavy_check_mark: | Minimum confidence score for the page |
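
As a rough sketch of how a caller might consume these fields, the snippet below flags pages for manual review. `PageScores` is a hypothetical stand-in that mirrors only the field names listed above; it is not the SDK class. The thresholds are arbitrary assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageScores:
    """Hypothetical stand-in mirroring the fields listed above (not the SDK model)."""
    average_page_confidence_score: float
    minimum_page_confidence_score: float
    # Optional: populated only for 'word' granularity.
    word_confidence_scores: Optional[List[object]] = None


def needs_review(scores: PageScores,
                 avg_threshold: float = 0.9,   # assumed cutoff, tune per use case
                 min_threshold: float = 0.5) -> bool:
    """Flag a page when either page-level statistic falls below its cutoff."""
    return (scores.average_page_confidence_score < avg_threshold
            or scores.minimum_page_confidence_score < min_threshold)


def has_word_scores(scores: PageScores) -> bool:
    """True only when word-level scores were returned and are non-empty."""
    return bool(scores.word_confidence_scores)


clean_page = PageScores(0.97, 0.82)
noisy_page = PageScores(0.95, 0.30)
```

Checking the minimum alongside the average catches pages where a few badly recognized tokens or words are hidden by an otherwise high mean.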