Testing SmolDocling with Label Studio: Evaluating OCR for Document Conversion

Most OCR models struggle with more than just text. Documents often contain tables, charts, equations, and structured layouts that are difficult to process accurately. SmolDocling takes a different approach by offering an end-to-end solution for document conversion without relying on large foundational models or complex ensemble pipelines.

As introduced in the paper "SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion," SmolDocling is designed to process entire pages while retaining structure, spatial location, and formatting. Unlike traditional OCR systems that chain multiple specialized components, SmolDocling generates DocTags, a universal markup format that captures all document elements in full context. This makes it more efficient and scalable across a wide range of document types, including business reports, academic papers, patents, and technical documents.

But how well does it perform on real-world data? To help answer that, we have created a Jupyter Notebook that walks you through testing SmolDocling’s OCR capabilities using Label Studio.

Why Evaluating OCR Models Matters

OCR models have improved significantly, but they still face major challenges:

  • Inconsistent recognition of tables, formulas, and charts
  • Misaligned bounding boxes that affect structured data extraction
  • Formatting errors that disrupt readability

SmolDocling aims to solve these issues with a compact vision-language model that processes full-page documents and produces structured output. Even so, you need to evaluate the model on your own documents to measure its accuracy and to decide where its output needs correction before real-world use.
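To make "measuring accuracy" concrete, here is a minimal sketch of two metrics commonly used for this kind of evaluation: character error rate (CER) for recognized text and intersection over union (IoU) for bounding boxes. This is an illustration under our own assumptions, not necessarily what the notebook computes. The function names and the (x1, y1, x2, y2) pixel box convention are hypothetical choices for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length (lower is better)."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)


def iou(box_a, box_b) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Compare a model transcription against a human-corrected one.
    print(character_error_rate("Total revenue", "Total revenve"))
    # Compare a predicted box against a reviewed box.
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

CER targets the text recognition errors, while IoU targets the misaligned bounding boxes listed above. Both assume you have a human-reviewed reference to compare against, which is where an annotation tool comes in.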

Try It Yourself

To get started, check out the step-by-step notebook.

By integrating SmolDocling with Label Studio, you can review the model's predictions against your own documents, see how well it performs, and correct its output to improve document understanding.
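As a rough illustration of what that integration can involve, the sketch below converts one recognized text region into a Label Studio pre-annotation. Label Studio expresses rectangle coordinates as percentages of the image size, so pixel boxes need converting first. The control names `bbox`, `transcription`, and `image` are assumptions that must match your own labeling configuration, and `to_prediction` is a hypothetical helper rather than a function from the notebook or from either library.

```python
import uuid


def to_prediction(box_px, text, image_w, image_h, model_version="smoldocling"):
    """Build a Label Studio prediction for one text region.

    box_px is (x1, y1, x2, y2) in pixels (assumed input format).
    """
    x1, y1, x2, y2 = box_px
    # Label Studio stores region geometry as percentages of the image size.
    geometry = {
        "x": 100.0 * x1 / image_w,
        "y": 100.0 * y1 / image_h,
        "width": 100.0 * (x2 - x1) / image_w,
        "height": 100.0 * (y2 - y1) / image_h,
        "rotation": 0,
    }
    # The rectangle and its transcription share an id so they stay linked.
    region_id = uuid.uuid4().hex[:10]
    common = {
        "id": region_id,
        "to_name": "image",  # assumed name of the image tag
        "original_width": image_w,
        "original_height": image_h,
    }
    return {
        "model_version": model_version,
        "result": [
            {**common, "from_name": "bbox", "type": "rectangle",
             "value": dict(geometry)},
            {**common, "from_name": "transcription", "type": "textarea",
             "value": {**geometry, "text": [text]}},
        ],
    }


if __name__ == "__main__":
    task = {
        "data": {"image": "https://example.com/page-1.png"},  # placeholder URL
        "predictions": [to_prediction((100, 50, 300, 150), "Hello", 1000, 500)],
    }
    print(task)
```

A task shaped like this can be imported into a project so that reviewers start from the model's output and only fix what is wrong.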
