
How to build a labeling tool for multi-page document routing and splitting

Processing large document packets makes multi-page document routing and splitting a serious bottleneck for data teams. Building custom interfaces to view pages, assign categories, and mark split boundaries drains valuable engineering time. With Label Studio, you can generate a tailored multi-page annotation tool using an AI agent and deploy it directly into your data pipeline.

Convert original PDF documents into arrays of page images to render them natively within a single labeling task.

Apply classification controls to assign routing categories and split markers to specific pages rather than whole documents.

Pre-populate labeling tasks with machine learning predictions to reduce the manual effort required from human annotators.

Extract structured JSON exports containing specific page indexes to feed downstream document separation and routing systems.

Configure cloud storage connections to proxy sensitive document images securely without copying files to local environments.

The problem

Multi-page document routing and splitting demands a highly specific data shape: annotators must evaluate an entire document packet and apply decisions to individual pages. Without a purpose-built interface, they often have to open separate PDF viewers, track page numbers manually, and log split boundaries in external spreadsheets, and that constant context switching slows them down. Security constraints add risk. Custom viewer tools often copy sensitive document data across environments, which conflicts with principles such as General Data Protection Regulation (GDPR) data minimization. Rebuilding a compliant, paginated image viewer with integrated classification controls from scratch can cost engineering teams weeks.

The short answer

You can use Label Studio as the foundation and have a coding agent generate the labeling interface itself. The agent uses two tools together. The XML labeling config builder skill produces optimized interface configurations from a plain-language specification. The Label Studio SDK/CLI wires that configuration into a real project programmatically. Rather than building a new labeling application from scratch, agents generate the interface from your spec and deploy it into Label Studio in one pass.

Docs: Multi-page template → https://labelstud.io/templates/multi-page-document-annotation.html

Docs: Choices control tag → https://labelstud.io/tags/choices.html

Docs: Data manager → https://labelstud.io/guide/manage_data

Docs: LLM-friendly docs (markdown) → https://labelstud.io/llms.txt

What you're building

Provide a paginated image viewer that allows annotators to flip through document pages natively within a single task.

Include keyboard navigation hotkeys to speed up the process of advancing through long multi-page packets.

Display inline classification controls on each page to assign specific routing destinations like invoices or purchase orders.

Present a split-marker checkbox on every page so annotators can indicate exactly where one logical document ends.

Embed pre-calculated model confidence scores alongside the page images to guide annotators toward low-confidence routing predictions.

Include a side-by-side taxonomy tree to support multi-level hierarchical routing categories for complex mailroom workflows.

How to build it in Label Studio

1. Set up the project

Install a self-hosted instance of Label Studio so that multi-page document routing and splitting workflows stay within your internal data privacy rules. Each labeling task represents a single document packet, formatted as an array of page image URLs. Include metadata fields such as the original document ID, upload date, and processing batch in the task data so annotators can filter queues effectively. Also prepare any ontology files or reference taxonomy trees you need to populate the routing classification choices.
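The task shape described above can be sketched in a few lines. This is a minimal illustration, assuming the page images already exist at known URLs. The `pages` key is chosen to match the `$pages` variable used in the interface config; the helper name, the metadata field names (`document_id`, `upload_date`, `batch`), and the example URLs are hypothetical.

```python
import json


def build_packet_task(document_id, page_urls, upload_date, batch):
    """Build one labeling task for a document packet (one task per packet)."""
    if not page_urls:
        raise ValueError("a packet needs at least one page image URL")
    return {
        "data": {
            # Ordered page images; position in this list is the page index.
            "pages": list(page_urls),
            # Hypothetical metadata fields used for filtering queues.
            "document_id": document_id,
            "upload_date": upload_date,
            "batch": batch,
        }
    }


task = build_packet_task(
    document_id="packet-0001",
    page_urls=[
        "https://example.com/packets/0001/page-1.png",
        "https://example.com/packets/0001/page-2.png",
        "https://example.com/packets/0001/page-3.png",
    ],
    upload_date="2024-01-15",
    batch="batch-07",
)
print(json.dumps(task, indent=2))
```

Keeping the page URLs in document order matters, because page-level annotations refer to pages by their position in this list.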

2. Generate the labeling interface with the XML config skill

Hand the feature specification from the "What you're building" section to a coding agent running the XML labeling config builder skill. The agent processes your instructions and emits a validated Label Studio XML configuration that uses the tags required for multi-page document routing and splitting. Review the generated interface structure to confirm it maps correctly to your planned document arrays.

<Image name="doc" valueList="$pages"> - renders an array of converted page images into a paginated viewer.

<Choices name="route" toName="doc" perItem="true"> - attaches individual routing classifications to specific pages rather than the entire document.

<Choice value="Invoice"> - defines a specific routing destination or split action for the annotator to select.

<Taxonomy name="category" toName="doc" perItem="true"> - provides a hierarchical list of complex routing destinations for detailed mailroom sorting.

<RectangleLabels name="bbox" toName="doc"> - allows annotators to draw bounding boxes around structural anchors like invoice headers.
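For orientation, here is one way the tags above might fit together in a single configuration, held as a Python string and checked for well-formedness with the standard library. Treat it as a sketch rather than the output of the config builder skill: the choice values, the `Header` label, and the separate `split` control (modeled here as a one-option checkbox) are assumptions, and the Taxonomy control is left out for brevity. The parse step only confirms the XML is well-formed; it does not validate the config against Label Studio.

```python
import xml.etree.ElementTree as ET

LABEL_CONFIG = """
<View>
  <Image name="doc" valueList="$pages"/>
  <Choices name="route" toName="doc" perItem="true">
    <Choice value="Invoice"/>
    <Choice value="Purchase Order"/>
    <Choice value="Other"/>
  </Choices>
  <Choices name="split" toName="doc" perItem="true" choice="multiple">
    <Choice value="Last page of document"/>
  </Choices>
  <RectangleLabels name="bbox" toName="doc">
    <Label value="Header"/>
  </RectangleLabels>
</View>
"""

# Parsing raises ParseError if the XML is malformed.
root = ET.fromstring(LABEL_CONFIG.strip())

# Control tags are the top-level elements that point at an object via toName.
control_names = [el.get("name") for el in root if el.get("toName")]
print(control_names)  # ['route', 'split', 'bbox']
```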

3. Wire it into a project with the SDK

The coding agent uses the Label Studio SDK/CLI to create the new project with the generated configuration. It then uploads your array-based document tasks and imports machine learning model predictions as static pre-annotations to speed up the labeling process. This same agent loop can iterate on the configuration continuously. Run a small batch of documents, watch annotators interact with the interface, ask the agent to add new hotkeys, and redeploy the project.
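The wiring step might look roughly like the sketch below. It assumes the 1.x Python SDK, where a `LabelStudio` client exposes `projects.create` and `projects.import_tasks`; check the SDK reference for your installed version before relying on these calls. The helper names, model version string, score, and URLs are hypothetical, and the prediction layout (one `choices` result per page, keyed by `item_index`) is an assumption based on the per-item export format.

```python
def page_prediction(page_routes, model_version, score):
    """Turn per-page model outputs into one prediction dict for a task."""
    result = [
        {
            "from_name": "route",
            "to_name": "doc",
            "type": "choices",
            "item_index": index,  # zero-based page position
            "value": {"choices": [label]},
        }
        for index, label in enumerate(page_routes)
    ]
    return {"model_version": model_version, "score": score, "result": result}


def deploy(client, title, label_config, tasks):
    """Create a project, import tasks with embedded predictions, return its id."""
    project = client.projects.create(title=title, label_config=label_config)
    client.projects.import_tasks(id=project.id, request=tasks)
    return project.id


task = {
    "data": {"pages": ["https://example.com/p1.png", "https://example.com/p2.png"]},
    "predictions": [page_prediction(["Invoice", "Invoice"], "router-v0", 0.71)],
}

# With the SDK installed and an instance running, usage could look like:
#   from label_studio_sdk.client import LabelStudio
#   client = LabelStudio(base_url="http://localhost:8080", api_key="YOUR_API_KEY")
#   deploy(client, "Packet routing", label_config_xml, [task])
# where label_config_xml is the XML string generated in the previous step.
```

Passing the client in as an argument keeps the deploy logic easy to rerun each time the agent revises the configuration.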

4. Set up review and quality workflows

Set up a review pipeline that keeps first-pass page routing separate from dedicated reviewer queues. You can set a multi-annotator overlap percentage so that complex financial document packets go to multiple team members for consensus. The most important agreement metrics for multi-page document routing and splitting are classification agreement for the page categories and exact-match agreement for the split boundary markers.
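The two agreement metrics named above can be computed offline from exported annotations. The following is a simple illustration, not Label Studio's built-in agreement calculation; the function names and sample labels are hypothetical.

```python
def page_label_agreement(labels_a, labels_b):
    """Fraction of pages where two annotators chose the same routing label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same number of pages")
    if not labels_a:
        return 1.0
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def split_exact_match(splits_a, splits_b):
    """True only if both annotators marked exactly the same split pages."""
    return set(splits_a) == set(splits_b)


annotator_1 = ["Invoice", "Invoice", "Purchase Order", "Purchase Order"]
annotator_2 = ["Invoice", "Invoice", "Invoice", "Purchase Order"]
print(page_label_agreement(annotator_1, annotator_2))  # 0.75
print(split_exact_match([1], [2]))  # False
```

Exact match is deliberately strict for split markers: a boundary that is off by one page produces two wrong documents downstream, so partial credit would hide real errors.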

5. Export and integrate

Export your labeled data in the default JSON format to capture the full array of page annotations. Downstream consumers of multi-page document routing and splitting rely on the exported item_index field to map classification choices to exact page numbers. Hand this JSON output to your document processing pipeline so it can programmatically split the original PDF files at the marked boundaries.
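A downstream consumer might read the export along these lines. The sample JSON is a hand-written approximation of one exported task, trimmed to the fields used here; real exports carry more fields. The control name `route` follows the config tags above, while `split`, its choice value, and the helper name are hypothetical.

```python
import json

EXPORT_JSON = """
[
  {
    "id": 1,
    "data": {"pages": ["p1.png", "p2.png", "p3.png"]},
    "annotations": [
      {
        "result": [
          {"from_name": "route", "to_name": "doc", "type": "choices",
           "item_index": 0, "value": {"choices": ["Invoice"]}},
          {"from_name": "route", "to_name": "doc", "type": "choices",
           "item_index": 1, "value": {"choices": ["Invoice"]}},
          {"from_name": "split", "to_name": "doc", "type": "choices",
           "item_index": 1, "value": {"choices": ["Last page of document"]}},
          {"from_name": "route", "to_name": "doc", "type": "choices",
           "item_index": 2, "value": {"choices": ["Purchase Order"]}}
        ]
      }
    ]
  }
]
"""


def pages_from_task(task):
    """Return one record per page with its route and split flag."""
    pages = [
        {"page_index": i, "route": None, "split_after": False}
        for i in range(len(task["data"]["pages"]))
    ]
    annotations = task.get("annotations") or []
    if not annotations:
        return pages
    for item in annotations[0]["result"]:  # first annotation only
        index = item.get("item_index")
        if index is None or item.get("type") != "choices":
            continue
        if item["from_name"] == "route":
            pages[index]["route"] = item["value"]["choices"][0]
        elif item["from_name"] == "split":
            pages[index]["split_after"] = bool(item["value"]["choices"])
    return pages


for page in pages_from_task(json.loads(EXPORT_JSON)[0]):
    print(page)
```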

Why Label Studio for multi-page document routing and splitting

Render multi-page arrays natively to eliminate the context switching of separate document viewers.

Apply per-item choices to bind categories directly to specific page indexes without manual tracking.

Extract structured JSON exports natively so annotators no longer log split boundaries in external spreadsheets.

Proxy document images securely from cloud storage to support General Data Protection Regulation data minimization requirements.

Generate interfaces instantly from specifications to eliminate the lost time of rebuilding custom viewer tools.

Common variations

Document-level classification tasks apply a single categorical label to an entire PDF rather than routing individual pages.

Optical character recognition region review overlays bounding boxes and text areas on native PDFs to correct extraction models.

Receipt processing workflows rely on layout analysis tagging to map specific financial amounts to predefined structural categories.

Content moderation triage loads multi-page reports and uses threshold-based choices to flag inappropriate material.

Next steps

XML labeling config builder skill → https://github.com/HumanSignal/create-xml-labeling-config-skill

Label Studio SDK/CLI → https://api.labelstud.io/api-reference/introduction/getting-started

LLM-friendly docs (markdown) → https://labelstud.io/llms.txt

Multi-page document annotation template → https://labelstud.io/templates/multi-page-document-annotation.html

Doing document annotation in Label Studio → https://labelstud.io/learningcenter/doing-document-annotation-in-label-studio/

Exporting labeled data → https://labelstud.io/guide/export

GitHub → https://github.com/HumanSignal/label-studio

How do you comply with data minimization policies when rendering sensitive document pages?

You can use cloud storage connectors to proxy images directly from Amazon S3 or Google Cloud Storage without copying files to local environments. Configure Identity and Access Management roles that grant read-only access for rendering tasks. This approach supports General Data Protection Regulation data minimization requirements and reduces the risk of unauthorized local copies.

How do you map routing decisions to specific pages instead of the entire document packet?

Set perItem="true" on your Choices or Taxonomy control tags in the XML configuration. This binds the routing category directly to the specific page index rendered by the image array. If you omit this attribute, the interface applies a single label to the entire document.
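A quick check for this mistake can be scripted before deploying a config. The sketch below uses only the standard library and flags any Choices or Taxonomy control without perItem="true"; the function name and the `packet_quality` control are hypothetical, and a packet-level control like that one may be intentional in your project.

```python
import xml.etree.ElementTree as ET


def controls_missing_per_item(label_config, tags=("Choices", "Taxonomy")):
    """List names of Choices/Taxonomy controls that lack perItem="true"."""
    root = ET.fromstring(label_config.strip())
    return [
        el.get("name")
        for el in root.iter()
        if el.tag in tags and el.get("perItem") != "true"
    ]


config = """
<View>
  <Image name="doc" valueList="$pages"/>
  <Choices name="route" toName="doc" perItem="true">
    <Choice value="Invoice"/>
  </Choices>
  <Choices name="packet_quality" toName="doc">
    <Choice value="Illegible"/>
  </Choices>
</View>
"""
print(controls_missing_per_item(config))  # ['packet_quality']
```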

Why do page image arrays fail to load in the labeling interface?

Cross-Origin Resource Sharing restrictions often block the browser from fetching pre-signed URLs from cloud storage buckets. You must configure your storage bucket or content delivery network to allow requests from your specific Label Studio domain. Without properly configured origin headers, the multi-page viewer cannot preload the document images.
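For Amazon S3, the bucket-side fix is a CORS configuration that allows GET requests from your Label Studio origin. The sketch below builds that configuration as a plain dictionary; the origin, bucket name, and `MaxAgeSeconds` value are placeholders, and other storage providers use different formats.

```python
import json


def s3_cors_rules(label_studio_origin):
    """Build an S3 CORS configuration allowing GET from one origin."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": [label_studio_origin],
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3600,
            }
        ]
    }


cors = s3_cors_rules("https://labelstudio.example.com")
print(json.dumps(cors, indent=2))

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("s3").put_bucket_cors(
#       Bucket="your-bucket-name", CORSConfiguration=cors)
```

Listing the exact Label Studio origin rather than a wildcard keeps the bucket from serving page images to arbitrary sites.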

How do downstream splitting systems know where to cut the original document?

Label Studio records the item_index field for every page-level annotation in the standard JSON export. Your downstream document processing pipeline reads this zero-based index to map the reviewer's split marker to the exact page number, so index 0 is page 1 of the packet. You then programmatically split the original PDF file at those page boundaries.
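Turning split markers into cut ranges is a small piece of arithmetic. The sketch below assumes each marker means "this page is the last page of a logical document"; if your markers mean "a new document starts here", the ranges shift by one. The function name is hypothetical, and the actual PDF cutting is left to whichever PDF library your pipeline uses.

```python
def split_ranges(page_count, split_after):
    """Return inclusive (start, end) zero-based page ranges, one per document.

    split_after holds zero-based indexes of pages marked as the last page
    of a logical document.
    """
    ranges = []
    start = 0
    for index in sorted(set(split_after)):
        if not 0 <= index < page_count:
            raise ValueError(f"split index {index} is outside the packet")
        ranges.append((start, index))
        start = index + 1
    if start < page_count:
        # Pages after the final marker form the last document.
        ranges.append((start, page_count - 1))
    return ranges


print(split_ranges(6, [1, 4]))  # [(0, 1), (2, 4), (5, 5)]
```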

Can you display text recognition cues alongside the page images to guide reviewers?

You can connect a machine learning backend using optical character recognition tools like Tesseract to generate pre-annotations. The backend extracts structural text and sends it to the interface as bounding box proposals before the reviewer opens the task. Reviewers then verify these structural anchors to speed up their routing decisions.
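Inside such a backend, the main conversion step is turning pixel-space OCR boxes into rectangle pre-annotations. The sketch below assumes boxes arrive as dictionaries with `left`, `top`, `width`, and `height` in pixels (the column names pytesseract's `image_to_data` reports) and that the config has a RectangleLabels control named `bbox` on an image named `doc`. The `Header` label and the function name are hypothetical, and the OCR call itself is omitted.

```python
def ocr_box_to_result(box, page_index, image_width, image_height, label="Header"):
    """Convert one pixel-space OCR box into a rectangle pre-annotation."""
    return {
        "from_name": "bbox",
        "to_name": "doc",
        "type": "rectanglelabels",
        "item_index": page_index,  # zero-based page the box belongs to
        "original_width": image_width,
        "original_height": image_height,
        "value": {
            # Region geometry is expressed as percentages of the image size.
            "x": 100.0 * box["left"] / image_width,
            "y": 100.0 * box["top"] / image_height,
            "width": 100.0 * box["width"] / image_width,
            "height": 100.0 * box["height"] / image_height,
            "rotation": 0,
            "rectanglelabels": [label],
        },
    }


box = {"left": 100, "top": 50, "width": 400, "height": 80}
result = ocr_box_to_result(box, page_index=0, image_width=1000, image_height=2000)
print(result["value"])
```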
