Optical Character Recognition (OCR)

Perform optical character recognition (OCR) tasks using a variety of shapes on an image. Use this template to identify regions using shapes and transcribe the associated text for specific regions of the image.
Labeling Configuration
<View>
  <Image name="image" value="$ocr"/>
  <Labels name="label" toName="image">
    <Label value="Text" background="green"/>
    <Label value="Handwriting" background="blue"/>
  </Labels>
  <Rectangle name="bbox" toName="image" strokeWidth="3"/>
  <Polygon name="poly" toName="image" strokeWidth="3"/>
  <TextArea name="transcription" toName="image"
            editable="true"
            perRegion="true"
            required="true"
            maxSubmissions="1"
            rows="5"
            placeholder="Recognized Text"
            displayMode="region-list"
  />
</View>
About the labeling configuration
All labeling configurations must be wrapped in View tags.
Use the Image object tag to specify the image to label:
<Image name="image" value="$ocr"/>
Use the Labels control tag to specify which labels are available to apply to the different shapes added to the image:
<Labels name="label" toName="image">
  <Label value="Text" background="green"/>
  <Label value="Handwriting" background="blue"/>
</Labels>
You can change the value of each Label to assign different labels to regions in the OCR task, such as “Letters” and “Numbers”.
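For instance, a minimal sketch of the Labels block with the “Letters” and “Numbers” values mentioned above. The background colors are simply carried over from the original configuration and are an arbitrary choice here:

```xml
<Labels name="label" toName="image">
  <!-- Label values changed for illustration; colors are arbitrary -->
  <Label value="Letters" background="green"/>
  <Label value="Numbers" background="blue"/>
</Labels>
```

Because the name and toName attributes are unchanged, the rest of the configuration should not need to change.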
Use the Rectangle control tag to add unlabeled rectangles:
<Rectangle name="bbox" toName="image" strokeWidth="3"/>
Using the Rectangle tag instead of the RectangleLabels tag means that annotators can perform OCR annotation in three steps: create regions to highlight text, associate a label with each region, then transcribe the text for each region. This also makes it easier to add pre-annotations for OCR tasks.
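For contrast, here is a sketch of the alternative that this template avoids. With RectangleLabels, the annotator picks a label and draws the rectangle as a single action, so region creation and labeling cannot be separated. The label values are reused from the configuration above, and the name "label" is an assumption made for illustration:

```xml
<!-- Alternative, not used by this template: label and rectangle are created together -->
<RectangleLabels name="label" toName="image" strokeWidth="3">
  <Label value="Text" background="green"/>
  <Label value="Handwriting" background="blue"/>
</RectangleLabels>
```

A block like this would stand in for both the Labels tag and the Rectangle tag.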
Use the Polygon control tag to add unlabeled polygons:
<Polygon name="poly" toName="image" strokeWidth="3"/>
The strokeWidth argument controls the width of the line outlining the polygon.
Use the TextArea control tag to add transcripts for each region drawn on the image, whether a rectangle or polygon:
<TextArea name="transcription" toName="image"
          editable="true"
          perRegion="true"
          required="true"
          maxSubmissions="1"
          rows="5"
          placeholder="Recognized Text"
          displayMode="region-list"
/>
The editable="true" argument allows annotators to edit the text after submitting it, and displayMode="region-list" means that the text boxes appear in the region list associated with each rectangle or polygon, which makes the text easier to update. perRegion="true" means that each text box applies to a specific region, and required="true" means that annotators must add text to each text box before they can submit the annotation. maxSubmissions="1" limits each region to a single transcription, and rows="5" sets the height of the text box in lines. The placeholder argument lets you specify placeholder text that is shown to annotators before they edit the text box.
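To illustrate how these arguments combine, here is a hypothetical variant that makes the transcription optional and uses a shorter text box. The specific values (required="false", rows="2", and the placeholder wording) are assumptions chosen for illustration, not settings recommended by the template:

```xml
<TextArea name="transcription" toName="image"
          editable="true"
          perRegion="true"
          required="false"
          maxSubmissions="1"
          rows="2"
          placeholder="Type the text shown in this region"
          displayMode="region-list"
/>
```

With required="false", annotators could submit an annotation even if some regions have no transcription, which may suit tasks where some regions are illegible.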