
HTML Classification

Use this template to classify HTML content, for example for content moderation.

Labeling Configuration

<View>
  <Choices name="toxicity" toName="web_page" choice="multiple" showInline="true">
    <Choice value="Toxic" background="red"/>
    <Choice value="Severely Toxic" background="brown"/>
    <Choice value="Obscene" background="green"/>
    <Choice value="Threat" background="blue"/>
    <Choice value="Insult" background="orange"/>
    <Choice value="Hate" background="grey"/>
  </Choices>

  <View style="border: 1px solid #CCC;
               border-radius: 10px;
               padding: 5px">
    <HyperText name="web_page" value="$text"/>
  </View>
</View>

About the labeling configuration

All labeling configurations must be wrapped in View tags.

The Choices control tag specifies the options that annotators use to classify the website content.

<Choices name="toxicity" toName="web_page" choice="multiple" showInline="true">
  <Choice value="Toxic" background="red"/>
  <Choice value="Severely Toxic" background="brown"/>
  <Choice value="Obscene" background="green"/>
  <Choice value="Threat" background="blue"/>
  <Choice value="Insult" background="orange"/>
  <Choice value="Hate" background="grey"/>
</Choices>

Setting the choice parameter to "multiple" lets annotators select more than one choice, and setting the showInline parameter to "true" displays all the choices in a row. This template provides six content moderation choice values, but you can edit the Choice tags to provide different ones.
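As an illustration of that kind of modification, here is a minimal sketch of a Choices block for a simpler review workflow. The three choice values are hypothetical and not part of the original template, and choice="single" is assumed here to restrict annotators to one selection per page.

```xml
<!-- Hypothetical variant: one review decision per page instead of multiple toxicity labels -->
<Choices name="toxicity" toName="web_page" choice="single" showInline="true">
  <!-- These values are illustrative assumptions; replace them with your own categories -->
  <Choice value="Safe"/>
  <Choice value="Needs Review"/>
  <Choice value="Remove"/>
</Choices>
```

The name and toName values are kept unchanged so the block still points at the HyperText tag named web_page.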

Styling on the inner View tag adds a border around the website content so that annotators can clearly see which part of the interface is website content:

<View style="border: 1px solid #CCC;
             border-radius: 10px;
             padding: 5px">

The HyperText object tag displays the website content. The value parameter, set to "$text", reads that content from the text key of each task in Label Studio JSON format, or from data imported as plain text.

<HyperText name="web_page" value="$text"/>
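If your tasks store the markup under a different key, the variable in the value parameter needs to match that key. The following sketch assumes a hypothetical html key in the task data, and trims the choices to two for brevity:

```xml
<View>
  <Choices name="toxicity" toName="web_page" choice="multiple" showInline="true">
    <Choice value="Toxic"/>
    <Choice value="Insult"/>
  </Choices>
  <!-- "$html" reads the (hypothetical) "html" key of each task instead of "text" -->
  <HyperText name="web_page" value="$html"/>
</View>
```

The toName value on Choices still has to match the name of the HyperText tag, so only the value parameter changes.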