Machine learning backend
You can connect your favorite machine learning framework to Label Studio with the Label Studio Machine Learning SDK.
This lets you use:
- Pre-labeling: Use model predictions for pre-labeling (e.g. use on-the-fly model predictions to create rough image segmentations for further manual refinement)
- Autolabeling: Create automatic annotations
- Online Learning: Simultaneously update (retrain) your model as new annotations arrive
- Active Learning: Perform labeling in active learning mode by selecting only the most complex examples
- Prediction Service: Instantly create a running, production-ready prediction service
- Create the simplest ML backend
- Text classification with Scikit-Learn
- Transfer learning for images with PyTorch
Check examples in
Here is a quick example tutorial on how to run the ML backend with a simple text classifier:
- Clone repo
git clone https://github.com/heartexlabs/label-studio
- Setup environment
cd label-studio
pip install -e .
cd label_studio/ml/examples
pip install -r requirements.txt
- Create new ML backend
label-studio-ml init my_ml_backend --script label_studio/ml/examples/simple_text_classifier.py
- Start ML backend server
label-studio-ml start my_ml_backend
Run Label Studio connecting it to the running ML backend:
label-studio start text_classification_project --init --template text_sentiment --ml-backends http://localhost:9090
You can confirm that the model has connected properly from the
/model subpage in the Label Studio UI.
You should see model predictions in the labeling interface. For example, in an image classification task the model will
pre-select an image class for you to verify.
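To make the idea of a pre-selected class concrete, here is a minimal sketch of a single prediction, assuming the Label Studio prediction JSON layout (a result list plus a score). The helper name and the default control names "sentiment" and "text" are assumptions for illustration; they must match the names used in your labeling config.

```python
import json


def make_choice_prediction(label, score, from_name="sentiment", to_name="text"):
    """Build one classification prediction as a plain dict.

    from_name and to_name are assumed defaults; they have to match the
    control tag and object tag names in the project's labeling config.
    """
    return {
        "result": [
            {
                "from_name": from_name,
                "to_name": to_name,
                "type": "choices",
                # The label that will appear pre-selected for the annotator.
                "value": {"choices": [label]},
            }
        ],
        # Model confidence, stored as a float.
        "score": float(score),
    }


# Serialize one prediction to see the JSON that would be attached to a task.
example_json = json.dumps(make_choice_prediction("Positive", 0.87), indent=2)
```

The annotator then only has to confirm or correct the pre-selected choice.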
Model training can be triggered manually by pushing the Start Training button on the
/model page, or by using an API call:
curl -X POST http://localhost:8080/api/train
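The same call can be made from a script. The sketch below mirrors the curl command using only the Python standard library; the function names are hypothetical, and it assumes Label Studio is reachable at http://localhost:8080 as in the command above.

```python
import urllib.request


def build_train_request(base_url="http://localhost:8080"):
    """Prepare the same empty POST request that the curl command sends."""
    url = base_url.rstrip("/") + "/api/train"
    return urllib.request.Request(url, method="POST")


def trigger_training(base_url="http://localhost:8080", timeout=10):
    """Send the request and return the HTTP status code of the response.

    Raises urllib.error.URLError if the server is not reachable.
    """
    request = build_train_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling trigger_training() with no arguments targets the default local address; pass a different base_url if Label Studio runs elsewhere.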
In development mode, training logs are printed to the console. In production mode, runtime logs are available in
my_backend/logs/uwsgi.log and RQ training logs in
After running this command:
label-studio-ml init my-ml-backend --script label_studio/ml/examples/simple_text_classifier.py
you’ll see the configs needed to build and run a Docker image with docker-compose in the
my-ml-backend/ directory.
Ensure all requirements are specified in the
my-ml-backend/requirements.txt file, e.g. place
- There are no services currently running on ports 9090, 6379 (otherwise change default ports in
my-ml-backend/ directory run
The server starts listening on port 9090, and you can connect it to Label Studio by specifying
or via the UI on the Model page.
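The requirement that ports 9090 and 6379 are not already in use can be checked before starting the containers. This is a minimal sketch using the standard socket module; the helper names are hypothetical, and it only tests TCP listeners on the local machine.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection,
        # and a non-zero error code otherwise.
        return sock.connect_ex((host, port)) != 0


def busy_ports(ports=(9090, 6379), host="127.0.0.1"):
    """Return the subset of ports that already have a listener."""
    return [port for port in ports if not port_is_free(port, host)]
```

If busy_ports() returns a non-empty list, stop the conflicting services or change the default ports before starting.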
The process of creating annotated training data for supervised machine learning models is often expensive and time-consuming. Active Learning is a branch of machine learning that seeks to minimize the total amount of data required for labeling by strategically sampling observations that provide new insight into the problem. In particular, Active Learning algorithms seek to select diverse and informative data for annotation (rather than random observations) from a pool of unlabeled data using prediction scores.
Depending on the score type, you can select a sampling strategy:
- prediction-score-min (min is the best score)
- prediction-score-max (max is the best score)
Read more about active learning sampling on the task page.
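The two strategies can be illustrated with a short sketch. This is not Label Studio's internal implementation; the function name and the task layout (dicts with "id" and "score" keys) are assumptions chosen for the example.

```python
def pick_next_task(tasks, strategy="prediction-score-min"):
    """Pick the next task to label according to a score-based strategy.

    tasks is a list of dicts with hypothetical keys "id" and "score".
    Tasks without a score are ignored.
    """
    scored = [task for task in tasks if task.get("score") is not None]
    if not scored:
        return None
    if strategy == "prediction-score-min":
        # Lowest score first, e.g. when the score is a model confidence.
        return min(scored, key=lambda task: task["score"])
    if strategy == "prediction-score-max":
        # Highest score first, e.g. when the score is an uncertainty measure.
        return max(scored, key=lambda task: task["score"])
    raise ValueError(f"Unknown sampling strategy: {strategy}")


tasks = [
    {"id": 1, "score": 0.9},
    {"id": 2, "score": 0.2},
    {"id": 3, "score": None},
]
next_task = pick_next_task(tasks, "prediction-score-min")  # task with id 2
```

Which strategy fits depends on what the model reports: use the minimum when the score is a confidence, and the maximum when it is an uncertainty.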