AMA Recap: Heartex CEO Michael Malyuk Chats with BentoML’s Community About MLOps and Data Labeling


Heartex co-founder and CEO Michael Malyuk recently did an AMA over at BentoML’s Slack community to talk through a variety of questions around data labeling, MLOps, and workflows that combine the two. We’ve highlighted (and lightly edited for clarity) some of the questions and answers here.


Q: Among users of Label Studio, what does the distribution of teams look like, both in terms of size (number of people) and roles (dedicated annotators versus data scientists who are doing annotation themselves)?

A: This varies a lot given the wide adoption of Label Studio in the industry, and in general, depends on the company's maturity in terms of AI adoption. Companies that are very mature typically treat labeling as a strategic initiative and therefore build in-house annotation teams with their own managers. We've seen small startups with a few data scientists labeling data, as well as companies scaling Label Studio to support hundreds of annotators.

Q: Does Label Studio borrow design principles from qualitative annotation tools, such as MAXQDA, NVivo, or Scrivener?

A: We put a lot of thought into the design when originally developing the software because one of our main goals for the product was (and still is) to democratize labeling. We want it to be accessible for everyone, so the design should be simple. Everyone who has something to contribute should be able to label data. But the software should also be flexible enough to support a variety of use cases with varying degrees of complexity. Designing for both ends of the spectrum was quite challenging, and not much could be borrowed.

Q: I was really excited to see the recent integration between Label Studio and Hugging Face spaces. Could you share a bit more about it and what it enables users to build?

A: (Chris Hodge, Head of Community at Heartex) Our goal was to make it as easy as possible for users to get started with Label Studio. Hugging Face Spaces recently released Docker application types, which gave us a fantastic platform to do that. With one click any user is able to replicate the official Label Studio space, and get started with a full-featured deployment of open source Label Studio. With a little bit of extra configuration, you can add your own permanent database and cloud storage (by default, Hugging Face Spaces are ephemeral), and you have a full labeling platform available to share with your annotation team.

With Label Studio as an open source platform, we really believe that it pairs well with the open nature of Hugging Face, and this is just the start of how we want to build in collaboration with the HF team and community on machine learning and data labeling integrations.

The Label Studio Space is here if you want to replicate it, and we have a full tutorial here.

A: (Michael Malyuk) We're trying to minimize any friction for our users to try out the software. Hugging Face has built a great platform to help with getting up and running quickly.

Q: The quality of labeling dictates the quality of the model. How do teams and platforms ensure the quality of the annotations?

A: The process can be very complicated depending on the task you're labeling for and the type of data you're dealing with. Some common approaches are to validate annotators using ground truth annotations, distribute the same labeling task to multiple people and use the results where annotators agree in their labeling, implement a QA step to validate final results (you can subsample and validate a fraction), and programmatically find errors using statistical methods. In real-world enterprise projects, we've seen the best results when you implement all of the above.
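To make three of these approaches concrete, here is a minimal sketch in plain Python. It is an illustration, not Label Studio functionality: the function names, the two-thirds agreement threshold, and the 10% QA fraction are all assumptions chosen for the example.

```python
import random
from collections import Counter


def majority_label(labels, min_agreement=2 / 3):
    """Return the consensus label if enough annotators agree, else None.

    `min_agreement` is a hypothetical threshold; tune it per project.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_agreement else None


def accuracy_against_ground_truth(annotations, ground_truth):
    """Fraction of ground-truth tasks that one annotator labeled correctly.

    Both arguments map task id -> label. Returns None when the annotator
    has not labeled any ground-truth task yet.
    """
    scored = [task for task in ground_truth if task in annotations]
    if not scored:
        return None
    correct = sum(annotations[task] == ground_truth[task] for task in scored)
    return correct / len(scored)


def qa_sample(task_ids, fraction=0.1, seed=0):
    """Pick a reproducible random subset of finished tasks for QA review."""
    task_ids = list(task_ids)
    if not task_ids:
        return []
    k = max(1, round(len(task_ids) * fraction))
    return random.Random(seed).sample(task_ids, k)


if __name__ == "__main__":
    print(majority_label(["cat", "cat", "dog"]))  # enough agreement
    print(majority_label(["cat", "dog"]))         # no consensus, send to review
    print(accuracy_against_ground_truth(
        {"t1": "cat", "t2": "dog", "t3": "cat"},
        {"t1": "cat", "t2": "cat"},
    ))
    print(len(qa_sample(range(100))))
```

Tasks for which `majority_label` returns `None` are natural candidates for an extra review step, and a low ground-truth accuracy can flag an annotator who needs more training.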

Q: Before we dive in deeper, can you tell why you started Heartex, what problems you see and want to fix, and why did you open source Label Studio?

A: I once heard that most founders start companies because they cannot find employment otherwise. Maybe there is some truth in that. But, going back to our story, we were three founders, each with experience building homegrown labeling tools. We noticed that companies and data scientists were getting blocked or stuck repeatedly because the labeling software they had was of low quality. We also recognized that the focus had shifted to data and its quality, and we believed that the market would benefit from professional software. As for open source, it is about democratization. We LOVE open source. We have used it throughout our careers, built a lot with it, and would love others to have the same opportunity that was given to us.

Q: What benefits can open-source software bring to businesses and data scientists, and how can they take advantage of these benefits?

A: LOTS! You can customize it, get inspired by the implementation, and use it for free forever. There's privacy, since your data doesn't go anywhere, and you benefit from a large community of users testing the software. If there are contributions, then everyone benefits from them too.

Q: Can you explain how Label Studio integrates with MLOps toolchains, and how it can help data scientists and machine learning engineers streamline their workflows?

A: From an integration standpoint, there are a few different parts: API, webhooks, and cloud storage/ML model integrations. A typical use case, assuming that the model has already been deployed (we're in the BentoML Slack, after all!), would be to route the data through Label Studio and have subject matter experts correct any mispredictions.
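As a rough sketch of what "routing the data through Label Studio" can look like, the snippet below wraps a model's output as a pre-annotated task that reviewers can then correct. The layout follows Label Studio's JSON format for imported predictions as best we understand it, so check it against the current documentation. The control name `sentiment`, the object name `text`, and the `model_version` string are assumptions that must match your own labeling configuration.

```python
import json


def to_preannotated_task(text, predicted_label, score, model_version="sentiment-v1"):
    """Wrap one model prediction as a task dict with a pre-annotation.

    "sentiment" and "text" are hypothetical names; they must match the
    control tag and object tag in your project's labeling configuration.
    """
    return {
        "data": {"text": text},
        "predictions": [
            {
                "model_version": model_version,
                "score": score,
                "result": [
                    {
                        "from_name": "sentiment",
                        "to_name": "text",
                        "type": "choices",
                        "value": {"choices": [predicted_label]},
                    }
                ],
            }
        ],
    }


if __name__ == "__main__":
    # In practice the label and score would come from the deployed model.
    task = to_preannotated_task("The delivery was late again.", "Negative", 0.87)
    print(json.dumps([task], indent=2))
```

The resulting JSON can be imported into a project so that subject matter experts start from the model's prediction and only fix what is wrong.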

Q: With the rise of foundation models, more people are looking into fine-tuning models. What are your thoughts on this trend from labeling’s perspective? Anything in your pipeline that would capture the use cases?

A: We look at labeling in quite a broad sense; basically, any type of human signal or feedback contributed back into the model is a form of labeling. Following that approach, you can imagine that labeling is of utmost importance both for continuing to fine-tune the model and for keeping it up to date. If you ask an LLM that was not updated in time about some facts, it may give a wrong, outdated answer. All of these improvements should be a part of the labeling operation.

Q: What are your thoughts on folks using LLMs to do data annotation? A few text annotation tools that used to rely on weak labels have launched them. With the right prompt, you can prime ChatGPT to do some annotation for you as well.

A: Totally, and I think many use cases will see efficiency improvements in annotation, especially for tasks that require only common knowledge to label. However, you will face similar limitations as when using non-LLM models, such as edge cases and specialized knowledge. Additionally, LLMs could learn incorrect facts and apply them to your labeling tasks. Therefore, you must be careful in how you set up your labeling workflow in terms of QA and verification. But again, there are many efficiency gains to be had from it.

Q: Do you think using an LLM (or perhaps a more specifically trained model) is better than nothing when it comes to labeling? Any plans to incorporate something like this in the future?

A: It's difficult to generalize, as the approach depends on the task, scale, and objective. If I'm labeling for a sentiment model, then I would rely mostly on an LLM. However, if I'm labeling complex named entities in the medical field, then I would experiment first. For instance, I might label a small fraction of the data, fine-tune the model, and then implement an active learning loop for the next iteration.
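One common way to drive such an active learning loop is uncertainty sampling: after each round of fine-tuning, send annotators the items the model is least sure about. The sketch below uses the least-confidence heuristic; it is only one possible selection strategy, and the function name and input shape are assumptions made for the example.

```python
def least_confident(probabilities, k):
    """Return the ids of the k items whose top class probability is lowest.

    `probabilities` maps item id -> list of class probabilities predicted
    by the current model. A low maximum probability means the model is
    unsure, so a human label for that item is likely to be most useful.
    """
    ranked = sorted(probabilities, key=lambda item: max(probabilities[item]))
    return ranked[:k]


if __name__ == "__main__":
    predicted = {
        "doc-a": [0.90, 0.10],
        "doc-b": [0.55, 0.45],
        "doc-c": [0.70, 0.30],
    }
    # Label these next, retrain, and repeat on the remaining pool.
    print(least_confident(predicted, k=2))
```

Each iteration then consists of labeling the selected items, fine-tuning on the enlarged labeled set, and re-scoring the remaining unlabeled pool.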

Q: How can Label Studio help companies keep up with emerging trends and technologies in the AI and machine learning spaces, and stay ahead of the curve?

A: What Label Studio enables you to do is encode knowledge into data. If you follow that process, you will get what we call a liquid asset: a dataset that can be used to build a model that has predictive power. Our customers therefore look at it as an investment that they can monetize later. I think this is a very powerful business model that is made accessible by ML.

Q: Is labeling currently a productivity bottleneck in the pipeline since it is one of the most labor-intensive steps? What can labeling platforms like Label Studio do to accelerate labeling without sacrificing accuracy? Maybe through AI?

A: Ha, exactly! Use AI to build AI! One of Label Studio's goals is to optimize the speed and accuracy of capturing human feedback and propagating it to similar items in the dataset, so that those items do not need to be labeled by hand. There are various approaches to achieving this optimization, but it is indeed a labor-intensive and sometimes expensive process that we aim to automate wherever possible.

Q: Do you see problems with biases in the annotations because they can perpetuate and amplify existing societal biases and prejudices? Everyone in the pipeline should be held accountable, but what specific things do you see labeling platforms can do?

A: Great question! I love the topic of bias. We see it all the time in labeling tasks that are more subjective. For example, labeling a car is typically straightforward (though not always), but labeling emotions in text could lead to much more disagreement between people. Labeling tasks that contain something political, racial, or gender-specific could also lead to a lot of biases. In these scenarios, you should heavily invest in your QA and training process, and do as many blindfolded checks as possible. Even the QA process has to be blindfolded, where the QA person doesn’t know whose annotations they are reviewing. Labeling platforms need to provide a way to design those workflows to minimize and capture bias early, as well as programmatically identify it where possible. Ground truth verification is always helpful as well.

Q: How do you see the data labeling and annotation industry evolving over the next 5-10 years, and what opportunities and challenges do you anticipate?

A: This is where I may be biased, but we believe that every company will, in a broad sense, become a data labeling company. People will provide feedback and signals to the models, enabling models to do the actual work. It is exciting to see this symbiosis happening, with automation at our fingertips.


A big thank you to BentoML for hosting us! And come join us in the Label Studio Slack community if you have more questions or would enjoy further discussion around data labeling and other MLOps topics.
