Bayesian Active Learning & Label Studio
Learn the advantages of Bayesian active learning (and the BaaL library) over regular active learning, featuring ServiceNow applied research scientists Frederic Branchaud-Charron and Parmida Atighehchian in conversation with Heartex CEO Michael Malyuk.
Transcript
Ben
All right, I'll be brief because I'm not going to be doing the interviewing today. We're fortunate to have our CEO, Michael Malyuk, taking the reins to interview a pair of applied research scientists working at ServiceNow, who have an exciting new library featuring a fictitious character that we'll get into later. Before I do that, just a couple of quick housekeeping items, then Michael, you can take over, and either Frederic or Parmida, whoever wants to present after I stop sharing, can pick up from there. First, if questions come up over the course of the interview, please join our Label Studio Slack community and ask them in our #webinars channel. As usual with these webinars, we try to answer the questions posted there. Also, if you haven't subscribed to our monthly newsletter, you can do that too: once a month, you'll get all the latest happenings, content releases, blogs, and videos in one simple email. One last thing: a new article on using Bayesian active learning, authored by one of our interviewees today, is now live. I hope I didn't butcher the name of the method. For those who want a deeper dive on today's topic, you can head to that URL, or find it in our Slack channel in #announcements. I'm going to stop sharing. Michael, take it away.
Michael
Thanks for the introduction. All right, we're live. I'll let Frederic and Parmida share their screens. I think I first heard about BaaL and active learning when I was looking at the work you've both been doing at Element AI. We've been following it closely and had our own developments in what active learning can bring to the table: how it improves model interaction, data labeling, and performance. I'm excited to have you both here to present your work, and I'm definitely curious to learn more about the name of your library too. So I'll let you take it from here. I'll have a lot of questions throughout the presentation; active learning is something we hear about all the time from our community and customers. A lot of Label Studio users are interested in how active learning can improve their model pipelines. So let's get started.
Frederic
Thank you for having us. We're excited to show our library to everyone watching. Something we want to clarify upfront—yes, people in the industry talk a lot about active learning. Many think it's just a way to save money: label less data, spend less on human labor, and still get a better model. But we come from a different angle. We’re focused on the interaction between the model and the human. That interaction is essential. Think of medical imaging—you wouldn’t want AI to say, “This person has a tumor” without a human validating it. That’s where the human-in-the-loop concept becomes critical. The model helps the human, and the human checks or corrects the model. If a radiologist can go through 300 scans a day instead of 100 because the model flags what to look at, everyone wins. Humans still want transparency—they don't blindly trust a model. We've seen it many times. People ask, “Why did you make that prediction?” and they can catch biases, errors, or fairness issues that the model alone wouldn’t catch. Active learning is a great way to structure that loop. It’s a strong human-AI partnership.
Michael
What you’re saying is interesting because I think most people associate active learning with annotation budget savings. But you’re pointing out how it also improves fairness and reduces bias. That’s a big shift in how we think about its purpose.
Frederic
Exactly. It’s a dynamic field. We just published a paper on this a few months ago showing how it can help with fairness, too.
Parmida
Let me build on that. Why use active learning? There’s a lot of potential. It’s not just about saving on labeling costs. Yes, you can reach the same performance with fewer labels. But it also gives annotators a more interactive experience. They start to understand how the model behaves. If I label it this way, the model does X. If I use these words, it responds differently. That feedback loop is powerful. And then there’s the side effect of creating a more balanced dataset.
Michael
Can you talk a bit more about what a balanced dataset is and why it matters?
Parmida
Of course. Most real-world datasets are imbalanced: some classes are overrepresented, others underrepresented. If you just use random sampling, you keep drawing from the dominant classes and the model barely learns the rare ones. But with uncertainty-based active learning, you surface the examples the model is unsure about, which are often from the underrepresented classes. The result is a more balanced dataset. In our experiments, some of which are in our published papers, we saw significant improvements in accuracy across all classes, including the rare ones.
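To make that selection step concrete, here is a minimal sketch of uncertainty sampling, assuming you already have softmax probabilities for an unlabeled pool. The function name and shapes are illustrative, not part of any library.

```python
import numpy as np

def top_k_uncertain(probs: np.ndarray, k: int = 100) -> np.ndarray:
    """Rank unlabeled samples by predictive entropy; probs is [n_samples, n_classes]."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:k]  # indices of the k most uncertain samples
```

Rare classes tend to land near the top of this ranking because the model is rarely sure about them, which is how the labeled set drifts toward balance over rounds.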
Michael
That makes sense. But what happens if your annotators make mistakes in those early rounds of labeling? Doesn’t that affect the model training downstream?
Parmida
That's a great question. Interestingly, we tested this. If the noise level in the annotations is low, active learning can actually tolerate that noise and still perform well. Of course, it depends on the degree. More research is needed, but it was a surprising and promising result.
Frederic
Let me add something. In our NeurIPS paper last year, we showed that regular active learning is sensitive to labeling noise. But Bayesian active learning, like the kind we implement, is much more robust to it. That’s because it uses uncertainty differently.
Parmida
Exactly. Regular active learning usually relies on model confidence to pick the next samples. But models, especially early on, are overconfident. We don’t trust that. Instead, we use Bayesian active learning, which looks at uncertainty across multiple forward passes through the model. We look for disagreement. The more the model’s predictions vary across runs, the more uncertain it is. That’s where the labeler should focus.
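As a rough illustration of those multiple forward passes, here is a sketch of MC dropout in plain PyTorch. This is not BaaL's internals (BaaL patches only the dropout layers rather than putting the whole model in train mode); the helper name is ours.

```python
import torch
import torch.nn.functional as F

def mc_predictions(model: torch.nn.Module, x: torch.Tensor,
                   n_iterations: int = 20) -> torch.Tensor:
    """Run several stochastic forward passes with dropout left active."""
    model.train()  # keeps dropout stochastic; note this also affects batchnorm
    with torch.no_grad():
        # Stack per-pass class probabilities: [n_iterations, batch, n_classes]
        return torch.stack(
            [F.softmax(model(x), dim=-1) for _ in range(n_iterations)]
        )
```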
Michael
So just to clarify: regular active learning picks points near the decision boundary based on confidence. But Bayesian active learning picks based on disagreement, which can better surface hard or ambiguous examples?
Parmida
Right. Noisy examples often have low confidence but also low uncertainty, so regular methods prioritize them incorrectly. Hard examples can sometimes look confident, even when the model is wrong. Bayesian methods do a better job filtering that out. We group samples based on confidence and uncertainty, and you can see which groups help the model learn best.
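The disagreement Parmida describes is what the BALD score measures: the entropy of the averaged prediction minus the average entropy of each pass. A sketch building on the mc_predictions helper above (our naming, not BaaL's API):

```python
import torch

def bald_score(probs: torch.Tensor) -> torch.Tensor:
    """probs: [n_iterations, batch, n_classes] stacked MC dropout outputs."""
    eps = 1e-12
    mean_p = probs.mean(dim=0)
    entropy_of_mean = -(mean_p * (mean_p + eps).log()).sum(dim=-1)       # total uncertainty
    mean_entropy = -(probs * (probs + eps).log()).sum(dim=-1).mean(dim=0)  # per-pass average
    return entropy_of_mean - mean_entropy  # high only when the passes disagree
```

A noisy example has high entropy on every pass, but the passes agree, so the two terms nearly cancel; a genuinely hard example makes the passes disagree, so the score stays high.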
Michael
What if you have too many high-uncertainty samples? How do you choose among them?
Frederic
Great question. That’s still an open research problem. But in our library, we’ve included a method called BatchBALD that tries to sample from diverse regions of the dataset rather than focusing all selections in one area. It’s more computationally expensive, but avoids redundancy.
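In baal, BatchBALD is exposed as a heuristic alongside BALD. The sketch below shows roughly how it slots in, with the caveat that constructor arguments have shifted between baal releases, so verify against the current docs; the random predictions array is just a stand-in.

```python
import numpy as np
from baal.active.heuristics import BatchBALD

# Stand-in for stacked MC predictions: [n_samples, n_classes, n_iterations].
predictions = np.random.rand(256, 10, 20).astype(np.float32)
predictions /= predictions.sum(axis=1, keepdims=True)  # normalize to probabilities

# num_samples controls BatchBALD's Monte Carlo estimate of the joint mutual
# information; the exact argument name may differ across baal versions.
heuristic = BatchBALD(num_samples=1000)
to_label = heuristic(predictions)[:16]  # a diverse batch, not 16 near-duplicates
```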
Michael
I was just thinking—you could cluster those samples and pick representatives from each cluster. But that adds complexity. And what if your model takes days to retrain?
Frederic
We’ve thought about that too. One approach is to use a smaller, faster model for the active learning loop. You don’t need to retrain your production-grade model every time. You can transfer the labels to your main model later. Especially in computer vision, this approach is common.
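A toy sketch of that proxy idea, assuming a scikit-learn model as the fast stand-in; everything here, including the synthetic pool and its hidden labels, is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] > 0).astype(int)        # toy pool with hidden "true" labels
labeled = list(range(20))            # seed annotations

for _ in range(5):                   # each round takes seconds with the proxy, not days
    proxy = LogisticRegression().fit(X[labeled], y[labeled])
    probs = proxy.predict_proba(X)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    entropy[labeled] = -np.inf       # never re-pick already-labeled points
    labeled += list(np.argsort(-entropy)[:20])  # "annotate" the top 20

# The production-grade model then trains once, at the end, on everything labeled.
```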
Michael
Smart. You’re basically using a proxy model to guide labeling faster.
Frederic
Exactly. And now, our pitch: we open-sourced BaaL, our Bayesian active learning library. The mascot is a hybrid of a spider and a cat, inspired by the god Baal in mythology. The library supports multiple acquisition functions, like BALD and entropy-based methods, and you can mix and match components: MC dropout, deep ensembles, or your own uncertainty estimation. The goal is to bridge research and production. We include the latest techniques, and we prioritize speed; our MC dropout implementation is the fastest we know of. The animation you see in our demo shows how active learning selects samples near class boundaries. That's where the model learns the most. It avoids wasteful labels and focuses on the hard, useful examples.
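Here is a minimal end-to-end sketch of that workflow using baal's documented pieces (ModelWrapper, MCDropoutModule, and the BALD heuristic). Signatures have shifted across baal releases, so treat it as an outline and check the current docs; the toy model and random pool are ours.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset
from baal.modelwrapper import ModelWrapper
from baal.bayesian.dropout import MCDropoutModule
from baal.active.heuristics import BALD

# Toy classifier with a Dropout layer for MC dropout to act on.
model = MCDropoutModule(nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(128, 10),
))
wrapper = ModelWrapper(model, nn.CrossEntropyLoss())

# Stand-in for the unlabelled pool.
pool = TensorDataset(torch.randn(256, 3, 32, 32), torch.zeros(256, dtype=torch.long))

# baal stacks the MC iterations: predictions are [n_samples, n_classes, n_iterations].
predictions = wrapper.predict_on_dataset(pool, batch_size=32, iterations=20, use_cuda=False)
to_label = BALD()(predictions)[:16]  # most informative indices ranked first
```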
Michael
That's really cool. I hadn't realized BaaL supports multiple strategies; it's not a one-size-fits-all library.
Frederic
Exactly. You can plug in your own acquisition function, model, or uncertainty estimator. You can iterate on research quickly, and we try to integrate with common ML tools like PyTorch, HuggingFace, and of course, Label Studio.
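On the "plug in your own acquisition function" point, here is a hedged sketch of a custom heuristic. baal's heuristics expose a compute_score hook on AbstractHeuristic, though the base-class constructor may differ by version; the VariationRatio class is our example, not a baal built-in.

```python
import numpy as np
from baal.active.heuristics import AbstractHeuristic

class VariationRatio(AbstractHeuristic):
    """Score = fraction of MC passes that disagree with the majority vote."""

    def __init__(self):
        super().__init__(reverse=True)  # higher score should be labeled sooner

    def compute_score(self, predictions):
        # predictions: [n_samples, n_classes, n_iterations]
        votes = predictions.argmax(axis=1)  # per-pass class vote, [n, iterations]
        modal = np.apply_along_axis(lambda v: np.bincount(v).max(), 1, votes)
        return 1.0 - modal / votes.shape[1]
```

Once defined, it drops into the same ranking call as BALD in the pipeline sketch above.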
Michael
And what’s this animation in the corner?
Frederic
It’s a toy example using CIFAR-10 (airplane, cat, truck). It shows which samples BALD selects over time. You’ll see it focusing on ambiguous regions between classes. Random sampling wouldn’t do that. Active learning refines the decision boundary more effectively.
Michael
Very cool. I also appreciate how responsive you’ve been to the community. This has been a really informative session. We’ll send you some swag and expect some in return.
Frederic
You’ll love the shirt. Even people at ServiceNow can’t get it—it’s exclusive.
Ben
All right, final housekeeping. There's a QR code on the next slide with more info, links to our Slack, and the article we mentioned earlier. Smash that like button (just kidding). Thanks again to Frederic, Parmida, and Michael. We'll be back in two weeks with another webinar.
Michael
Thanks everyone!