Introducing Label Studio 1.8.0 — Optimized for fine-tuning LLMs and Foundation Models
Hello Label Studio Users!
We're excited to announce the launch of our latest version of Label Studio, specifically designed to help create datasets for fine-tuning Large Language Models (LLMs) like ChatGPT or LLaMA. You can install or upgrade to Label Studio 1.8 here.
LLMs have been progressing at breakneck speed. OpenAI's ChatGPT and PaLM 2 (which powers Google Bard) are extremely capable, but this is just the beginning of broadly available Generative AI. Once AI becomes pervasive, the ability to quickly and effectively fine-tune models with enterprise-specific data is how organizations will gain a competitive advantage. Label Studio helps enterprises do this efficiently.
New Ranker Interface
Understanding this need, we’ve introduced a new user interface to make the training and fine-tuning process seamless. If you’re looking to develop an LLM for tasks that require subject matter expertise, or one tuned to your unique business data, Label Studio now gives you an intuitive labeling interface for fine-tuning the model by ranking its predictions and, optionally, categorizing them.
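To make the ranking setup more concrete, here is a minimal sketch of what a labeling configuration for it might look like, wrapped in Python so it can be sanity-checked before you paste it into a project. We are assuming the `List` and `Ranker` tags behave as described in the tag reference; the names `prompt`, `responses`, and `rank`, and the helper functions, are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative labeling config (an assumption, not an official template):
# a prompt, a list of candidate responses, and a Ranker control bound to the list.
RANKER_CONFIG = """
<View>
  <Text name="prompt" value="$prompt"/>
  <List name="responses" value="$responses" title="Model responses"/>
  <Ranker name="rank" toName="responses"/>
</View>
"""


def control_targets(config_xml):
    """Map each control tag's name to the object tag it points at via toName."""
    root = ET.fromstring(config_xml)
    return {
        el.get("name"): el.get("toName")
        for el in root.iter()
        if el.get("toName")
    }


def undefined_targets(config_xml):
    """Return toName values that do not match any tag name in the config."""
    root = ET.fromstring(config_xml)
    names = {el.get("name") for el in root.iter() if el.get("name")}
    targets = control_targets(config_xml).values()
    return sorted(t for t in targets if t not in names)


print(control_targets(RANKER_CONFIG))
print(undefined_targets(RANKER_CONFIG))
```

The check only confirms that the XML is well formed and that every `toName` points at a tag that exists. It does not validate the config against Label Studio itself.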
Additionally, you can classify generated responses indicating quality or relevance. Classification can also aid in error analysis by categorizing the types of mistakes or flaws made by the model. By identifying specific classes of errors, such as grammar mistakes, factual inaccuracies, coherence issues, harmlessness, bias, or sensitive data leakage, you can gain insights into the weaknesses of the model and target those areas for improvement.
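As a rough illustration of that kind of error analysis, the sketch below tallies how often each error category was selected across a set of reviewed responses. The record shape (`response_id`, `errors`) is hypothetical and does not reflect Label Studio's export format; you would adapt it to however you load your annotations.

```python
from collections import Counter

# Hypothetical review records: one entry per model response, with the
# error categories an annotator selected for it (empty list = no errors).
reviews = [
    {"response_id": 1, "errors": ["factual inaccuracy"]},
    {"response_id": 2, "errors": []},
    {"response_id": 3, "errors": ["grammar", "factual inaccuracy"]},
    {"response_id": 4, "errors": ["bias"]},
]


def error_rates(records):
    """Return the share of responses flagged with each error category."""
    if not records:
        return {}
    total = len(records)
    # set() keeps a category from being counted twice for the same response.
    counts = Counter(cat for r in records for cat in set(r["errors"]))
    return {cat: n / total for cat, n in counts.most_common()}


rates = error_rates(reviews)
for category, share in rates.items():
    print(f"{category}: {share:.0%}")
```

A breakdown like this is one simple way to decide which weakness to target first when collecting more fine-tuning data.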
Generative AI Templates
We’ve introduced an entirely new section of labeling UI templates for Generative AI use cases, which you can use to quickly set up a project, including:
- Supervised Language Model Fine-tuning
- Human Preference Collecting for RLHF
- Chatbot Model Assessment
- LLM Ranker
- Visual Ranker
[New templates available in project creation]
Core Labeling Experience
This release doesn't stop there! We've also made major updates to our core labeling interface, introducing significant changes to enhance your labeling experience.
Side panels
Our improved side panels can now be stacked in different configurations for your convenience. Those side panels include information about the list of labeled regions, region details and metadata, relations, comments, and history of changes.
They are also configurable! For instance, you can set them as one panel on the right or left, as two panels on the same side, or as two panels on different sides. With the added ability to drag and drop different panels, you can organize your workspace just the way you like it.
Annotations tab carousel
With our latest version, you'll be able to view all the available annotations on your screen simultaneously. We've included predictions in this view as Label Studio treats them as a special kind of annotation (produced by a model, not a human). Each tab now displays high-level information about the annotation, its creator, creation time, and its state, providing a comprehensive overview at a glance.
Control buttons panel
Lastly, we've relocated the Control Buttons Panel to the bottom of the screen for a more logical workflow.
More improvements
Alongside these major updates, we’re also releasing many other minor improvements, including:
- Updated the sync attribute to allow synchronization between more than two data sources in audio and video labeling. Use sync=<group-name> to specify synchronization between Audio, Video, Paragraph, and other source elements.
- Optimized request handling for pre-signed cloud storage URLs to enable shared tasks across projects and speed up performance.
- Added status and debug information to cloud storage panels.
- Updated annotation instructions to appear in a modal dialog.
- Updated the user settings screen to be more descriptive.
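For the updated sync attribute in the list above, here is a small sketch of how several source elements might share one group name, along with a helper that lists which tags ended up in which group. The group name `session`, the tag and field names, and the helper are all assumptions for illustration. Check the tag reference for the exact tags that support `sync`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative config (an assumption): three sources that should play back
# together because they share the same sync group name.
SYNC_CONFIG = """
<View>
  <Video name="video" value="$video" sync="session"/>
  <Audio name="audio" value="$audio" sync="session"/>
  <Paragraphs name="transcript" value="$transcript" sync="session"/>
</View>
"""


def sync_groups(config_xml):
    """Group tag names by the value of their sync attribute."""
    groups = defaultdict(list)
    for el in ET.fromstring(config_xml).iter():
        group = el.get("sync")
        if group:
            groups[group].append(el.tag)
    return dict(groups)


print(sync_groups(SYNC_CONFIG))
```

A quick listing like this makes it easier to spot a source that was left out of a group or given a mistyped group name.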
And we’ve also made a number of bug fixes, including:
- Fixed validation of exports from cloud storage.
- Fixed an issue where predictions from ML backends would not be displayed unless the Task view was refreshed.
- Fixed an issue where updating a duplicated annotation after the original annotation was deleted would cause a runtime error.
- Fixed an issue where resubmitting a previous annotation would update the annotation createdDate.
The full list is available on our GitHub page.
And last but not least, we want to highlight the exciting work done by Shivansh Sharma to integrate the Segment Anything Model with Label Studio for faster image detection. Watch the demo here.
Thank you!
A lot of these updates and new features were based on the feedback we’ve received from our community. We believe these improvements will significantly enhance your data labeling experience, optimizing your process to train and fine-tune your AI models. We're eager for you to try out these new features and look forward to your feedback!
As always, we are here to support you in your data labeling journey. Stay tuned for more updates from Label Studio!
Best,
The Label Studio Team