May 2023 Community News: Machine learning in Label Studio, free RLHF workshop, and more!

Newsletters May 25, 2023

📚 Get Started with Machine Learning in Label Studio

As machine learning models keep getting larger, high-quality training datasets matter more than ever. To keep things efficient, we’re seeing more annotation teams integrate ML models directly into the annotation pipeline.

We’re here to help with our latest guide: “Getting Started with Machine Learning in Label Studio.” A sequel to the popular “Zero to One With Label Studio” tutorial, this guide walks you through building machine learning integrations for sentiment analysis and helps you start creating custom workflows for your own annotation team.

🖼️ NEW Segment Anything ML Integration

Community member Shivansh Sharma published a new Label Studio machine learning integration that brings the power of Meta’s “Segment Anything” to Label Studio. Read how to quickly isolate and segment nearly any object, significantly reducing the amount of time it takes to outline and label key features in image data.

🚀 Featured Integration: OpenMMLab

The featured integration this month comes from OpenMMLab, an open-source organization focused on advancing the field of computer vision and multimedia understanding. They’ve been active contributors to the Label Studio Community for years, and we’re excited to highlight their open-source ML integration for semi-automated bounding-box labeling.

Virtual Workshop

Getting Started with RLHF

May 30 | 2PM Eastern

Data Scientist in Residence Jimmy Whitaker and the Label Studio community team will lead a technical workshop on “Getting Started with Reinforcement Learning with Human Feedback” this Tuesday, May 30th, at 2 PM ET.

Don't miss this session: you'll walk away with sample code and the knowledge to set up an RLHF loop to fine-tune generative AI models 💥

Community Shoutouts

Join our community of over 13K GitHub Stars.

Check out the recent Practical AI Podcast with Daniel Whitenack, featuring Erin Mikail Staples, about creating instruction-tuned models.

Kudos to Rodrigo Ceballo Lentini for using Label Studio in a Hackathon 🚀

Shoutout to all the friends, fans and collaborators we’ve had the joy of meeting at recent events, including but not limited to: Brian (AlphaSense), Bianca Henderson (Conda), Gautam Sisoda (State of NY), Andrea Hong (T-Rex Solutions), Johnathan Reimer (crowd.dev) and Swyx (smol.ai)!

Annotations

Making Sense of Not Making Sense. One of the most significant problems with Large Language Models (LLMs) is their propensity to convincingly hallucinate fictional answers to factual questions. A new study from MIT and Columbia University suggests that chatbots can help users think more deeply about the validity of their responses by directly asking users to question the logic behind the result. “People said that the AI system made them question their reactions more and help them think harder,” said MIT’s Valdemar Danry.

Generative AI and the Future of Work. Generative AI, ranging from Large Language Models to image generation platforms, is here and is radically changing the landscape of creative work. The use of LLMs in writing is one of the central concerns in the Writers Guild of America strike and will continue to be of concern in many industries as the use of generative AI increases. One editorial by Sarah Kessler and Ephrat Livni in the New York Times argues for optimism, contending that AI will complement rather than replace human labor.

Regulation and Algorithmic Accountability. In another signal of how quickly the AI landscape is evolving, OpenAI CEO Sam Altman testified before the U.S. Senate that he would support a variety of regulations to help make it clear when AI-generated content is being presented to users, help prevent the spread of misinformation, and generally make models safer and more reliable. While the testimony raised the question of who should be responsible for the safety and accuracy of large ML models, and how, MIT’s The Algorithm published a breakdown of legislation that’s already in the works.