Secure Label Studio
Beta documentation: Label Studio Enterprise v2.0.0 is currently in Beta. As a result, this documentation might not reflect the current functionality of the product.
Label Studio provides many ways to secure access to your data and your deployment architecture.
All application component interactions are encrypted using the TLS protocol.
Role-based access control and federated access to cloud storage using SAML are only available in Label Studio Enterprise deployments. Label Studio Enterprise is available as on-premises software that you manage, or as a Software-as-a-Service (SaaS) offering.
If you’re running the open source version in production, restrict access to the Label Studio server. Label Studio establishes secure connections to the web application by enforcing HTTPS and secured cookies. Restrict access to the server itself by opening only the required ports on the server.
Secure user access to Label Studio to protect data integrity and ensure that only authorized users can make changes.
Each user must create an account with a password of at least 8 characters, so you can track who has access to Label Studio and which actions they perform.
You can restrict signup to users who have the invitation link to the signup page, and you can reset that link at any time. See Set up user accounts for Label Studio for more information.
If you’re using Label Studio Enterprise, you can further secure user access in the following ways:
- Assign specific roles to specific user accounts to set up role-based access control. For more about the different roles and permissions in Label Studio Enterprise, see Manage access to Label Studio.
- Set up organizations, workspaces, and projects to separate projects and data across different groups of users. Users in one organization cannot see the workspaces or projects in other organizations. For more about how to use organizations, workspaces, and projects to secure access, see Organize projects in Label Studio.
Access to the REST API is restricted by user role and requires an access token that is specific to a user account. Access tokens can be reset at any time from the Label Studio UI or using the API.
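As a minimal sketch of token-based API access, the following Python builds (but does not send) an authenticated request. It assumes the `Authorization: Token <token>` header scheme and the `/api/projects` path; the host name and token value are placeholders, so check the Backend API reference for your version before relying on them.

```python
import urllib.request


def build_api_request(base_url: str, token: str, path: str = "/api/projects") -> urllib.request.Request:
    """Build a GET request that carries a user-specific access token.

    The header scheme and default path are assumptions for illustration.
    """
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Token {token}"},
        method="GET",
    )


# Placeholder host and token; nothing is sent over the network here.
request = build_api_request("https://label-studio.example.com", "YOUR_ACCESS_TOKEN")
print(request.full_url)
```

Because the token is tied to one user account, resetting it in the UI invalidates any script that still uses the old value.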
Data in Label Studio is stored in one or two places, depending on your deployment configuration.
- Project settings and configuration details are stored in a SQLite or PostgreSQL database.
- Project data and annotations can be stored in the SQLite or PostgreSQL database, or stored in a local file directory, a Redis database, or cloud storage buckets on Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure. Data stored in external storage is accessed by Label Studio using URLs, and the data is not stored in Label Studio directly.
To prevent SQL injection attacks and other data exfiltration attempts, Label Studio does not permit direct access to the SQLite or PostgreSQL databases from the app.
Instead, the app uses URIs to access the data stored in the database. These URIs can only be accessed by the Label Studio labeling interface and API because requests to retrieve data using those URIs are verified and proxied using Basic Authentication headers.
All specific object properties that are exposed with a REST API are added to an allowlist. The API endpoints can only be accessed with specific HTTP verbs and must be accessed by browser-based clients that implement a proper Cross-Origin Resource Sharing (CORS) policy. API tokens are user-specific and can be reset at any time.
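The allowlist idea can be illustrated with a short, generic sketch. This is not Label Studio's implementation; the field names, the permitted verbs, and both helper functions are hypothetical and only show the pattern of exposing named properties and rejecting unexpected HTTP verbs.

```python
# Hypothetical allowlists, chosen only for illustration.
ALLOWED_FIELDS = {"id", "title", "created_at"}
ALLOWED_METHODS = {"GET", "POST"}


def serialize(obj: dict, allowed: set = ALLOWED_FIELDS) -> dict:
    """Return only the properties that appear in the allowlist."""
    return {key: value for key, value in obj.items() if key in allowed}


def is_method_allowed(method: str) -> bool:
    """Accept a request only if its HTTP verb is explicitly permitted."""
    return method.upper() in ALLOWED_METHODS


record = {"id": 7, "title": "Sentiment", "internal_secret": "do-not-expose"}
public_view = serialize(record)
```

With an allowlist, a newly added internal property stays hidden until someone deliberately exposes it, which is the safer default compared with a denylist.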
The PostgreSQL database has SSL mode enabled and requires valid certificates.
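For a client that connects to such a database, a connection string might look like the sketch below. It uses the standard libpq parameters `sslmode`, `sslrootcert`, `sslcert`, and `sslkey`; the choice of `verify-full`, the host, and all file paths are assumptions for illustration rather than values taken from Label Studio.

```python
def build_postgres_dsn(host: str, dbname: str, user: str,
                       ca_cert: str, client_cert: str, client_key: str,
                       port: int = 5432) -> str:
    """Build a libpq keyword/value connection string that enforces TLS.

    Values containing spaces would need quoting; this sketch assumes none do.
    """
    params = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "sslmode": "verify-full",   # assumption: verify certificate and host name
        "sslrootcert": ca_cert,     # CA used to validate the server certificate
        "sslcert": client_cert,     # client certificate presented to the server
        "sslkey": client_key,       # private key for the client certificate
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


# Placeholder host and paths.
dsn = build_postgres_dsn(
    "db.example.com", "labelstudio", "ls_user",
    "/etc/ssl/ca.pem", "/etc/ssl/client.pem", "/etc/ssl/client.key",
)
```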
When using Label Studio, users don’t have direct access to cloud storage. Objects are retrieved from and stored in cloud storage buckets according to the cloud storage settings for each project.
The best way to secure access to cloud storage is to federate access with SAML:
- Set up identity and access management (IAM) policies with your SAML SSO identity provider (IdP).
- Restrict bucket access in Amazon S3 or other cloud storage providers based on the SAML-asserted roles.
- Set up Label Studio Enterprise with the same SAML SSO IdP as the cloud storage provider.
- When Label Studio Enterprise accesses cloud storage buckets on behalf of users, it uses the SAML-asserted roles to retrieve temporary access tokens that match the user permissions.
See Federate access to data in Label Studio using SAML roles.
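To make the last step concrete for Amazon S3, the sketch below exchanges a SAML assertion for temporary credentials through the AWS STS `AssumeRoleWithSAML` operation. It is one possible way to perform that exchange, not a description of how Label Studio Enterprise does it internally. The ARNs and the assertion are placeholders, the helper names are hypothetical, and the network call requires the third-party `boto3` package.

```python
def build_assume_role_request(role_arn: str, principal_arn: str,
                              saml_assertion_b64: str,
                              duration_seconds: int = 3600) -> dict:
    """Assemble parameters for the STS AssumeRoleWithSAML operation."""
    # STS accepts 900 seconds up to the role's maximum session duration (at most 12 hours).
    if not 900 <= duration_seconds <= 43200:
        raise ValueError("duration_seconds must be between 900 and 43200")
    return {
        "RoleArn": role_arn,                 # role mapped to the SAML-asserted role
        "PrincipalArn": principal_arn,       # the SAML identity provider in IAM
        "SAMLAssertion": saml_assertion_b64, # base64-encoded assertion from the IdP
        "DurationSeconds": duration_seconds,
    }


def fetch_temporary_credentials(params: dict) -> dict:
    """Call STS and return the temporary credentials (needs boto3 and network access)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    sts = boto3.client("sts")
    response = sts.assume_role_with_saml(**params)
    return response["Credentials"]


# Placeholder values only.
params = build_assume_role_request(
    "arn:aws:iam::123456789012:role/ExampleLabelingRole",
    "arn:aws:iam::123456789012:saml-provider/ExampleIdP",
    "BASE64_SAML_ASSERTION",
)
```

The returned credentials expire on their own, so a leaked token gives access only for the requested duration and only with the permissions of the asserted role.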
You can provide cloud storage authentication credentials globally for all projects in Label Studio, or use different credentials for access to different buckets on a per-project basis. Label Studio allows you to configure different cloud storage buckets for different projects, making it easier to manage access to the data. See Sync data from external storage.
Label Studio accesses the data stored in remote cloud storage using URLs, so place the data in cloud storage buckets near where your team works, rather than near where you host Label Studio.
If you use Redis as an external storage database for data and annotations, the setup supports TLS/SSL and requires the Label Studio client to be authenticated to the database with a valid certificate.
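A client connection that satisfies these requirements could be configured as in the following sketch. It assumes the third-party `redis` (redis-py) package and its `ssl`, `ssl_certfile`, `ssl_keyfile`, `ssl_ca_certs`, and `ssl_cert_reqs` options; the host, port, and file paths are placeholders, and the helper names are hypothetical.

```python
def build_redis_tls_options(host: str, client_cert: str, client_key: str,
                            ca_cert: str, port: int = 6379) -> dict:
    """Collect keyword arguments for a certificate-authenticated TLS connection."""
    return {
        "host": host,
        "port": port,
        "ssl": True,                  # encrypt the connection with TLS
        "ssl_certfile": client_cert,  # certificate that identifies this client
        "ssl_keyfile": client_key,    # private key for the client certificate
        "ssl_ca_certs": ca_cert,      # CA bundle used to verify the server
        "ssl_cert_reqs": "required",  # reject servers without a valid certificate
    }


def connect(options: dict):
    """Open the connection (needs the redis package and a reachable server)."""
    import redis  # imported lazily so the rest of the sketch runs without it

    return redis.Redis(**options)


# Placeholder host and paths.
options = build_redis_tls_options(
    "redis.example.com", "/etc/ssl/client.pem", "/etc/ssl/client.key", "/etc/ssl/ca.pem",
)
```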
Label Studio Enterprise automatically logs all user activities so that you can monitor the activities being performed in the application.