Incorporate Versioning into Your Data Process with Pachyderm

Overview

The Pachyderm integration with Label Studio makes it easier to incorporate data versioning into your labeling process. Pachyderm is a storage platform that maintains versioned data in dedicated data repositories. Using Pachyderm as a data provider in Label Studio lets annotators store labels and track changes automatically. Labelers choose when their labeled data is committed to Pachyderm, which makes the integration more responsive and simpler to work with.

Benefits

  • Full Data Source Integration: Pachyderm offers a full implementation of a Label Studio data source, with an easy-to-configure connector.
  • Data Versioning: Pachyderm keeps track of versions of the source data, allowing the user to track changes and roll back to previous versions.
  • Selective Committing: The Label Studio integration allows fine-grained control over when data is updated. Labelers can version data immediately as it is labeled, or commit in batches to control the frequency and number of version updates.
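
The difference between committing immediately and committing in batches can be illustrated with a minimal sketch. It does not use the Pachyderm or Label Studio APIs: `VersionedRepo`, `LabelSession`, and `batch_size` are hypothetical names invented for this example, and the repository is only an in-memory stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedRepo:
    """Hypothetical in-memory stand-in for a versioned data repository."""
    commits: list = field(default_factory=list)

    def commit(self, labels):
        # Store a snapshot of the labels as a new version.
        self.commits.append(list(labels))
        return len(self.commits)  # version number, starting at 1


@dataclass
class LabelSession:
    """Hypothetical labeling session that buffers labels until committed."""
    repo: VersionedRepo
    batch_size: int = 1  # 1 means every label is committed immediately
    pending: list = field(default_factory=list)

    def add_label(self, label):
        self.pending.append(label)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Commit whatever is pending as a single new version.
        if self.pending:
            self.repo.commit(self.pending)
            self.pending.clear()


# Immediate mode: one version per label.
immediate = LabelSession(VersionedRepo(), batch_size=1)
# Batched mode: one version per three labels, plus a final manual flush.
batched = LabelSession(VersionedRepo(), batch_size=3)

for label in ["cat", "dog", "cat", "bird"]:
    immediate.add_label(label)
    batched.add_label(label)
batched.flush()

print(len(immediate.repo.commits), len(batched.repo.commits))
```

With four labels, the immediate session creates a version for every label, while the batched session creates one version for the first three labels and a second when the remainder is flushed. Fewer, larger commits keep the version history shorter, at the cost of coarser rollback points.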

Related Integrations

Azure Blob Storage

Azure cloud storage for data labeling

Google Cloud Storage

Google cloud storage for data labeling

S3

AWS cloud storage for data labeling