Automatic speech recognition with Nvidia NeMo
Overview
Pair Label Studio with Nvidia NeMo, a toolkit for researchers and practitioners working on automatic speech recognition (ASR), text-to-speech synthesis (TTS), large language models (LLMs), and natural language processing (NLP).
With the community-created integration, you can generate audio pre-annotations and automatic transcriptions for a selected speech region in Label Studio.
Benefits
Using NeMo for pre-annotation in Label Studio has the following benefits:
- Reliability: NeMo can provide highly accurate speech-to-text transcriptions.
- Speed: Using NeMo for transcriptions speeds up the labeling process by offloading most transcription tasks to an ML model, freeing annotators to focus on more difficult transcriptions.