Audio Transcription
- The extraction graph creates an endpoint that accepts audio files and transcribes them using OpenAI's Whisper model.
- You can continuously transcribe audio with this pipeline by uploading audio files to the Indexify server.
- You can run thousands of extractor instances in parallel to transcribe audio in a fault-tolerant manner.
Code Reference
graph.yaml
- Contains the extraction graph.
setup_graph.py
- Sets up the extraction graph in Indexify Server.
upload_and_retrieve.py
- Uploads audio into the extraction graph, waits for extraction, and finally retrieves from the endpoint.
Download & Start Indexify Server
Download & Join Indexify Extractors
Terminal 2
virtualenv ve
source ve/bin/activate
pip install indexify-extractor-sdk
indexify-extractor download tensorlake/whisper-asr
indexify-extractor join-server
Setup the Graph
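As a rough sketch of what this step might look like in setup_graph.py: the snippet below builds a one-policy graph spec and registers it with the server. It assumes the `indexify` Python client exposes `IndexifyClient`, `ExtractionGraph.from_yaml`, and `create_extraction_graph`, which may differ between client versions. The graph name `asr_graph` and policy name `audio_transcriber` are hypothetical placeholders, not values taken from the original graph.yaml.

```python
def build_graph_spec(graph_name="asr_graph",
                     policy_name="audio_transcriber",
                     extractor="tensorlake/whisper-asr"):
    """Return a YAML extraction-graph spec with a single transcription policy.

    The graph and policy names are illustrative placeholders.
    """
    return (
        f"name: '{graph_name}'\n"
        "extraction_policies:\n"
        f"  - extractor: '{extractor}'\n"
        f"    name: '{policy_name}'\n"
    )


def register_graph(spec):
    """Register the spec with a running Indexify server.

    Assumes the client API named in the text above; check your installed
    client version before relying on these calls.
    """
    # Imported lazily so the spec can be built and inspected without the client.
    from indexify import ExtractionGraph, IndexifyClient

    client = IndexifyClient()  # assumes the server's default local address
    client.create_extraction_graph(ExtractionGraph.from_yaml(spec))
```

With the server and the extractor both running, calling `register_graph(build_graph_spec())` would create the graph.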
Upload Data and Retrieve
The next step is to upload an audio file and retrieve the transcript.
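A minimal sketch of the upload, wait, and retrieve flow follows. It assumes the `indexify` Python client provides `upload_file`, `wait_for_extraction`, and `get_extracted_content` with the arguments shown, and that each extracted item carries its text in a `content` field; both are assumptions that may not hold for every client version. The graph name `asr_graph` and policy name `audio_transcriber` are hypothetical and must match whatever graph.yaml defines.

```python
def join_transcript(extracted):
    """Concatenate the text of extracted items into one transcript string.

    Assumes each item is a dict whose 'content' value is bytes or str.
    """
    parts = []
    for item in extracted:
        content = item["content"]
        if isinstance(content, bytes):
            content = content.decode("utf-8")
        parts.append(content)
    return "\n".join(parts)


def transcribe(audio_path, graph_name="asr_graph", policy_name="audio_transcriber"):
    """Upload one audio file, block until extraction finishes, return the text."""
    # Imported lazily so join_transcript can be used without the client installed.
    from indexify import IndexifyClient

    client = IndexifyClient()  # assumes the server's default local address
    content_id = client.upload_file(graph_name, audio_path)
    client.wait_for_extraction(content_id)
    extracted = client.get_extracted_content(
        content_id=content_id,
        graph_name=graph_name,
        policy_name=policy_name,
    )
    return join_transcript(extracted)
```

For example, `print(transcribe("sample.mp3"))` would print the transcript of a local file, where `sample.mp3` is a placeholder path.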