Speech to Text

API Overview

To simplify the integration of different speech-to-text (STT) transcription and translation models, OneRouter provides a unified audio API.

API Specification

Speech-to-text translation models (STT)

Translates audio into English.

curl https://audio.onerouter.pro/v1/audio/translations \
    -H "Content-Type: multipart/form-data" \
    -H "Authorization: <API_KEY>" \
    --form 'file=@/path/to/file/speech.m4a' \
    --form 'model="whisper-1"'
  • <API_KEY> is your API key, generated on the API page.

  • model is the model name, such as whisper-1. The list of available models can be found on the Model page.

  • file is the audio file object (not the file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
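
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names build_translation_request and translate are hypothetical, the per-file Content-Type of application/octet-stream is an assumption, and the Authorization header simply mirrors the value used in the curl example.

```python
import json
import os
import urllib.request
import uuid

# Endpoint taken from the curl example in this section.
TRANSLATIONS_URL = "https://audio.onerouter.pro/v1/audio/translations"


def build_translation_request(api_key, filename, audio_bytes, model="whisper-1"):
    """Build (but do not send) a multipart/form-data POST request.

    Hypothetical helper: it mirrors the two form fields of the curl example.
    """
    boundary = uuid.uuid4().hex.encode()
    crlf = b"\r\n"
    body = b"".join([
        # Text field: the model name.
        b"--" + boundary + crlf,
        b'Content-Disposition: form-data; name="model"' + crlf + crlf,
        model.encode() + crlf,
        # File field: the raw audio bytes, not the file name.
        b"--" + boundary + crlf,
        b'Content-Disposition: form-data; name="file"; filename="'
        + filename.encode() + b'"' + crlf,
        # Assumption: a generic binary content type is accepted for the file part.
        b"Content-Type: application/octet-stream" + crlf + crlf,
        audio_bytes + crlf,
        # Closing boundary.
        b"--" + boundary + b"--" + crlf,
    ])
    headers = {
        "Authorization": api_key,  # same header value as in the curl example
        "Content-Type": "multipart/form-data; boundary=" + boundary.decode(),
    }
    return urllib.request.Request(
        TRANSLATIONS_URL, data=body, headers=headers, method="POST"
    )


def translate(api_key, path, model="whisper-1"):
    """Upload the audio file at `path` and return the translated English text."""
    with open(path, "rb") as f:
        audio = f.read()
    request = build_translation_request(api_key, os.path.basename(path), audio, model)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]
```

Calling translate("<API_KEY>", "/path/to/file/speech.m4a") would perform the upload. Building the request separately from sending it makes it possible to inspect the headers and body without touching the network.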

Example response

{
  "text": "Hello, my name is Wolfgang and I come from Germany. Where are you heading today?"
}
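
A response of this shape can be read with a few lines of Python. The sketch below assumes only what the example shows, namely a JSON object with a text field. The helper name extract_text is hypothetical, and treating any body without a text field as an error is an assumption, since the error format is not documented here.

```python
import json


def extract_text(response_body):
    """Return the 'text' field from a JSON response body (str or bytes)."""
    payload = json.loads(response_body)
    # Assumption: a body without a "text" field is treated as a failure.
    if not isinstance(payload, dict) or "text" not in payload:
        raise ValueError("response has no 'text' field")
    return payload["text"]


# The example response shown above.
sample = (
    '{"text": "Hello, my name is Wolfgang and I come from Germany. '
    'Where are you heading today?"}'
)
print(extract_text(sample))
```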

Speech-to-text transcription models (STT)

Transcribes audio into the input language.

  • <API_KEY> is your API key, generated on the API page.

  • model is the model name, such as whisper-1. The list of available models can be found on the Model page.

  • file is the audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
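
Before uploading, a client may want to check the file format and pick the right endpoint. The Python sketch below is illustrative only. The /translations URL comes from the curl example on this page, while the /transcriptions URL is an assumption made by analogy and is not confirmed by this page. The helper names are hypothetical, and the check looks only at the file extension, so the server remains the final authority on what it accepts.

```python
import os

# Formats listed in the parameter description above.
SUPPORTED_FORMATS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}

BASE_URL = "https://audio.onerouter.pro/v1/audio"
ENDPOINTS = {
    # Taken from the curl example for translation.
    "translation": BASE_URL + "/translations",
    # ASSUMPTION: inferred by analogy, not confirmed by this page.
    "transcription": BASE_URL + "/transcriptions",
}


def is_supported_audio(path):
    """Return True if the file extension is one of the listed formats."""
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    return extension in SUPPORTED_FORMATS


def endpoint_for(task, path):
    """Return the URL for `task` after checking the file extension."""
    if task not in ENDPOINTS:
        raise ValueError("unknown task: " + task)
    if not is_supported_audio(path):
        raise ValueError("unsupported audio format: " + path)
    return ENDPOINTS[task]
```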

Example response
