# Audio (TTS & STT) (/docs/audio)


Text-to-Speech (TTS) [#text-to-speech-tts]

```
POST /v1/audio/speech
```

Request [#request]

```json
{
  "model": "model-id",
  "input": "Hello, welcome to the Yunxin API platform.",
  "voice": "alloy",
  "response_format": "mp3",
  "speed": 1.0
}
```

Parameters [#parameters]

| Parameter         | Type   | Required | Description                                |
| ----------------- | ------ | -------- | ------------------------------------------ |
| `model`           | string | Yes      | TTS model ID                               |
| `input`           | string | Yes      | Text to synthesize (max 4096 chars)        |
| `voice`           | string | Yes      | Voice to use                               |
| `response_format` | string | No       | `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm` |
| `speed`           | number | No       | Speed factor (0.25–4.0)                    |

Available Voices [#available-voices]

| Voice     | Description |
| --------- | ----------- |
| `alloy`   | Neutral     |
| `echo`    | Male        |
| `fable`   | British     |
| `onyx`    | Deep male   |
| `nova`    | Female      |
| `shimmer` | Soft female |

Example [#example]

```python
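from openai import OpenAI

# Assumption: the OpenAI Python SDK, with OPENAI_API_KEY and OPENAI_BASE_URL
# set to your Yunxin API key and gateway base URL.
client = OpenAI()

response = client.audio.speech.create(
    model="model-id",
    voice="nova",
    input="Welcome to Yunxin, your unified AI API gateway."
)

# Save the returned audio bytes to disk.
with open("output.mp3", "wb") as f:
    f.write(response.content)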
```
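
For callers who prefer not to use an SDK, the same call can be expressed as a plain HTTP request. This is a sketch under assumptions: `BASE_URL` is a hypothetical placeholder, and Bearer-token authentication is assumed rather than stated on this page. The function only builds the request; it does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical; replace with your gateway URL


def make_speech_request(api_key, payload, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/audio/speech request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` and calling `.read()` on the response should yield the audio bytes, which can then be written to a file as in the SDK example.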

Speech-to-Text (STT) [#speech-to-text-stt]

```
POST /v1/audio/transcriptions
```

Request [#request-1]

```python
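from openai import OpenAI

# Assumption: the OpenAI Python SDK, configured through the
# OPENAI_API_KEY and OPENAI_BASE_URL environment variables.
client = OpenAI()

# Open the audio in binary mode; the SDK uploads it as multipart form data.
with open("recording.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="model-id",
        file=audio_file,
        language="en"
    )

print(transcript.text)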
```

Parameters [#parameters-1]

| Parameter         | Type   | Required | Description                                  |
| ----------------- | ------ | -------- | -------------------------------------------- |
| `model`           | string | Yes      | STT model ID                                 |
| `file`            | file   | Yes      | Audio file (mp3, wav, m4a, etc.)             |
| `language`        | string | No       | ISO-639-1 language code                      |
| `response_format` | string | No       | `json`, `text`, `srt`, `verbose_json`, `vtt` |
| `temperature`     | number | No       | Sampling temperature                         |

Translation [#translation]

```
POST /v1/audio/translations
```

Translate audio in another language into English text:

```python
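from openai import OpenAI

# Assumption: the OpenAI Python SDK, configured through the
# OPENAI_API_KEY and OPENAI_BASE_URL environment variables.
client = OpenAI()

with open("chinese_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="model-id",
        file=audio_file
    )

print(translation.text)  # English translation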
```

Audio Models [#audio-models]

<Callout type="info">
  For the list of available audio models and their capabilities, query the [Models API](/docs/models-api) with `GET /v1/models?type=audio`.
</Callout>
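
The query mentioned above could be issued as follows. This is a sketch under assumptions: `BASE_URL` is a hypothetical placeholder, Bearer-token authentication is assumed, and the response is assumed to be JSON. Its schema is not described on this page, so the result is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical; replace with your gateway URL


def audio_models_url(base_url=BASE_URL):
    """Build the URL for GET /v1/models?type=audio."""
    return f"{base_url.rstrip('/')}/v1/models?{urlencode({'type': 'audio'})}"


def list_audio_models(api_key, base_url=BASE_URL):
    """Fetch the audio model list; performs a network call when invoked."""
    req = urllib.request.Request(
        audio_models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```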
