speech endpoint powered by TTS models, enabling the following features:

Important Notice: You must disclose to users that the voice is AI-generated, not a human voice.

| Format | Characteristics | Use Case |
|---|---|---|
| MP3 | Default format | General use |
| Opus | Low latency | Streaming and communications |
| AAC | Efficient compression | Mobile playback |
| FLAC | Lossless compression | Audio archiving |
| WAV | Uncompressed | Low-latency applications |
| PCM | Raw samples | 24 kHz, 16-bit signed |
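
Because the PCM row describes raw samples rather than a playable file, most players need a container added before they can open the audio. The sketch below wraps raw PCM bytes in a WAV header using Python's standard `wave` module. The sample rate and bit depth come from the table. The mono channel count and little-endian byte order are assumptions not stated here, and `pcm_to_wav` and `pcm_duration_seconds` are hypothetical helper names, not part of any API.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # from the table: 24 kHz
SAMPLE_WIDTH_BYTES = 2    # from the table: 16-bit signed samples
CHANNELS = 1              # assumption: mono output (not stated in the table)


def pcm_to_wav(pcm_bytes: bytes) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes.

    Assumes the samples are little-endian, which is what WAV stores.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


def pcm_duration_seconds(pcm_bytes: bytes) -> float:
    """Compute the playback length implied by the byte count."""
    bytes_per_second = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS
    return len(pcm_bytes) / bytes_per_second
```

For instance, one second of silence is `b"\x00\x00" * 24_000`. Passing it to `pcm_to_wav` yields bytes that can be written to a `.wav` file and opened in an ordinary player, provided the assumptions above match the audio actually received.
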
Note: Current voices are mainly optimized for English.