New Gemini 2.5 capabilities
Native audio output and improvements to the Live API
Today, the Live API is introducing a preview version of audio-visual input and native audio out dialogue, so you can directly build conversational experiences with a more natural and expressive Gemini.
It also lets the user steer its tone, accent and style of speaking. For example, you can tell the model to use a dramatic voice when telling a story. It also supports tool use, so it can search on your behalf.
You can experiment with a set of early features, including:
- Affective Dialogue, in which the model detects emotion in the user’s voice and responds appropriately.
- Proactive Audio, in which the model ignores background conversations and knows when to respond.
- Thinking in the Live API, in which the model leverages Gemini’s thinking capabilities to support more complex tasks.
We’re also releasing new previews for text-to-speech in 2.5 Pro and 2.5 Flash. These have first-of-its-kind support for multiple speakers, enabling text-to-speech with two voices via native audio out.
Like Native Audio dialogue, text-to-speech is expressive and can capture really subtle nuances, such as whispers. It works in over 24 languages and switches seamlessly between them.
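As a rough illustration of preparing input for two-voice text-to-speech, the sketch below formats dialogue turns into a labeled transcript and checks that no more than two speakers are used. Everything here is an assumption made for illustration: `Turn`, `build_two_speaker_script`, the `Speaker: line` layout, and the optional style note are hypothetical helpers, not part of the Gemini API, and the snippet does not call any model.

```python
from dataclasses import dataclass

# Assumption: the limit mirrors the "two voices" described in the post.
MAX_SPEAKERS = 2


@dataclass(frozen=True)
class Turn:
    speaker: str  # hypothetical speaker label, e.g. "Narrator"
    text: str     # the line this speaker says


def build_two_speaker_script(turns, style_note=""):
    """Format turns as a 'Speaker: line' transcript.

    Returns (script, speakers), where speakers lists the distinct labels
    in order of first appearance. Raises ValueError if more than
    MAX_SPEAKERS distinct speakers appear.
    """
    speakers = []
    for turn in turns:
        if turn.speaker not in speakers:
            speakers.append(turn.speaker)
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(
            f"expected at most {MAX_SPEAKERS} speakers, got {len(speakers)}"
        )
    lines = [f"{turn.speaker}: {turn.text}" for turn in turns]
    if style_note:
        # Hypothetical free-text direction, e.g. asking for a whisper.
        lines.insert(0, style_note)
    return "\n".join(lines), speakers


# Example usage with made-up dialogue.
script, speakers = build_two_speaker_script(
    [
        Turn("Narrator", "The door creaked open."),
        Turn("Guest", "Is anyone there?"),
    ],
    style_note="Read the Guest's line as a whisper.",
)
```

The resulting `script` string and `speakers` list would then be passed to whichever text-to-speech client you use, with each speaker label mapped to a voice; that mapping depends on the client and is left out here.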