Conversational AI agents learn to pause, take turns, and resume speaking at the right times in this latest product update.

Here’s a question for the ages. How do you deploy conversational AI at scale to handle round-the-clock customer service when it can’t mimic the natural cadence of human language? Maybe that isn’t something that keeps you up at night, but plenty of businesses are finding it an obstacle to advancing AI beyond simple text-based conversations. ElevenLabs aims to alleviate that frustration with its Conversational AI 2.0 voice agents, which can pause at the right time and smoothly take turns conversing with humans.

A natural-sounding conversational agent for high-volume applications

In places like call centers or customer support lines, the hours (and languages) customers expect to be served in have expanded. Customers want to call when they want to call and conduct business in the language that feels most comfortable to them. Maintaining that coverage with a full team of humans is prohibitively expensive, but AI offers a lot of promise. However, previous models didn’t always know when to start or stop talking in a conversation.

The upgrade from ElevenLabs offers a much more natural conversation experience, with the elements that make a conversation feel human, such as small pauses and a sense of when to start speaking again. It employs a turn-taking model that analyzes cues, such as filler words or hesitations, in real time and is designed to eliminate the awkward pauses commonly found in previous iterations.
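
To make the turn-taking idea concrete, here is a toy sketch in Python. It is not ElevenLabs' model, which the announcement does not describe in detail. The function name, the filler-word list, and both thresholds are assumptions chosen for illustration. The idea is simply that the agent waits longer before replying when the caller's last words suggest they are still thinking.

```python
# Toy end-of-turn heuristic. NOT ElevenLabs' turn-taking model:
# the filler list and thresholds below are illustrative assumptions.
FILLERS = {"um", "uh", "er", "hmm", "so", "and", "but"}


def should_agent_speak(
    partial_transcript: str,
    silence_ms: int,
    base_threshold_ms: int = 600,
    hesitation_threshold_ms: int = 1500,
) -> bool:
    """Return True if the agent should start talking now."""
    text = partial_transcript.strip()
    words = text.lower().split()
    if not words:
        # Nothing has been said yet, so there is no turn to take over.
        return False

    # Look at the last word, ignoring trailing punctuation.
    last_word = words[-1].strip(".,!?…")
    trailing_hesitation = (
        last_word in FILLERS or text.endswith(("...", "…", ","))
    )

    # A hesitation cue means the caller is probably mid-thought,
    # so require a longer silence before the agent responds.
    threshold = hesitation_threshold_ms if trailing_hesitation else base_threshold_ms
    return silence_ms >= threshold
```

With these assumed thresholds, 800 ms of silence after "I'd like to cancel my order." triggers a reply, while the same silence after "I'd like to, um" does not. A production system would likely learn such decisions from audio and text together instead of relying on a fixed word list.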

The platform also automates language detection, sensing the language the caller is speaking and handling multilingual conversations without manual configuration. For specialized applications, such as those in healthcare, retrieval-augmented generation (RAG) gives the agent access to external knowledge bases with minimal latency.
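
For readers unfamiliar with RAG, the following Python sketch shows the basic pattern: find the knowledge-base passage most relevant to a question, then hand it to the language model as context. This is a simplified illustration, not ElevenLabs' implementation. The sample passages are made up, and the word-overlap scoring is a stand-in for the embedding-based search that real systems typically use.

```python
import re

# Hypothetical knowledge base; these passages are invented for illustration.
KNOWLEDGE_BASE = [
    "Clinic hours are 8 am to 6 pm on weekdays.",
    "Prescription refills take two business days to process.",
    "Telehealth visits can be booked through the patient portal.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling build_prompt("How long do prescription refills take?", KNOWLEDGE_BASE) produces a prompt containing only the refill passage, which is what lets the model answer from the organization's own material instead of guessing.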

See also: The Rise of Conversational AI Applications

Managing tasks across different modalities

Customers can interact with agents through voice, text, or some mixture of both. Engineers define an agent once, and it is ready for multimodal operation. The platform also supports batch calls for outbound outreach.

With context-aware voice agents able to converse naturally and without friction, companies may be able to expand their applications beyond what was previously possible. This is an opportunity for organizations to use AI for better customer outcomes and to address the growing demand for customer service and outreach that meets customers where (and when) they are.

See ElevenLabs’ documentation here.