TTS overview


Who should read this article: Administrators, Supervisors

Use Voiso's text-to-speech to create and update IVR audio quickly and consistently. It saves time and cost compared to traditional recordings.

Introduction

Voiso text-to-speech (TTS) converts written text into natural-sounding speech for your contact center. Voiso uses Amazon Polly for consistent voice quality, multilingual options, and personalization. You can use TTS to create audio messages in your inbound call flows.

TTS replaces traditional recorded prompts in IVR menus so you can build and manage messages quickly. It reduces production time and cost, and because every prompt is synthesized by the same voice, it avoids the variations in accent or pronunciation that can occur between recordings. Messages are clear and consistent.

TTS improves efficiency and flexibility. Update a prompt by editing the text and the new audio is available immediately. No new recording is required.

In Flow Builder, TTS supports flow variables so you can insert customer-specific details such as account balances or order status without manual effort.
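As a minimal sketch of such a message, the example below uses the same double-brace placeholder style as the account balance use case later in this article. The variable name orderStatus is hypothetical; substitute a variable defined in your own flow. The comment line is an annotation for readers and can be removed before pasting the message.

```xml
<!-- Illustrative only: orderStatus is a hypothetical flow variable name -->
<speak>
Thank you for calling.
<break strength="medium"/>
Your order is currently {{orderStatus}}.
</speak>
```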

Voiso also provides full access to Amazon Polly’s multilingual support, enabling messages in multiple languages based on caller preferences.

Languages and voices

Voiso text-to-speech supports all of Amazon Polly's languages and standard voices.

Language support

The following languages are available for text-to-speech synthesis:

  • Arabic
  • Catalan (Spain)
  • Chinese (Mandarin)
  • Danish (Denmark)
  • Dutch (Netherlands)
  • English (United States)
  • Finnish (Finland)
  • French (Canada)
  • French (France)
  • German (Germany)
  • Italian (Italy)
  • Japanese (Japan)
  • Korean (South Korea)
  • Norwegian Bokmål (Norway)
  • Polish (Poland)
  • Portuguese (Portugal)
  • Spanish (Mexico)
  • Spanish (Spain)
  • Swedish (Sweden)
  • Russian (Russia)

Voice support

Refer to Voice samples for a list of the supported voices and samples of what they sound like.

Speech Synthesis Markup Language (SSML)

Voiso's text-to-speech feature supports Speech Synthesis Markup Language (SSML). SSML enables you to craft the exact sound and feel of the voice messages generated through speech synthesis.

SSML lets you include speech synthesis options such as:

  • long pauses
  • variable speech rate or pitch
  • emphasis of specific words or phrases
  • the use of phonetic pronunciation
  • breathing sounds
  • whispering
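To illustrate the options above, here is a rough sketch that uses one Amazon Polly SSML tag per option: break for a long pause, prosody for rate and pitch, emphasis for stress, phoneme for phonetic pronunciation, amazon:breath for a breathing sound, and amazon:effect for whispering. The message text and the pronunciation are made up for the example. Some of these tags are honored only by certain voice types, so check the Amazon Polly Developer Guide for the voice you select. The comment line is an annotation for readers and can be removed before pasting the message.

```xml
<!-- Illustrative sketch: message text and pronunciation are examples only -->
<speak>
Thank you for calling.
<break time="1s"/>
<prosody rate="slow" pitch="low">Please listen carefully, as our menu options have changed.</prosody>
Your call is <emphasis level="strong">very</emphasis> important to us.
<amazon:breath duration="medium" volume="x-loud"/>
Today's special is <phoneme alphabet="ipa" ph="pɪˈkɑːn">pecan</phoneme> pie.
<amazon:effect name="whispered">Available for a limited time only.</amazon:effect>
</speak>
```

Use the Play button described under Audio preview and download to hear how each tag changes the result.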

Refer to the Amazon Polly Developer Guide for a complete list of supported tags.

SSML is an XML-based markup language whose tag syntax resembles HyperText Markup Language (HTML), so if you are familiar with HTML, SSML is easy to learn and apply.

Refer to SSML syntax for some examples of how you can improve your text-to-speech voice messages by using SSML.

Flow Builder node support

The following Flow Builder nodes implement text-to-speech:

Audio preview and download

In Flow Builder nodes that support text-to-speech, you can preview and download the messages you create. This short video clip shows you how to preview messages.

You can compose a message in the Message field, then click Play to listen to the message. Change language and voice models as needed, then click Play to hear the message again.

Click Download to save an MP3 file of the spoken message to your local device.

Use Case: Account balance inquiry

The following simplified flow presents a use case that you can use as a basis for your own text-to-speech flow.

Scenario

A contact calls into a bank's contact center to inquire about their account balance.

Flow

Here is an example of a simplified call flow for a contact checking their account balance.

Flow Builder TTS Use Case Account Balance Flow

  1. Incoming Call: The contact dials the contact center's inbound number.
  2. IVR Welcome Message: The DTMF node plays a pre-defined welcome message, such as "To check your account balance, press 1. To speak to a human agent, press 2."
  3. If the contact presses 1, their phone number (ANI) is used by the HTTP Request node to query a web service and store the contact's account balance in a custom variable.
    Flow Builder TTS Use Case Account Balance HTTP Request
  4. If the balance is successfully retrieved, the Play Audio node reads a voice message to the contact that includes the custom variable containing the account balance information.
    Flow Builder TTS Use Case Account Balance Play Audio Node
Example message for the Play Audio node, written in SSML and including the custom variable:
<speak>
Hello, your account balance is <prosody rate="slow">{{customerBalance}}</prosody>
<break strength="medium"/>
Thank you for using our automated service!
</speak>
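The welcome prompt from step 2 could be written in the same style. This is only a sketch, and it assumes that the node playing the welcome message accepts SSML in its message field, which this article does not confirm. The break tag adds a pause so that the two options are heard as separate choices. The comment line is an annotation for readers and can be removed before pasting the message.

```xml
<!-- Illustrative sketch: assumes the welcome prompt can be entered as SSML -->
<speak>
To check your account balance, press 1.
<break strength="strong"/>
To speak to a human agent, press 2.
</speak>
```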