TTS overview
- Updated on 28 Aug 2024
Who should read this article: All users
Use Voiso's text-to-speech (TTS) capabilities to improve the efficiency and flexibility of creating and deploying audio messages for contacts. Text-to-speech audio messaging saves your contact center time and money when compared to traditional audio recordings.
Introduction
Voiso text-to-speech (TTS) is a service that converts written text into natural-sounding speech. Voiso uses Amazon Polly to provide consistent voice production and personalization. You can use text-to-speech to create audio messages in your inbound call flows.
Text-to-speech can replace traditional audio recordings in IVRs, so you can quickly create and manage the audio messages that your inbound callers hear.
Traditional audio messages can be time-consuming and expensive to produce, especially when hiring voice actors and reserving recording studio time. Using text-to-speech eliminates concerns about accents or poor pronunciation affecting message comprehension. All your messages are clear and precise.
Text-to-speech increases your contact center's efficiency and flexibility. You can replace or update a message without making a new audio recording: change the text, and the new message is immediately available in your IVR.
In Flow Builder, text-to-speech uses call flow variables to personalize messages without manual intervention. This is particularly useful for delivering customer-specific information, such as account balances or order statuses, which would be impractical with pre-recorded messages.
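The variable substitution described above can be pictured with a minimal Python sketch. It is an illustration only, not Voiso's implementation: the `render_message` function and the sample values are hypothetical, and the `{{name}}` placeholder syntax is taken from the sample message in the use case later in this article.

```python
import re

# Matches placeholders such as {{contactBalance}}. The double-brace syntax
# follows the sample message in this article; everything else is hypothetical.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_message(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value from `variables`."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # Fail loudly rather than reading out an unfilled placeholder.
            raise KeyError(f"No value for call flow variable '{name}'")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


message = render_message(
    "Hello, your account balance is {{contactBalance}}.",
    {"contactBalance": "1,250.40 dollars"},
)
print(message)
```

In a real flow, the platform fills in the variables at call time. The sketch only shows why one text template can serve every caller.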
Voiso's text-to-speech capability also gives you full access to Amazon Polly's multilingual support. This enables you to have messages in multiple languages based on caller preferences.
Languages and voices
Voiso text-to-speech supports all of Amazon Polly's languages and standard voices.
Language support
The following languages are available for text-to-speech synthesis:
- Arabic
- Catalan (Spain)
- Chinese (Mandarin)
- Danish (Denmark)
- Dutch (Netherlands)
- English (United States)
- Finnish (Finland)
- French (Canada)
- French (France)
- German (Germany)
- Italian (Italy)
- Japanese (Japan)
- Korean (South Korea)
- Norwegian Bokmål (Norway)
- Polish (Poland)
- Portuguese (Portugal)
- Spanish (Mexico)
- Spanish (Spain)
- Swedish (Sweden)
- Russian (Russia)
Voice support
Refer to Voice samples for a list of the supported voices and samples of what they sound like.
Speech Synthesis Markup Language (SSML)
Voiso's text-to-speech feature supports Speech Synthesis Markup Language (SSML). SSML enables you to craft the exact sound and feel of the voice messages generated through speech synthesis.
SSML lets you include speech synthesis options such as:
- long pauses
- variable speech rate or pitch
- emphasis of specific words or phrases
- the use of phonetic pronunciation
- breathing sounds
- whispering
Refer to the Amazon Polly Developer Guide for a complete list of supported tags.
SSML is an XML-based markup language whose tags resemble those of HyperText Markup Language (HTML), so if you are familiar with HTML, SSML is easy to learn and apply.
Refer to SSML syntax for some examples of how you can improve your text-to-speech voice messages by using SSML.
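Because SSML is XML-based, a message with an unbalanced or misspelled tag is not valid markup. The following Python sketch, which is not part of Voiso's tooling, assembles a short message from standard SSML tags (`break`, `emphasis`, `prosody`) and checks that it is well-formed before you paste it into a node. The function names are hypothetical. The enclosing `<speak>` element is what Amazon Polly expects for SSML input; this article does not say whether Voiso adds it for you, so treat it as an assumption.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_ssml(greeting: str, important: str, closing: str) -> str:
    """Assemble an SSML message; text is escaped so '&' or '<' cannot break the markup."""
    return (
        "<speak>"
        f"{escape(greeting)}"
        '<break time="1s"/>'  # one-second pause
        f'<emphasis level="strong">{escape(important)}</emphasis>'
        f'<prosody rate="slow">{escape(closing)}</prosody>'
        "</speak>"
    )


def is_well_formed(ssml: str) -> bool:
    """Return True if the string parses as XML (a syntax check only, not a tag check)."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


ssml = build_ssml(
    "Welcome to our support line.",
    "Calls may be recorded.",
    "Please stay on the line.",
)
print(is_well_formed(ssml))
```

The check confirms only that the tags are balanced and correctly nested. Whether a given tag is supported by the selected voice is a separate question that the Amazon Polly Developer Guide answers.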
Flow Builder node support
The following Flow Builder nodes implement text-to-speech:
Additional nodes are coming soon!
Use Case: Account balance inquiry
The following simplified flow presents a use case that you can use as a basis for your own text-to-speech flow.
Scenario
A contact calls into a bank's contact center to inquire about their account balance.
Flow
The simplified call flow proceeds as follows:
- Incoming Call: The contact dials the contact center's inbound number.
- IVR Welcome Message: The DTMF node plays a pre-defined welcome message, such as "To check your account balance, press 1. To speak to a human agent press 2".
- If the contact presses 1, their phone number (ANI) is used by the HTTP Request node to query a web service and store the contact's account balance in a custom variable.
- If the balance is successfully retrieved, the Play Audio node reads a voice message to the contact that includes the custom variable containing the account balance information.
Hello, your account balance is <prosody rate="slow">{{contactBalance}}</prosody>.
<break strength="medium"/>
Thank you for using our automated service!
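The branching in this flow can be sketched in Python to make the logic explicit. This is a hedged illustration, not how Flow Builder executes nodes: `handle_call`, `balance_message`, `fake_lookup`, the action labels, and the phone number are all hypothetical. The balance lookup stands in for the HTTP Request node, and routing a failed lookup to an agent is an assumption, because the flow above does not say what happens in that case.

```python
from typing import Callable, Optional, Tuple
from xml.sax.saxutils import escape


def balance_message(balance: str) -> str:
    """Build the message text; the balance is escaped so it cannot break the markup."""
    return (
        f'Hello, your account balance is <prosody rate="slow">{escape(balance)}</prosody>.'
        '<break strength="medium"/>'
        "Thank you for using our automated service!"
    )


def handle_call(
    ani: str,
    digit: str,
    lookup_balance: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Return (action, payload) for the digit the contact pressed."""
    if digit == "2":
        return ("agent", "")
    if digit != "1":
        return ("menu", "")  # unrecognized input: replay the menu (assumption)
    balance = lookup_balance(ani)  # stands in for the HTTP Request node
    if balance is None:
        return ("agent", "")  # assumed fallback when the lookup fails
    return ("play", balance_message(balance))


def fake_lookup(ani: str) -> Optional[str]:
    """Hypothetical stand-in for the bank's web service."""
    accounts = {"+15550100": "1,250.40 dollars"}
    return accounts.get(ani)


action, payload = handle_call("+15550100", "1", fake_lookup)
print(action)
print(payload)
```

Escaping the variable value matters whenever the looked-up text might contain characters such as `&` or `<`, which would otherwise make the SSML invalid.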