TTS overview
    • 28 Aug 2024


    Article summary

    Who should read this article: All users

    Use Voiso's text-to-speech (TTS) capabilities to improve the efficiency and flexibility of creating and deploying audio messages for contacts. Text-to-speech audio messaging saves your contact center time and money when compared to traditional audio recordings.

    Introduction

    Voiso text-to-speech (TTS) is a service that converts written text into natural-sounding speech. Voiso uses Amazon Polly to provide consistent voice production and personalization. You can use text-to-speech to create audio messages in your inbound call flows.

    Text-to-speech replaces traditional audio recordings in IVRs to enable you to rapidly develop and manage the audio messages that your inbound callers hear.

    Traditional audio messages can be time-consuming and expensive to produce, especially when hiring voice actors and reserving recording studio time. Using text-to-speech eliminates concerns about accents or poor pronunciation affecting message comprehension. All your messages are clear and precise.

    Text-to-speech increases your contact center efficiency and flexibility. You can replace or update a message without making a new audio recording: change the text, and the new message is immediately available in your IVR.

    In Flow Builder, text-to-speech uses call flow variables to personalize messages without manual intervention. This is particularly useful for delivering customer-specific information, such as account balances or order statuses, which would be impractical with pre-recorded messages.
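
    The idea of filling a message template from call flow variables can be illustrated with a minimal Python sketch. It is not Voiso's implementation: Flow Builder performs the substitution for you. The `render_message` function and the `orderStatus` variable are hypothetical names chosen for illustration; only the `{{variable}}` placeholder style comes from the use case in this article.

```python
import re

# Matches placeholders written as {{variableName}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_message(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value from `variables`.

    Unknown placeholders are left untouched so a missing variable is easy to spot.
    """
    def substitute(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical variable name and value, for illustration only.
message = render_message(
    "Hello, your order status is {{orderStatus}}.",
    {"orderStatus": "shipped"},
)
print(message)
```

    Leaving unknown placeholders in place is a design choice for this sketch only; it says nothing about how Flow Builder treats an undefined variable.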

    Voiso's text-to-speech capability also gives you full access to Amazon Polly's multilingual support. This enables you to have messages in multiple languages based on caller preferences.

    Languages and voices

    Voiso text-to-speech supports all of Amazon Polly's languages and standard voices.

    Language support

    The following languages are available for text-to-speech synthesis:

    • Arabic
    • Catalan (Spain)
    • Chinese (Mandarin)
    • Danish (Denmark)
    • Dutch (Netherlands)
    • English (United States)
    • Finnish (Finland)
    • French (Canada)
    • French (France)
    • German (Germany)
    • Italian (Italy)
    • Japanese (Japan)
    • Korean (South Korea)
    • Norwegian Bokmål (Norway)
    • Polish (Poland)
    • Portuguese (Portugal)
    • Spanish (Mexico)
    • Spanish (Spain)
    • Swedish (Sweden)
    • Russian (Russia)

    Voice support

    Refer to Voice samples for a list of the supported voices and samples of what they sound like.

    Speech Synthesis Markup Language (SSML)

    Voiso's text-to-speech feature supports Speech Synthesis Markup Language (SSML). SSML enables you to craft the exact sound and feel of the voice messages generated through speech synthesis.

    SSML lets you include speech synthesis options such as:

    • long pauses
    • variable speech rate or pitch
    • emphasis of specific words or phrases
    • the use of phonetic pronunciation
    • breathing sounds
    • whispering

    Refer to the Amazon Polly Developer Guide for a complete list of supported tags.
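
    As a rough illustration of the options listed above, the following Python sketch builds small SSML fragments using tags from Amazon Polly's SSML set (`<break>`, `<prosody>`, `<emphasis>`, `<phoneme>`, `<amazon:breath/>`, and `<amazon:effect name="whispered">`). The helper function names are hypothetical and exist only in this sketch. Whether a message must be wrapped in a `<speak>` element in Voiso is not covered here, so the fragments follow the style of the Play Audio example in this article.

```python
from xml.sax.saxutils import escape, quoteattr


def pause(seconds: int) -> str:
    """A pause of the given length, for example <break time="2s"/>."""
    return f'<break time="{seconds}s"/>'


def prosody(text: str, rate: str = "medium", pitch: str = "medium") -> str:
    """Speak `text` at the given rate and pitch."""
    return (
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}</prosody>"
    )


def emphasis(text: str, level: str = "moderate") -> str:
    """Emphasize a word or phrase (levels: strong, moderate, reduced)."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def phoneme(text: str, ipa: str) -> str:
    """Give a phonetic (IPA) pronunciation for `text`."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'


def breath() -> str:
    """Insert a breathing sound."""
    return "<amazon:breath/>"


def whisper(text: str) -> str:
    """Speak `text` in a whispered voice."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


# Combine fragments into one message.
message = (
    "Your appointment is "
    + emphasis("tomorrow", level="strong")
    + "."
    + pause(1)
    + prosody("Please arrive ten minutes early.", rate="slow")
)
print(message)
```

    Escaping the text with `escape` keeps characters such as `&` from breaking the markup when a message contains them.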

    SSML is similar to HyperText Markup Language (HTML), so if you are familiar with HTML, SSML is easy to learn and apply.

    Refer to SSML syntax for some examples of how you can improve your text-to-speech voice messages by using SSML.

    Flow Builder node support

    The following Flow Builder nodes implement text-to-speech:

    Tip

    Additional nodes are coming soon!

    Use Case: Account balance inquiry

    The following simplified flow presents a use case that you can use as a basis for your own text-to-speech flow.

    Scenario

    A contact calls into a bank's contact center to inquire about their account balance.

    Flow

    Here is an example of a simplified call flow for a contact checking their account balance.

    [Image: Flow Builder TTS use case, account balance flow]

    1. Incoming Call: The contact dials the contact center's inbound number.
    2. IVR Welcome Message: The DTMF node plays a predefined welcome message, such as "To check your account balance, press 1. To speak to a human agent, press 2."
    3. If the contact presses 1, their phone number (ANI) is used by the HTTP Request node to query a web service and store the contact's account balance in a custom variable.
      [Image: Flow Builder TTS use case, account balance HTTP Request node]
    4. If the balance is successfully retrieved, the Play Audio node reads a voice message to the contact that includes the custom variable containing the account balance information.
      [Image: Flow Builder TTS use case, account balance Play Audio node]
    Hello, your account balance is <prosody rate="slow">{{contactBalance}}</prosody>.
    <break strength="medium"/>
    Thank you for using our automated service!
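
    The decision logic of this use case can be sketched in Python as follows. This is an illustration under stated assumptions, not how Flow Builder is implemented: `fetch_balance` is a hypothetical stand-in for the web service queried by the HTTP Request node, the phone number and balance are made-up sample data, and the behavior for a failed lookup or an unexpected key press is assumed, because the flow above does not describe it.

```python
from typing import Optional
from xml.sax.saxutils import escape

# Hypothetical sample data standing in for the bank's web service.
FAKE_BALANCES = {"+15550100": "1,250.00 dollars"}


def fetch_balance(ani: str) -> Optional[str]:
    """Return the balance for the caller's number, or None if the lookup fails."""
    return FAKE_BALANCES.get(ani)


def build_balance_message(contact_balance: str) -> str:
    """Build the message that the Play Audio node reads to the contact."""
    return (
        "Hello, your account balance is "
        f'<prosody rate="slow">{escape(contact_balance)}</prosody>.\n'
        '<break strength="medium"/>\n'
        "Thank you for using our automated service!"
    )


def handle_keypress(digit: str, ani: str) -> str:
    """Choose the next action for the digit the contact pressed."""
    if digit == "1":
        balance = fetch_balance(ani)
        if balance is not None:
            return build_balance_message(balance)
        # Assumed fallback: the source flow does not cover a failed lookup.
        return "Sorry, we could not retrieve your balance."
    if digit == "2":
        return "TRANSFER_TO_AGENT"
    # Assumed behavior for any other key.
    return "REPEAT_MENU"


print(handle_keypress("1", "+15550100"))
```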
    
