Introducing APL for Audio (Beta): Build Rich Audio Experiences with New Audio Mixing Capabilities

Catherine Gao Jul 22, 2020


We are excited to announce APL for audio (beta), a set of new audio mixing capabilities that let you mix speech, sound effects, music, and other audio at runtime on all Alexa-enabled devices. These features extend the existing Alexa Presentation Language (APL) framework so you can deliver immersive audio experiences without the time, effort, and cost of pre-mixing. For example, you can build weather skills that mix sound effects based on the forecast, or games that dynamically generate sound effects based on player inputs. You do not have to enable the APL interface to get started. APL for audio is supported on all Alexa devices, including Echo devices and third-party devices such as Sonos speakers, Bose headphones, and LG TVs. Check out the technical documentation to get started today, or go to our code repo on GitHub to see some examples of APL for audio.

Create Rich Audio Experiences

APL for audio supports the creation of rich audio experiences on all Alexa devices and addresses long-standing limitations on sample rate and file type. You can use high-fidelity audio files with sample rates up to 44.1 kHz and bit rates up to 1411.20 kbps, comparable to music streaming quality. We've also expanded file type support to include .aac, .mp3, .ogg, .opus, and .wav. And you can render up to 15 files in a single response, enabling a more immersive, varied, and rich audio experience.

Joao Costa is the sound designer for The Vortex skill by Doppio Games. “Switching to APL for audio on The Vortex has allowed us to have higher quality audio by increasing the sampling rate of our files. We also have more flexibility in terms of the audio format. The new increments in terms of the consecutive audio clips mean that we can build more complex and dynamic dialog interactions. Having the possibility to play and overlap multiple sounds means we can better immerse our users,” said Costa.

Mix Audio and Speech at Runtime

APL for audio allows you to mix audio with Alexa speech and mix multiple voices with sound effects and other audio clips. You can even sync your visuals with layered sound effects and voice-overs that are programmatically generated in response to customer inputs. To create a more natural-sounding experience, you can apply filters that adjust volume, fade audio in and out, and trim clips.

Animal Rock by Creativity Inc used Voiceflow, which supports APL for audio in its voice app tool, to build the rich audio experience. In the skill, players can create custom musical mixes from four different instruments. “The ability to layer different song parts saved us the step of manually creating hundreds of different versions of each song, for more than 700 possible sets of mixes that users can hear in the game,” said Creativity Inc’s Senior Director of Marketing & Development Caitlin Gutekunst. Braden Ream, CEO of Voiceflow, also said, “the ability to layer advanced audio with our visual builder has unlocked even more ways for designers and developers to experiment and innovate using Voiceflow and Alexa.”

Listen to Audio Samples

The following audio samples were built with APL for audio.

How to Use APL for Audio

You can use the document type APLA to build audio responses from text-to-speech and audio clips using APL components.

You can use the APL authoring tool to build your documents and hear your audio responses right away.

You can use the Alexa.Presentation.APLA.RenderDocument directive so that Alexa will render the audio as part of your skill response. APL for audio works alongside existing outputSpeech and reprompt capabilities.
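As a rough illustration of how these pieces fit together, the sketch below shows a skill response that carries an APLA document in a RenderDocument directive. The token value, the document version string, and the data source contents are assumptions chosen for illustration; check the technical documentation for the values that apply to your skill.

```json
{
    "version": "1.0",
    "response": {
        "directives": [
            {
                "type": "Alexa.Presentation.APLA.RenderDocument",
                "token": "weather-audio",
                "document": {
                    "type": "APLA",
                    "version": "0.8",
                    "mainTemplate": {
                        "parameters": ["payload"],
                        "item": {
                            "type": "Speech",
                            "contentType": "SSML",
                            "content": "<speak>${payload.data.ssml}</speak>"
                        }
                    }
                },
                "datasources": {
                    "data": {
                        "ssml": "Right now it's 80 degrees and sunny."
                    }
                }
            }
        ]
    }
}
```

Here the payload parameter is bound to the datasources object, so ${payload.data.ssml} resolves to the ssml string supplied alongside the document.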

You can use any of the following APL components in an APLA document:

Audio: Plays a provided audio clip. You can use the ASK sound library URLs or specify a source URL for an audio clip you want to play.

{
    "type": "Audio",
    "source": "soundbank://soundlibrary/animals/amzn_sfx_bird_forest_short_01"
}

Speech: Turns text into speech. You can use the supported SSML tags to include Alexa speech in your skill response. Experiment with Alexa emotions and speaking styles to create more natural and intuitive voice experiences.

{
    "type": "Speech",
    "contentType":"SSML",
    "content": "<speak>${payload.data.ssml}</speak>"
}

Mixer: Plays multiple audio and speech components simultaneously.

{
    "type": "Mixer",
    "items": [
        {
            "type": "Speech",
            "content": "Right now it's 80 degrees and sunny."
        },
        {
            "type": "Audio",
            "source": "${payload.data.audio}",
            "filters": [
                {
                    "type": "Volume",
                    "amount": "30%"
                }
            ]
        }
    ]
}
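Beyond the Volume filter shown above, the fade and trim filters mentioned earlier can be chained on a single clip. The following is a minimal sketch, assuming the filter type names Trim, FadeIn, and FadeOut and their start, end, and duration properties (in milliseconds); the timing values are illustrative only.

```json
{
    "type": "Audio",
    "source": "soundbank://soundlibrary/nature/amzn_sfx_rain_01",
    "filters": [
        {
            "type": "Trim",
            "start": 0,
            "end": 3000
        },
        {
            "type": "FadeIn",
            "duration": 500
        },
        {
            "type": "FadeOut",
            "duration": 1000
        },
        {
            "type": "Volume",
            "amount": "50%"
        }
    ]
}
```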

Selector: Plays a single audio or speech component chosen from its items. By default, it plays the first item whose when condition is true. The strategy property also offers random selection strategies, so you can vary the sound effects in each interaction to keep customers engaged.

{
    "type": "Selector",
    "strategy": "normal",
    "items": [
        {
            "type": "Speech",
            "when": "${payload.data.currentTemp > 90}",
            "content": "${payload.data.forecast} Better wear shorts!"
        },
        {
            "type": "Speech",
            "content": "${payload.data.forecast}"
        }
    ]
}
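For the random selection mentioned above, a Selector might look like the following sketch. It assumes a strategy value of randomItem, which is expected to pick one of the items at random on each render; confirm the available strategy names in the technical documentation.

```json
{
    "type": "Selector",
    "strategy": "randomItem",
    "items": [
        {
            "type": "Audio",
            "source": "soundbank://soundlibrary/animals/amzn_sfx_bird_forest_short_01"
        },
        {
            "type": "Audio",
            "source": "soundbank://soundlibrary/nature/amzn_sfx_rain_01"
        }
    ]
}
```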

Sequencer: Plays audio and speech components sequentially, one after the other.

{
    "type": "Sequencer",
    "items": [
        {
            "type": "Speech",
            "content": "Right now it's 40 degrees and rainy. Can you hear it?"
        },
        {
            "type": "Audio",
            "source": "soundbank://soundlibrary/nature/amzn_sfx_rain_01"
        }
    ]
}

Silence: Adds a silent pause, with a duration given in milliseconds, between your other audio or speech components.

{
    "type": "Silence",
    "duration": 1500
}
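Silence is most useful inside a Sequencer. As a small sketch that reuses the components shown earlier, the following plays a line of speech, pauses for 1.5 seconds, and then plays a sound effect.

```json
{
    "type": "Sequencer",
    "items": [
        {
            "type": "Speech",
            "content": "Right now it's 40 degrees and rainy. Can you hear it?"
        },
        {
            "type": "Silence",
            "duration": 1500
        },
        {
            "type": "Audio",
            "source": "soundbank://soundlibrary/nature/amzn_sfx_rain_01"
        }
    ]
}
```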

Get Started Today

Read more about APL for audio (beta), check out the sample skill, and reach out to the Product Manager @austinvach on Twitter if you have any questions!
