Build Voice-Enabled Experiences
with Alexa
@AlexaDevs
Meet the Alexa Family
Meet Alexa
The cloud-based voice service that powers devices like Amazon Echo and Echo Dot
alexa.design/video
The Amazon Alexa Service
Supported by two powerful frameworks that leverage public APIs
Lives In the Cloud
Automated Speech Recognition (ASR)
Natural Language Understanding (NLU)
Always Improving
Alexa Skills Kit
(ASK)
Create Great Content
ASK is how you connect
to your consumer
Alexa Voice Service
(AVS)
Unparalleled Distribution
AVS allows your content
to be everywhere
Skills built using ASK
Tools that make it fast & easy for you to build skills
Alexa, ask Lyft for a Lyft Line to work
Alexa, tell Starbucks to start my order
Alexa has skills
Amazon.com/skills
Alexa Resources - cameras out!
bit.ly/alexaquickstart
github.com/alexa
developer.amazon.com/ask
aws.amazon.com
Remember to Check-In
• Ask the instructor for the link
• You’ll get a confirmation email with details to
earn free perks from Amazon
I. Demo
“Alexa, open space facts”
Wake word: Alexa | Starting phrase: open | Skill invocation name: space facts
II. Let’s Build
Objective: Create a skill that delivers random facts or quotes
What You Will Learn
• Voice User Interface (VUI) Design
• Intents & Utterances
• one-shot vs multi-turn interactions
• SSML/Speechcons
• AWS Lambda
• Skill Certification
Two sides to an Alexa skill
Alexa skills have two parts – a front-end and a back-end
Creating an Alexa Skill
Voice User Interface (developer.amazon.com) + Programming Logic (aws.amazon.com)
Alexa Skill Templates
github.com/alexa
Alexa Project Structure
/SpeechAssets
/IntentSchema.json
/SampleUtterances.txt
/src
/index.js
Fact Skill Template
alexa.design/fact
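The core of a fact skill is picking one entry from a list at random. The sketch below is a minimal illustration of that idea, not the template's actual code: the `FACTS` array, the `pickRandomFact` name, and the injectable `rng` parameter are all assumptions made for this example.

```javascript
// Hypothetical fact list; the real template ships its own data.
const FACTS = [
  "A year on Mercury is just 88 days long.",
  "The Sun contains more than 99 percent of the mass of the solar system.",
  "Venus rotates in the opposite direction to most other planets.",
];

// Pick one entry at random. `rng` defaults to Math.random and is
// injectable so the choice can be made deterministic in tests.
function pickRandomFact(facts, rng = Math.random) {
  const index = Math.floor(rng() * facts.length);
  return facts[index];
}

console.log("Here's your fact: " + pickRandomFact(FACTS));
```

Swapping the contents of `FACTS` is all it takes to retarget the sketch at a different topic.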
Open a New Browser Window with these three tabs:
1. developer.amazon.com/alexa
2. aws.amazon.com
3. github.com/alexa
Echosim.io
Let’s test our skill
Alexa, open space facts
Wake word: Alexa | Starting phrase: open | Skill invocation name: space facts
Starting phrases: open, begin, start, launch, ask, tell
Alexa, ask space facts for trivia
Wake word: Alexa | Starting phrase: ask | Skill invocation name: space facts | Utterance: trivia
Other utterances: tell me something, give me information, a fact, give me trivia
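To make the anatomy of a spoken request concrete, here is a toy parser that splits an already transcribed phrase into wake word, starting phrase, invocation name, and utterance. It is only a sketch: Alexa does this in the cloud with speech and language models, and the function name `parseInvocation` and the list of connecting words are assumptions made for this example.

```javascript
// Starting phrases listed on the slide.
const STARTING_PHRASES = ["open", "begin", "start", "launch", "ask", "tell"];

// Split a transcribed phrase into the parts named on the slide.
// `invocationName` must be known in advance, as it is for a registered skill.
function parseInvocation(text, invocationName, wakeWord = "alexa") {
  const words = text
    .toLowerCase()
    .replace(/[,.?!]/g, "")
    .split(/\s+/)
    .filter(Boolean);
  if (words[0] !== wakeWord) return null;

  const startingPhrase = words[1];
  if (!STARTING_PHRASES.includes(startingPhrase)) return null;

  const nameWords = invocationName.toLowerCase().split(/\s+/);
  const candidate = words.slice(2, 2 + nameWords.length).join(" ");
  if (candidate !== nameWords.join(" ")) return null;

  let rest = words.slice(2 + nameWords.length);
  // Drop an assumed connecting word such as "for" before the utterance.
  if (["for", "to", "about"].includes(rest[0])) rest = rest.slice(1);

  return {
    wakeWord,
    startingPhrase,
    invocationName: candidate,
    utterance: rest.join(" "),
  };
}

console.log(parseInvocation("Alexa, ask space facts for trivia", "space facts"));
```

A phrase with no utterance, such as "Alexa, open space facts", comes back with an empty `utterance` string, which corresponds to simply launching the skill.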
III. How it works.
Utterance to intents.
Audio
Cards
Request
Response
Speech Recognition
Automatic Speech Recognition
fȯr tē tīmz
Forty Times? (40x)
For Tea Times?
For Tee Times?
Four Tee Times?
NLU engine to the rescue
Natural Language Understanding
Sample Utterances
In order to map user
input to a behavior, we
provide training data,
for each intent.
Intent Schema (JSON)
An array of intents.
Each intent is a behavior
for your skill.
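The two front-end assets can be pictured as follows. This is a sketch under stated assumptions: the schema is written as a JavaScript object rather than the JSON file itself, the intent list is a guess at what a fact skill might declare, and `resolveIntent` is a hypothetical exact-match lookup that merely stands in for the NLU step (the real service generalizes from the samples rather than matching them literally).

```javascript
// Intent schema in the shape the slide describes: an array of intents.
const intentSchema = {
  intents: [
    { intent: "GetNewFactIntent" },
    { intent: "AMAZON.HelpIntent" },
    { intent: "AMAZON.StopIntent" },
    { intent: "AMAZON.CancelIntent" },
  ],
};

// Sample utterances: one "<IntentName> <phrase>" pair per line.
const sampleUtterances = `
GetNewFactIntent a fact
GetNewFactIntent tell me something
GetNewFactIntent give me information
GetNewFactIntent give me trivia
`;

// Toy lookup: exact match against the samples, limited to declared intents.
function resolveIntent(spoken) {
  const wanted = spoken.trim().toLowerCase();
  for (const line of sampleUtterances.trim().split("\n")) {
    const [intentName, ...phraseWords] = line.trim().split(/\s+/);
    const declared = intentSchema.intents.some((i) => i.intent === intentName);
    if (declared && phraseWords.join(" ") === wanted) return intentName;
  }
  return null;
}

console.log(resolveIntent("give me trivia"));
```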
Inputs & Outputs
User Audio in. Intents & Slots out.
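[Diagram: on the device, wake word detection, signal processing, and beam forming; audio goes to the Alexa service, which maps utterances to intents and exchanges JSON requests and responses with the skill; the response returns to the user as text to speech (SSML, streaming audio).]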
Intents & Utterances
Intents are the Connection
Intents are the Connection - JSON
Intents are the Connection - Code
Built-in Intents
A library of intents for
common actions.
Amazon provides training
data, but they can be
augmented.
AMAZON.CancelIntent
AMAZON.HelpIntent
AMAZON.StopIntent
AMAZON.NextIntent
AMAZON.NoIntent
AMAZON.RepeatIntent
AMAZON.StartOverIntent
AMAZON.ShuffleOnIntent
AMAZON.YesIntent
REQUIRED FOR
CERTIFICATION
Communicating with the endpoint
Your endpoint needs to receive and react to a JSON object
The Endpoint
Must be Internet-accessible
Adhere to ASK service interface
- JSON
Web service or AWS Lambda
Uses HTTP over SSL/TLS
- port 443
Communicating with the Endpoint
Request body:
• session: Information about the
current conversation
• request: Describes the user input
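As an illustration of the two parts named above, here is a hand-written request body and a small reader for it. The identifiers are placeholders and the object is trimmed to a few fields; `describeRequest` is a hypothetical helper, not part of any SDK.

```javascript
// Hand-written sample of a request body. All values are placeholders.
const sampleRequest = {
  version: "1.0",
  session: {
    new: true,
    sessionId: "example-session-id",
    attributes: {},
  },
  request: {
    type: "IntentRequest",
    requestId: "example-request-id",
    intent: { name: "GetNewFactIntent", slots: {} },
  },
};

// Pull out what the back end usually needs first:
// is this a new conversation, and what did the user ask for?
function describeRequest(body) {
  const { session, request } = body;
  const intentName =
    request.type === "IntentRequest" ? request.intent.name : null;
  return {
    isNewSession: session.new === true,
    requestType: request.type,
    intentName,
  };
}

console.log(describeRequest(sampleRequest));
```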
Communicating with the Endpoint
Response body:
• outputSpeech: Alexa’s response
• card: (optional) graphical response
• reprompt: (optional) reminder
• shouldEndSession: used to end or
keep session open
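The four response fields can be assembled with a small builder like the one below. It is a sketch, assuming plain-text speech and a simple card; `buildResponse` and its option names are hypothetical and chosen only for this example.

```javascript
// Build a response body from the four fields on the slide.
// `card` and `reprompt` are optional and omitted when not supplied.
function buildResponse({ speech, cardTitle, cardContent, repromptText, endSession }) {
  const response = {
    outputSpeech: { type: "PlainText", text: speech },
    shouldEndSession: endSession,
  };
  if (cardTitle) {
    response.card = {
      type: "Simple",
      title: cardTitle,
      content: cardContent || speech,
    };
  }
  if (repromptText) {
    response.reprompt = {
      outputSpeech: { type: "PlainText", text: repromptText },
    };
  }
  return { version: "1.0", response };
}

console.log(
  JSON.stringify(
    buildResponse({
      speech: "A year on Mercury is just 88 days long.",
      cardTitle: "Space Facts",
      endSession: true,
    }),
    null,
    2
  )
);
```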
Types of requests
The journey from user utterance to intents.
Alexa, open space facts
LaunchRequest
Alexa, exit
SessionEndedRequest
IntentRequest : GetNewFactIntent
Alexa, ask space facts for trivia
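The three request types suggest a simple dispatch on `request.type`. The following sketch shows one way to route them; `routeRequest` and the handler labels it returns are assumptions for illustration, not the SDK's own routing.

```javascript
// Decide which piece of skill logic should run for an incoming request body.
function routeRequest(body) {
  const { request } = body;
  switch (request.type) {
    case "LaunchRequest":
      // "Alexa, open space facts": the skill was opened without an intent.
      return { handler: "launch" };
    case "IntentRequest":
      // "Alexa, ask space facts for trivia": an intent was recognized.
      return { handler: "intent", intentName: request.intent.name };
    case "SessionEndedRequest":
      // "Alexa, exit": clean up; no speech can be returned.
      return { handler: "sessionEnded" };
    default:
      return { handler: "unknown" };
  }
}

console.log(
  routeRequest({
    request: { type: "IntentRequest", intent: { name: "GetNewFactIntent" } },
  })
);
```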
Alexa SDK: emit, ask, tell
Ask vs Tell
Tell: Present data to user, ends conversation (session).
Ask: Wait for user input, doesn’t end conversation (session).
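In terms of the response JSON, the difference between the two comes down to `shouldEndSession` and whether a reprompt is included. The helpers below are hypothetical stand-ins written for this example; they are not the SDK's API, only a sketch of the behavior the slide describes.

```javascript
// Wrap text in a plain-text speech object.
function plainText(text) {
  return { type: "PlainText", text };
}

// Tell: say something and end the session.
function tell(speech) {
  return {
    version: "1.0",
    response: { outputSpeech: plainText(speech), shouldEndSession: true },
  };
}

// Ask: say something, keep the session open, and reprompt if the user is silent.
function ask(speech, repromptText) {
  return {
    version: "1.0",
    response: {
      outputSpeech: plainText(speech),
      reprompt: { outputSpeech: plainText(repromptText) },
      shouldEndSession: false,
    },
  };
}

console.log(tell("Goodbye!").response.shouldEndSession);
console.log(ask("Want another fact?", "Say yes or no.").response.shouldEndSession);
```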
Emit – output speech/event
Speech:
Event:
A way to route behavior in your code.


Editor's Notes

  1. Hello, and welcome to Alexa Workshop.
  2. Alexa Family. Echo: Amazon launched the Amazon Echo in 2014. It is the first and best-known endpoint of the Alexa ecosystem: a hands-free speaker with far-field voice recognition, which means you can talk to it from across the room. We released Echo to let customers engage with Alexa and control their home by voice; Alexa and the Echo were built to make life easier and more enjoyable. Echo Dot: a hands-free, voice-controlled device that uses the same far-field voice recognition as Amazon Echo. Dot has a small built-in speaker, and it can also connect to your speakers over Bluetooth or with the included audio cable. The Echo and the Echo Dot are what we call far-field Alexa devices: you interact with them completely hands-free from anywhere in the room, even if that room is noisy. The difference between them is simple: Echo has a powerful built-in speaker that provides room-filling sound, while Echo Dot is smaller, contains a less powerful speaker, and works great when connected to another audio system. Both include the same 7-microphone array with advanced beam-forming and noise-cancelling technology and are otherwise functionally identical. Amazon Tap: Alexa is also available on other Amazon devices, including Tap, our portable battery-powered speaker. Other: Alexa is available on Amazon’s Fire tablets and in the Amazon shopping apps on mobile, and on Fire TV via the push-to-talk remote control that comes with it.
  3. What is Alexa? It is a cloud-based service that handles all the speech recognition, machine learning, and natural language understanding for all Alexa-enabled devices. Because it lives in the cloud, it is constantly improving and learning. The more you use it, the more it adapts to your speech patterns, vocabulary, and personal preferences. And because Alexa takes all her intelligence from the cloud, new updates and features are delivered automatically.
  4. We’re now interacting with technology in the most natural way possible – by talking. http://alexa.design/video
  5. There are so many possibilities when it comes to Alexa, and we are really excited about it. With Alexa, we are building a cloud-based voice service that’s free to all developers, companies, and hobbyists. Best of all, you don’t need a background in NLU or speech recognition to build great voice experiences for your customers. Alexa is supported by two sets of APIs and SDKs. Alexa Skills Kit (ASK) is an SDK that allows you to build custom skills that customers can voice-enable on all Amazon Alexa products. Many of our customers who build their own smart home products with Alexa also create complementary skills that can be accessed in the Skills Storefront. Alexa Voice Service (AVS) is a set of APIs and developer tools that you can use to build Alexa into your product, whether you’re in the automotive, smart home, or home audio industry.
  6. On one side we have ASK (Alexa Skills Kit), an API that allows you as a developer to add more capabilities to Alexa. When we released Alexa, she didn’t have the capability to order an Uber or a pizza from Domino’s. What we did was open up the API so these companies could build skills that create rich voice experiences for their customers. Over 12,000 skills have been published so far, and we expect to see a lot more in the future. All you have to do is ASK (what is the Alexa Skills Kit?): ASK is our SDK, our way of making the voice experience via Alexa possible. It gives you the ability to create new voice-driven capabilities for Alexa (also known as skills; think apps). You can connect existing services to Alexa in minutes with just a few lines of code. You can also build entirely new voice-powered experiences in a matter of hours, even if you know nothing about speech recognition or natural language processing.
  7. On the other side is AVS (Alexa Voice Service), a set of APIs that allow you to integrate Alexa into your devices and apps: think cars, microwaves, refrigerators, speakers, and the like. In fact, we recently released a Raspberry Pi version of Alexa using the AVS APIs. AVS: serving a platform-agnostic voice experience. It’s through the Alexa Voice Service that device makers and hardware manufacturers can incorporate an Alexa-driven voice experience into their devices. Any device that has a speaker, a microphone, and an Internet connection can integrate Alexa. Just imagine what that means: you can picture everything from a car to a microwave to a pen, and more, all enabled to deliver an experience by voice. Both ASK and AVS are completely free to use. Here’s a rule of thumb for deciding which feature set makes sense for your use case: you can add your product to Alexa through the Alexa Skills Kit (ASK), or you can add Alexa to your product with the Alexa Voice Service (AVS).
  8. Let’s switch gears now and talk a bit about skills, which are really the capabilities that Alexa has. What is a skill? Skills are how you, as a developer, make Alexa smarter. They give customers new experiences. They’re the voice-first apps for Alexa. When we launched Echo, Alexa could do the basics (weather, music, reading the news), but now you can also use skills from Lyft, Domino’s, and others. There are two kinds of skills: built-in skills (like playing music, weather forecasts, and general knowledge questions) and custom skills that you as developers can build. Building skills using the Alexa Skills Kit (ASK): the Alexa Skills Kit is a collection of self-service APIs, tools, documentation, and code samples that make it fast and easy for you to add skills to Alexa. Thousands of developers are building skills to expand Alexa’s capabilities. We launched the Alexa Skills Kit so anyone can develop skills for Alexa, at no cost. Skills are very similar to apps on your phone, except that nothing gets installed on the device. What can you do with ASK? You can connect existing services to Alexa. You can also build entirely new voice-powered experiences in a matter of hours, even if you know nothing about speech recognition or natural language processing.
  9. When Alexa launched in the US, it had dozens of capabilities, or skills; it now has thousands. You can now say “Alexa, ask Lyft for a Lyft Line to work”.
  10. Or Alexa, ask Capital One what did I spend?
  11. Or Alexa, tell Starbucks to start my order
  12. The free Amazon Alexa App is a companion to your Alexa device for setup, remote control, and enhanced features. Alexa is always ready to play your favorite music, provide weather and news updates, answer questions, create lists, and much more. You can also visit amazon.com/skills to view the complete catalog of skills.
  13. Let’s see the Fact Skill in action before we start building it. Talk to Alexa – quick demo of the fact skill. “Alexa, open space facts” You can also say – Alexa, tell me a fact, or Alexa, give me a space fact
  14. Wake word: a command that the user says to tell Alexa that they want to talk to her. Example: “Alexa, ask History Buff what happened on December seventh.” Here, “Alexa” is the wake word. Alexa users can select from a defined set of wake words. Starting phrase: open, ask, begin, play, start, talk, tell, etc. Invocation name: a name that represents the custom skill the user wants to use. Users say a skill’s invocation name to begin an interaction with a particular custom skill. For example, if the invocation name is “Daily Horoscopes”, users can say: “Alexa, ask Daily Horoscopes for the horoscope for Gemini.” You must say the name of the skill as part of the user utterance; that’s how Alexa maps it to the appropriate skill. It’s like launching a mobile app: you have to open the app to use its specific functionality.
  15. Much like the web and mobile apps, there are two pieces to building an Alexa skill.
  16. Alexa skills have two parts: configuration data in the Amazon Developer Portal (front end) and a hosted service responding to user requests (back end).
  17. Alexa skills have two parts: configuration data in the Amazon Developer Portal (front end), done at developer.amazon.com, and a hosted service responding to user requests (back end). We’ll be using AWS Lambda as our back end, so we’ll do this at aws.amazon.com.
  18. The build steps: create the VUI interaction model (front end = skill info + interaction model); create the Lambda function (your code or your hosted service back end); connect the VUI to the code; test; customize (make it your own); publish it.
  19. 1) Create the VUI interaction model (front end): skill info + interaction model. 2) Create the AWS Lambda function: your code or your hosted service back end. 3) Connect the VUI to the Lambda function. 4) Test. 5) Customize: make it your own. 6) Publish it.
  22. GitHub templates
  23. A typical Alexa project on GitHub has the following structure. /SpeechAssets provides the VUI, or front end, for the skill and is meant to go inside your skill at developer.amazon.com. /src provides the code for the skill and is meant to go into your Lambda function at aws.amazon.com.
  24. About this skill: this sample covers the basics of skill building. It delivers random facts or quotes and serves as a very simple example. You can also customize your fact skill with your favorite topic. Concepts you will learn with this skill: intents and the intent schema, sample utterances, and generating a randomized response from Alexa.
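The "randomized response" idea can be sketched in a few lines of JavaScript. This is only an illustration: the FACTS list and the pickRandomFact name are assumptions made for the example, not the sample's actual code.

```javascript
// Hypothetical fact list; swap in your own favorite topic.
const FACTS = [
  "A year on Mercury is just 88 days long.",
  "Venus rotates in the opposite direction to most other planets.",
  "The Sun contains over 99 percent of the mass in the solar system.",
];

// Returns one entry of `facts`. `random` is injectable so the choice can be tested;
// it must return a number in the range [0, 1), like Math.random.
function pickRandomFact(facts, random = Math.random) {
  const index = Math.floor(random() * facts.length);
  return facts[index];
}

console.log(pickRandomFact(FACTS));
```

Passing the random source as a parameter is a design choice for testability; a skill would normally just rely on the Math.random default.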
  25. We’ll be switching between these as we build our skill
  26. Visit echosim.io and log in with your Amazon account.
  27. As a developer, you are never asked to work with audio or raw text coming from the user. You receive a JSON object generated by the Alexa Service. This is how it works.
  28. This is a bird’s-eye view of a user interacting with a custom skill through an Echo. We will go into further detail later in this presentation, but it’s important to remember that Alexa and all of the skill’s code live in the cloud.
  29. In order to understand what a user says, we first have to turn sounds into words. This process is called speech recognition.
  30. In this example we have the phonetic spelling for three sounds. Let’s see what words these could form.
  31. Forty times? Maybe the user wants to multiply something by forty
  32. For tea times? Is the user searching for good times to have some tea?
  33. For Tee Times? Does the user want to play golf?
  34. Or does the user want to play a lot of golf?
  35. Having a Natural Language Understanding Engine on top of speech recognition allows us to go from words to meanings. We can also train this engine using utterances and slots to map user input with high accuracy.
  36. The way we train the NLU engine is by using sample utterances that we associate with an intent.
  37. Each intent defines a specific behavior your skill can perform. Like buttons on a web page, intents take user input and execute some code based on it.
  38. Let’s take a step-by-step look at how user input, in the form of spoken word (audio), is turned into a JSON object that our code can read and respond to.
  39. The first thing that has to happen is for the device to “wake up” when it hears the correct word. Once the device is awake, it streams all the audio to the Alexa Service, hosted by AWS in the cloud. Alexa devices like the Amazon Echo and the Echo Dot feature microphone arrays, which capture high-quality audio by using beamforming and canceling background noise.
  40. Once the audio reaches the Alexa Service, it is converted into a JSON object, based on the meaning of the words the user spoke. This JSON object is easy to read from any programming language and contains enough information to allow us to respond to the user’s input.
  41. Your code just has to return a properly formatted JSON object to the Alexa Service, and the service will take care of turning it into audio and routing it to the correct user and device. Your response can contain plain text, Speech Synthesis Markup Language (SSML), and references to audio files to be played.
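As a rough illustration of the SSML option, the sketch below builds an outputSpeech object of type "SSML" with a short pause before the fact. The function names are made up for this example, and the 500 ms pause is an arbitrary choice.

```javascript
// Escapes characters that are reserved in XML so plain text is safe inside SSML.
function escapeForSsml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds an outputSpeech object that uses SSML instead of plain text.
// The <break> tag inserts a half-second pause between the intro and the fact.
function buildSsmlSpeech(intro, fact) {
  const ssml =
    "<speak>" +
    escapeForSsml(intro) +
    ' <break time="500ms"/> ' +
    escapeForSsml(fact) +
    "</speak>";
  return { type: "SSML", ssml: ssml };
}

console.log(buildSsmlSpeech("Here's your fact:", "A year on Mercury is just 88 days long."));
```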
  42. Intents are the behaviors your skill can take. Sample Utterances are training data used to map user input to each behavior. The name of an intent is what connects everything together.
  43. Here we can see the intent schema and the sample utterances side by side. As you can see, the thing they have in common is the name of the intent.
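To make the shared intent name concrete, here is a small sketch of what the two pieces might contain for the fact skill. The schema layout and the utterance lines are assumptions based on the notes, not a copy of the slide; the check at the end simply verifies that every utterance starts with an intent name declared in the schema.

```javascript
// Assumed intent schema: one custom intent plus three built-in intents.
const intentSchema = {
  intents: [
    { intent: "GetNewFactIntent" },
    { intent: "AMAZON.HelpIntent" },
    { intent: "AMAZON.StopIntent" },
    { intent: "AMAZON.CancelIntent" },
  ],
};

// Assumed sample utterances: each line starts with the intent name it trains.
const sampleUtterances = [
  "GetNewFactIntent a fact",
  "GetNewFactIntent tell me a fact",
  "GetNewFactIntent give me a space fact",
];

// Returns the intent names used in utterances but missing from the schema.
function undeclaredIntents(schema, utterances) {
  const declared = new Set(schema.intents.map((entry) => entry.intent));
  const used = utterances.map((line) => line.split(" ")[0]);
  return [...new Set(used.filter((name) => !declared.has(name)))];
}

console.log(undeclaredIntents(intentSchema, sampleUtterances)); // prints []
```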
  44. Here is an example of a JSON object that would get sent to your code. Here we can see that there is an intent component that has a name, and it exactly matches what we had in our intent schema.
  45. Since we are using the alexa-sdk in our code, we define a handler for an event that matches the intent name. The intent name connects everything together: the intent schema, the training data, the JSON object, and the code.
  46. Along with custom intents, Amazon provides a series of “built-in” intents you can leverage. These intents don’t require any training data. The 3 highlighted intents are required for skill publishing.
  47. We use the term endpoint to describe your code along with where it’s hosted.
  48. You can leverage any programming language and hosting technology to build your endpoint. The only requirement is that you securely receive and send JSON in the correct format. The easiest way to host your endpoint is using AWS Lambda.
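A minimal sketch of an endpoint as a Node.js Lambda function might look like the following. It assumes the async handler style and answers every request with one fixed sentence; it does not cover request validation or any of the "securely" part mentioned above.

```javascript
// Minimal endpoint: receives the request JSON and always answers with one sentence.
const handler = async (event) => {
  // `event` is the parsed JSON request sent by the Alexa Service.
  const requestType = event.request ? event.request.type : "unknown";
  return {
    version: "1.0",
    response: {
      outputSpeech: {
        type: "PlainText",
        text: "Hello from the endpoint. I received a " + requestType + ".",
      },
      shouldEndSession: true,
    },
  };
};

// AWS Lambda calls the exported function named in the function's handler setting.
exports.handler = handler;
```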
  49. This is an example of the JSON object generated by the Alexa Service based on user input. The request body has two main components: session and request. The session object has information about the current conversation, including which user and skill made the request. The request object contains the payload, which is one of three types: LaunchRequest, IntentRequest, or SessionEndedRequest.
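The request body described above can be sketched as a plain object. All IDs and the timestamp below are placeholders invented for the example, and summarizeRequest is a hypothetical helper that just pulls out the fields a skill typically looks at.

```javascript
// Sample request body; every ID here is a placeholder, not a real value.
const sampleRequest = {
  version: "1.0",
  session: {
    new: true,
    sessionId: "SessionId.example",
    application: { applicationId: "amzn1.ask.skill.example" },
    user: { userId: "amzn1.ask.account.example" },
  },
  request: {
    type: "IntentRequest",
    requestId: "EdwRequestId.example",
    timestamp: "2017-01-01T00:00:00Z",
    locale: "en-US",
    intent: { name: "GetNewFactIntent", slots: {} },
  },
};

// Collects who is asking (session) and what is being asked (request).
function summarizeRequest(body) {
  const { session, request } = body;
  return {
    skillId: session.application.applicationId,
    userId: session.user.userId,
    isNewSession: session.new,
    requestType: request.type,
    // Only an IntentRequest carries an intent object.
    intentName: request.type === "IntentRequest" ? request.intent.name : null,
  };
}

console.log(summarizeRequest(sampleRequest));
```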
  50. This is what a JSON object generated by your endpoint should look like. It can be broken down into the following components. outputSpeech: the message users will hear, as spoken by Alexa. card: an optional graphical component that will be rendered and stored in the Alexa mobile app and at alexa.amazon.com. reprompt: an optional message to remind the user that we are waiting for input if the timeout is reached. shouldEndSession: indicates whether the service should keep the session open and wait for user input, or end it.
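The four components can be assembled by a small helper. This is a sketch under the assumption of plain-text speech and a "Simple" card; buildResponse and its option names are invented for the example.

```javascript
// Builds a response body from the four components described above.
// Only `speech` is required; the card and reprompt are added when provided.
function buildResponse({ speech, cardTitle, cardContent, repromptText, shouldEndSession = true }) {
  const response = {
    outputSpeech: { type: "PlainText", text: speech },
    shouldEndSession: shouldEndSession,
  };
  if (cardTitle && cardContent) {
    response.card = { type: "Simple", title: cardTitle, content: cardContent };
  }
  if (repromptText) {
    response.reprompt = { outputSpeech: { type: "PlainText", text: repromptText } };
  }
  return { version: "1.0", response: response };
}

// One-shot answer: speak, show a card, and end the session.
const tell = buildResponse({
  speech: "A year on Mercury is just 88 days long.",
  cardTitle: "Space Facts",
  cardContent: "A year on Mercury is just 88 days long.",
});

// Multi-turn question: keep the session open and reprompt if the user stays silent.
const ask = buildResponse({
  speech: "Would you like another fact?",
  repromptText: "Say yes for another fact, or stop to exit.",
  shouldEndSession: false,
});

console.log(JSON.stringify(tell, null, 2));
console.log(JSON.stringify(ask, null, 2));
```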
  51. There are three main types of requests. The Alexa Service will generate the appropriate type based on the user’s input.
  52. The sentence at the top is turned into a LaunchRequest. This is analogous to opening an app or website; we are just launching third-party functionality.
  53. A SessionEndedRequest is sent to your skill when the session ends, so you can do any necessary cleanup and store data.
  54. This example showcases how a single command from the user can wake up a device, launch a custom skill, and trigger functionality within it. The JSON object for this command would have the type listed as an IntentRequest, and the intent name for this example would be GetNewFactIntent.
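Without any SDK, handling the three request types comes down to a switch on request.type. The sketch below assumes a single hard-coded fact and an empty response for SessionEndedRequest; routeRequest and speak are hypothetical names.

```javascript
const FACT = "A year on Mercury is just 88 days long.";

// Wraps a sentence in a minimal plain-text response that ends the session.
function speak(text) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: text },
      shouldEndSession: true,
    },
  };
}

// Routes a request body to the right behavior based on its type.
function routeRequest(body) {
  const request = body.request;
  switch (request.type) {
    case "LaunchRequest":
      // "Alexa, open space facts": no intent, so go straight to a fact.
      return speak(FACT);
    case "IntentRequest":
      if (request.intent.name === "GetNewFactIntent") {
        return speak(FACT);
      }
      return speak("Sorry, I don't know how to help with that.");
    case "SessionEndedRequest":
      // Cleanup goes here; Alexa does not speak a reply to this request type.
      return { version: "1.0", response: {} };
    default:
      throw new Error("Unsupported request type: " + request.type);
  }
}

console.log(routeRequest({ request: { type: "IntentRequest", intent: { name: "GetNewFactIntent" } } }));
```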
  55. The alexa-sdk gives us a series of tools that make working with JSON objects a lot easier. Although it is not required for developing skills, it makes a huge difference. It is packaged as a Node.js module distributed through npm.
  56. The SDK works as an event emitter and provides easy ways for us to declare and attach handlers for events. It also allows us to quickly create responses by emitting an event such as :tell or :ask, removing the need for us to craft JSON by hand.
  57. Besides emitting an event that gets turned into a response, we can also use the emitter and handlers to control the code flow. We can emit any event and, as long as we have a handler for it, trigger any of our code’s functionality.
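The handler pattern from the last few notes can be sketched as follows. The handlers object is written in the alexa-sdk style, with keys as event names and this.emit for responses and hand-offs. To keep the sketch runnable without installing the SDK, runHandler is a hypothetical stand-in that imitates only the :tell, :ask, and event-forwarding behavior; it is not the SDK's implementation.

```javascript
// Handlers in the alexa-sdk style: each key is an event (or intent) name.
const handlers = {
  LaunchRequest: function () {
    // Control flow: hand off to another handler instead of answering here.
    this.emit("GetNewFactIntent");
  },
  GetNewFactIntent: function () {
    // ":tell" speaks and ends the session.
    this.emit(":tell", "Here's your fact: A year on Mercury is just 88 days long.");
  },
  "AMAZON.HelpIntent": function () {
    // ":ask" speaks, then keeps the session open with a reprompt.
    this.emit(":ask", "You can say tell me a space fact. What can I help you with?", "What can I help you with?");
  },
};

// Hypothetical stand-in for the SDK's emitter, for illustration only.
function runHandler(eventName) {
  let result = null;
  const context = {
    emit(name, ...args) {
      if (name === ":tell") {
        result = { speech: args[0], shouldEndSession: true };
      } else if (name === ":ask") {
        result = { speech: args[0], reprompt: args[1], shouldEndSession: false };
      } else if (handlers[name]) {
        // Any other event name is forwarded to the matching handler.
        handlers[name].call(context);
      } else {
        throw new Error("No handler registered for " + name);
      }
    },
  };
  context.emit(eventName);
  return result;
}

console.log(runHandler("LaunchRequest"));
console.log(runHandler("AMAZON.HelpIntent"));
```

In a real skill the stand-in would go away: the Lambda entry point would create the SDK handler with Alexa.handler(event, context), register the handlers object with registerHandlers, and call execute.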