Building Your Own Private Digital Assistant with Snips.AI – Part II

This is the second part in the series “Building Your Own Private Digital Assistant with Snips.AI”.

Introduction

So, after shipping ToDos and Reminders in Alice as part of its core scenarios in Yandex.Station earlier this summer, I thought it’d be fun to try out this very same experience on Snips.AI. This time, however, it will work on top of my product, Zet Universe.

To understand how scenarios fit into digital assistants, let’s take a look at how they are usually built.

Later in this post, I’ll also share the first steps toward building a scenario of your own. By the end of the note, it won’t actually do anything (yet), but you’ll be able to see it being understood by Snips.AI.

Caution

  1. Yandex Alice works with the Russian language, while Snips.AI doesn’t support Russian.
  2. Although the very basic idea is the same (two slots, “What” and “When”), due to the differences between the languages, I had to build a training set from scratch.
  3. Finally, at this point I’m not connecting this “skill”, or “app”, to anything, unlike the real product we’ve built for Alice, so it’s just an experiment.

Short Intro into Personal Intelligent Assistants User Experience

Scenarios, Search Queries, and Skills

From the end user’s perspective, every assistant can be seen as a single intelligent medium, or agent, that can do the tasks the user asks it to do. Logically, we can think of three kinds of such tasks – built-in scenarios, search queries, and 3rd-party skills.

This model allows the assistant to minimize the friction required to solve common tasks (scenarios), to give the user some sort of answer when there are no relevant scenarios or skills available (search queries), and to offer specialized solutions provided by third parties (skills).

In other words, when the assistant hears something from you, it first tries to classify it:

  • Is it a scenario?
  • Is it a search query?
  • Is it a request to go into a skill?
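This top-level routing can be pictured as a trivial dispatcher. The sketch below is purely illustrative, with made-up trigger phrases; real assistants use trained classifiers, not keyword rules.

```python
# A toy sketch of the hierarchical classification described above.
# Trigger phrases are hypothetical; real systems use trained classifiers.

def classify(utterance: str) -> str:
    """Route an utterance to a built-in scenario, a skill, or search."""
    scenario_triggers = ("remind me", "what is the weather", "set a timer")
    skill_triggers = ("talk to", "open the")

    text = utterance.lower()
    if any(t in text for t in scenario_triggers):
        return "scenario"   # handled by a built-in scenario
    if any(text.startswith(t) for t in skill_triggers):
        return "skill"      # hand the conversation over to a 3rd-party skill
    return "search"         # fall back to answering it as a search query

print(classify("Remind me to buy bread today at 7pm"))  # scenario
print(classify("Talk to my trivia game"))               # skill
print(classify("Who directed Moon?"))                   # search
```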

Hierarchical classification of user queries

The key difference between scenarios and 3rd-party skills is that all scenarios work in the same space, without a need to specify it.* In contrast, when you switch into a particular skill, you’re practically limited to whatever it can do.

For example, in Alice, when you switch to a skill, even the voice is no longer the one provided by Alice. While inside a skill, you can’t ask Alice to do something without explicitly exiting that skill and losing whatever context you’ve created during your interaction with it.

With scenarios, it’s different. While dealing with the assistant through its scenarios, you can switch from one to another as you please. Let’s look at scenarios below.

* A caveat here is that in some cases the assistant’s creator (say, Google) may build empty common scenarios for things like home automation. By default, the assistant won’t be able to actually do this kind of stuff (e.g., “OK Google, turn on the light in the kitchen”). However, when a compatible home automation system is connected to your assistant via a public interface, it will be able to actually perform these tasks.

Scenarios

With scenarios, you can say “Cortana, remind me to buy bread today at 7pm”, and after that, you can continue with “what is the weather this evening”, and Cortana will reply accordingly.

Clearly, “remind me” is one scenario, while “what is the weather” is another. The assistant can easily shift from one to the other, creating the illusion of a single AI system that works with you.

In practice, these scenarios are usually created by different teams, if not different organizations, inside a corporation. Be it Google, Microsoft, Amazon, or Yandex, the deal is the same. Reminders might be created by the respective company’s Calendar team, the music player by another one, and so on, and so forth.

But to the end user, all of these scenarios form one big set of tasks an assistant can do on the user’s request.

Obviously, there are some limitations which I’ll discuss in the later parts. For now, let’s take a look at Snips.AI and try to create a first scenario in it.

Scenarios and Skills in Snips.AI

Snips.AI as a platform might seem to be different when compared to Alice, Cortana, Alexa, Siri, and Google Assistant.

Indeed, Snips.AI currently doesn’t provide an end-user experience like the “Cortana” app for Android and iOS, or the built-in Siri on iOS. Snips.AI also doesn’t provide you with a home speaker device, like Amazon Echo, Harman Kardon Invoke, or Yandex.Station (though you can build one with your own hands).

Therefore, in Snips.AI you create your own assistant from the ground up, and there’s no distinction between built-in scenarios and third-party skills; there’s also no support for search queries.

This isn’t bad, by the way. In fact, with Snips.AI you get a unique opportunity to build your own private digital assistant the way you want. It also means that in order to get an experience comparable to the one provided by industry-grade assistants, you’ll have to invest heavily in your solution.

In Snips.AI, there are things called “apps”, or “skills”, but in practice they are more like scenarios in other assistants: you can switch from one to another without having to explicitly exit the first.

Snips Apps are made of intents (they understand the user’s intention from their voice query) and actions (they trigger the expected behavior from the detected intent).
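A minimal way to picture this intent-to-action wiring is a lookup from intent name to handler function. All names below are hypothetical, for illustration only; on a real Snips device the binding happens through the platform’s messaging layer, not a dict.

```python
# Sketch of intents mapped to actions; names are hypothetical.

def on_remind_me(slots: dict) -> str:
    """Action triggered when the RemindMe intent is detected."""
    return f"OK, I'll remind you to {slots['what']} {slots['when']}."

# One action registered per intent the app understands.
ACTIONS = {"RemindMe": on_remind_me}

def handle(intent_name: str, slots: dict) -> str:
    """Dispatch a detected intent to its action."""
    action = ACTIONS.get(intent_name)
    return action(slots) if action else "Sorry, I didn't get that."

print(handle("RemindMe", {"what": "buy bread", "when": "today at 7pm"}))
# OK, I'll remind you to buy bread today at 7pm.
```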

Let’s try to build one!

First Steps

Sign Up

First of all, you’ll need to create an account on the Snips console. For that, go to https://console.snips.ai/signup.

Hooray, you’re onboard!

Create an Assistant

On https://console.snips.ai/, click on the Create an Assistant button:

Create a New Assistant

Enter your assistant’s name and pick its language below:

Enter assistant name and pick its language

Excellent! You’ve got it ready! Now, choose your wake word (“Hey Snips”, “Jarvis”, or “Chappie”).

Actually, you can choose a custom wake word, say, “Gerty” (a friendly AI from the movie Moon), but for the sake of simplicity, pick one of the built-in wake words.

Select a wake word

Time to build your app. After clicking on the “Add an App” button, you’ll be presented with the brand new App Store of Snips apps.

Snips App Store

Right now, it’s a bit small, but, hey, big things have small beginnings, don’t they?

OK, I didn’t mean to scare you off. Click on the “Create a New App” button:

Give it a name (e.g., “ToDos and Reminders”) and a description (e.g., “It reminds me…”), and click the “Create” button.

ToDos and Reminders “App” in Snips.AI

Spectacular, you’ve made your first app. The next step is to create your first intent. In our ToDos and Reminders scenario for Alice, we had several intents:

  • Create Reminder
  • Create ToDo
  • List Reminders
  • List ToDos

We also supported contextual intents:

  • Create Reminder | Cancel
  • Create ToDo | Cancel
  • List Reminders | Show Next | Show Previous
  • List ToDos | Show Next | Show Previous

We didn’t support reminder/todo deletion in V1, but we’ve considered these intents for the upcoming iterations.

In this post, we’ll build just one such intent, “Create Reminder”. To do that, click on the “Edit App” button, and then click on the “Create New Intent” button:

Empty Snips app

Click on “Create New Intent”, and enter its name (no spaces) and description:

Now that you’ve created one, let’s think about what we actually want from our app, and what exactly we want from this intent:

The user should be able to ask our digital assistant to remind her of something, and then receive a reminder at the given time.

Important: Given that the assistant works on the device, it will remind our user only as long as it has a working power source (though we could add a battery to our Raspberry Pi so it keeps working even during a power outage).
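To make the “receive a reminder at the given time” part concrete, here is a tiny scheduling sketch. It is purely illustrative and has nothing to do with Snips internals; it only shows the idea of firing a callback once a deadline is reached.

```python
import threading
from datetime import datetime

def schedule_reminder(what: str, when: datetime, notify) -> threading.Timer:
    """Call notify(what) once the 'when' deadline is reached."""
    delay = max(0.0, (when - datetime.now()).total_seconds())
    timer = threading.Timer(delay, notify, args=(what,))
    timer.start()
    return timer

# A deadline in the past fires (almost) immediately:
schedule_reminder("buy bread", datetime.now(), print).join()  # buy bread
```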

Empty “app” with one intent, “RemindMe”

In order for our assistant to understand this intent, it should identify two key parts of it – “what” to remind about (e.g., “buy bread”) and “when” to remind (e.g., “today at 9 pm”). These parts are usually called slots, and the result of intent “understanding” is, in fact, the extraction of these two slots from the given phrase.

Let’s add two slots, “what” (text) and “when” (datetime), and mark them as required.

Two slots, “What” and “When”
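When an utterance is later recognized, the NLU produces a structured result holding the detected intent and the filled slots. The snippet below mimics the general shape of such output; the field names and values are made up for illustration and may not match the exact Snips JSON schema.

```python
# Illustrative shape of a parsed-intent result; values are made up.
parsed = {
    "input": "remind me to buy bread today at 7pm",
    "intent": {"intentName": "RemindMe", "probability": 0.87},
    "slots": [
        {"slotName": "what", "rawValue": "buy bread",
         "value": {"kind": "Custom", "value": "buy bread"}},
        {"slotName": "when", "rawValue": "today at 7pm",
         "value": {"kind": "InstantTime", "value": "2018-09-16 19:00:00"}},
    ],
}

def slot_value(result: dict, name: str):
    """Pull a slot's resolved value out of a parsed result by slot name."""
    for s in result["slots"]:
        if s["slotName"] == name:
            return s["value"]["value"]
    return None

print(slot_value(parsed, "what"))  # buy bread
print(slot_value(parsed, "when"))  # 2018-09-16 19:00:00
```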

Excellent. Now, there are two more things needed for the system to work. The first is to create training examples so the system can learn where these slots usually appear in the intent. The second is to create a relevant action that will trigger the expected behavior (e.g., create a reminder in your calendar).

In this post, we’ll create those training examples, build a digital assistant, and, finally, install and check it on the device. In the next posts, we’ll look at creating the actual actions.

To create training examples, scroll down to the training examples section and start adding samples:

For each example, select part of the phrase and mark it with the given slot, as shown below:

The more examples you add, the higher the quality of your intent. I’ve added 23 examples. To give you a sense of scale, industry-grade assistants use many times more examples (thousands).

Fortunately, if you want to automatically generate additional training examples, the Snips platform provides you with such a tool. Click on the “Generate” button, and the dialog will offer you an option to purchase the generation of additional examples. Otherwise, you can always establish your own process for generating examples. For instance, at Yandex we used the power of PMs and technology to generate examples for our ToDos and Reminders scenario.
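If you prefer a homegrown alternative to the paid generation, one crude but effective trick is to expand a few phrase templates over slot fillers. This is a hypothetical helper, not the Snips “Generate” tool:

```python
# Bootstrap extra training phrases by expanding templates over slot fillers.
import itertools

TEMPLATES = [
    "remind me to {what} {when}",
    "create a reminder to {what} {when}",
    "{when} remind me to {what}",
]
WHATS = ["buy bread", "call mom", "water the plants"]
WHENS = ["today at 7pm", "tomorrow morning", "on Friday"]

examples = [t.format(what=w, when=n)
            for t, w, n in itertools.product(TEMPLATES, WHATS, WHENS)]

print(len(examples))  # 3 templates x 3 whats x 3 whens = 27 phrases
print(examples[0])    # remind me to buy bread today at 7pm
```

You would still want to review the output by hand before pasting it into the console, since templated phrases are much less varied than real user speech.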

Now, the fun part.

First, click on the “Save” button to save your intent.

Second, wait for the system to train your digital assistant, and then try it out in the visual interface on the right pane:

At this point, you can see that at least one of the slots has been identified correctly.

If you have a physical setup, you can now deploy your assistant to your private digital assistant:

On your PC, open a bash console and enter this:

$ sam login  # you'll be asked for the email and password you used on the Snips web console; enter them

Then, you can install an assistant created with your account:

$ sam install assistant
Fetching assistants done
Choose the assistant you wish to install on the device (Use arrow keys)
❯ Jarvis_V0 

Sam will then deploy the assistant to your device and install the intent.

Finally, it’s time to try it out on your physical private digital assistant!

On your PC, enter this in the console:

$ sam watch  # this will enable debug output of your digital assistant

Say your wake word, followed by “remind me to fly to the moon”, and see how it actually works!

Indeed, it takes a lot more training to make voice recognition work well (and, perhaps, I should improve my spoken English), but as you can see, the system was able to:

  • classify scenario as “ToDos and Reminders”
  • identify intent as “Remind_Me”
  • fill both slots: What = “89th” and When = “2018-09-16 9:35pm”

Voila!

Author: Daniel Kornev

CPO at DeepPavlov.ai. Passionate about Conversational AI & Space Exploration. Founded Zet Universe, Inc. Previously worked at Microsoft, Yandex, Google, and Microsoft Research. This is my older blog (circa 2010), the primary one is at https://danielko.medium.com/
