What is a conversation?
Before we dive into the technical details, let's break down what it means to have a conversation - with a computer or with any human.
A conversation starts with you asking something - a "request" - and then getting a "response" back based on what you asked plus the context of what happened prior.
Now, the request can take a variety of shapes - it could be a voice phrase, selecting a number on the dial-pad, or pressing a button on a chatbot. And the response could be as simple as a line of plain text or as complex as images, videos, or even performing an action on an app.
So a conversation, regardless of whether it's with Alexa, Google Assistant, Facebook Messenger, chatbots, or IVRs, looks like a series of request, response, request, response, request, response...
If you're familiar with API calls, you can see how it's easy to adapt this model into an API interface.
Getting started on Voiceflow
Your Voiceflow project naturally follows this conversation model. If you've run your project as a prototype, you'll see that we request it to launch and get back a response, which is a series of instructions. The response continues until the next prompt or choice block, where we wait for the next user interaction. From there, you send another request, the next response follows, and so on.
It's a little easier to see with an example where the turns are labelled:
The blue is the user request and everything else is part of the response. Turn 1 has an implicit request, which is to launch the conversation.
This is a simple example, but Voiceflow is not limited to being a linear flowchart. You can create complex, open-ended conversations that allow the user to switch contexts and jump around, even with no lines between blocks! For more info, learn about the Intent Block.
💡 What's interesting is that the Voiceflow test tool calls the exact same API described here - so there's nothing stopping you from making something even better than the test tool!
Calling the API
A web API is just a link where you go to retrieve things, so if I ask a weather API what the weather will be in two days, it gives me back a report. If I ask the Voiceflow API to reply to "Tyler" after he requests a pizza, I get back a response.
It's almost as if you're speaking with the API, with each API request representing a turn in the conversation.
There are a few pieces of key information that you need to give to the API endpoint every time:
- versionID - this identifies the particular Voiceflow project you are running.
- API Key - this authenticates you, so someone can't just spam your project.
- userID - this keeps track of who is talking to the API. Where `user 2` starts their conversation could be a totally different section than `user 1`. Each user has their own context and progression within a conversation. We'll talk about this more in the stateful vs. stateless section. Make sure to URI-encode this value.
- request - the actual user action. This could be what they typed, what they said, launching the project, a button press, etc.
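To make these pieces concrete, here's a minimal sketch of assembling them into an endpoint URL and headers. The host and path shown are assumptions for illustration - check the API section of your own project for the exact URL - and the credentials are placeholders:

```python
from urllib.parse import quote

# Placeholder credentials -- substitute your own project's values.
VERSION_ID = "your-version-id"
API_KEY = "VF.your-api-key"
user_id = "user 1"  # spaces and special characters must be URI-encoded

# The endpoint path here is a sketch, not a guaranteed URL; consult
# Integration > Developer > API for the one your project should use.
url = (
    "https://general-runtime.voiceflow.com/state/user/"
    f"{quote(user_id)}/interact"
)

headers = {
    "Authorization": API_KEY,   # authenticates you
    "versionID": VERSION_ID,    # identifies the project you are running
    "Content-Type": "application/json",
}

print(url)
```

Note how `quote` turns `user 1` into `user%201` so the userID is safe to put in a URL path.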
The request is a JSON object called `request` with `type` and `payload` properties, while the response is a JSON array of traces, each also with a `type` and `payload`. (You can also make custom requests and traces on your Voiceflow project.)
Here's what it looks like:
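As a sketch, a single turn's request body and a possible trace array in the response might look like this (the message text, trace types, and button names are illustrative, not from a real project):

```python
import json

# The request: a JSON object with a type and a payload.
request_body = {
    "request": {
        "type": "text",
        "payload": "I'd like a large pepperoni pizza",
    }
}

# The response: a JSON array of traces, each with a type and a payload.
# These example traces are hypothetical output for illustration.
example_response = [
    {"type": "speak", "payload": {"message": "Sure! What size?"}},
    {
        "type": "choice",
        "payload": {"buttons": [{"name": "Large"}, {"name": "Small"}]},
    },
]

for trace in example_response:
    print(trace["type"], "->", json.dumps(trace["payload"]))
```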
Just keep following it up with additional API calls, and now you've created a conversation!
You can find resources in the left bar, under Integration > Developer > API. This is the central portal for managing your API credentials and getting tips.
If this all makes sense and you're ready to get started, check out our code examples and in-depth documentation! They describe all the specific types of requests and responses you might get.
Building out
Everything so far in this article has been pretty abstract, but we wanted to give you a small taste of what you could do with the Voiceflow API. Here's a gallery of some of the integrations built by the team:
Webchat
Facebook Messenger + Telegram
Webchat
Slack
Customization
Custom Actions
Maybe in your use case for the API, you want a response that charges the user's credit card or navigates to a different part of the website. You're not limited to what's available on Voiceflow, because with custom actions you can do something like this:
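One common pattern is to dispatch on trace types in your own code. Here's a minimal sketch: the `charge_card` trace below is a hypothetical custom action you would define in your project, not a built-in Voiceflow trace type:

```python
# Dispatch on trace types returned by the API. "charge_card" is a
# hypothetical custom action for illustration only.
def handle_trace(trace):
    kind = trace["type"]
    if kind in ("speak", "text"):
        return f"bot says: {trace['payload']['message']}"
    if kind == "charge_card":  # custom action (hypothetical)
        return f"charging card: ${trace['payload']['amount']}"
    return f"unhandled trace type: {kind}"

traces = [
    {"type": "speak", "payload": {"message": "Thanks for your order!"}},
    {"type": "charge_card", "payload": {"amount": 19.99}},
]
for line in map(handle_trace, traces):
    print(line)
```

The point is that your integration decides what a custom trace means - Voiceflow just passes it along in the response.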
Custom NLP
If you want to use your own Natural Language Processing, this can easily be done by specifying the request type as `intent` instead of `text`. This will prompt Voiceflow to skip our NLP.
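To illustrate the difference, here are the two request shapes side by side. The exact payload fields for an `intent` request (and the intent and entity names) are assumptions for the sketch - verify them against the API documentation for your project:

```python
# With Voiceflow's NLP: send raw text and let Voiceflow resolve it.
text_request = {
    "request": {"type": "text", "payload": "I want a large pizza"}
}

# With your own NLP: send the already-resolved intent so Voiceflow
# skips its NLP step. Field names below are illustrative assumptions.
intent_request = {
    "request": {
        "type": "intent",
        "payload": {
            "intent": {"name": "order_pizza"},  # hypothetical intent
            "entities": [{"name": "size", "value": "large"}],
            "query": "I want a large pizza",
        },
    }
}

print(text_request["request"]["type"], "vs", intent_request["request"]["type"])
```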
Stateful and Stateless
The Voiceflow API comes in two different flavors: stateful and stateless. The difference is all about the user's `state`. So far in this article, we've been referring to the stateful API, which is the easier one to work with. `State` refers to information about the conversation beyond the request the user just gave - like which block on which flow they are on, what their variables are, and more metadata.
You'll see that the stateful API includes a `userID` in the URL, while this is absent on the stateless one.
- With the stateful API, the `state` is saved on Voiceflow, so we'll always know what `user 1` has done so far in their conversation, and you don't have to provide it in the API call.
- The stateless API is very similar to the stateful API - with one difference:
  - Instead of passing `userID` in the path parameters, the current `state` of the user is passed in each request and a new `state` is sent back in every response. The same request will always produce the same response.
  - This API works by passing `state` back and forth, and Voiceflow will never store user session data in the process. The stateless API doesn't know who it is talking to. If you don't pass in a `state`, it will assume you are at the beginning of the flow.
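The stateless pattern can be sketched without any network calls. Here `interact` is a stand-in for a real HTTP request to the stateless endpoint, and the toy "dialog logic" just counts turns so you can see the state evolving; the real `state` object Voiceflow returns is richer than this:

```python
# Sketch of the stateless pattern: no userID, no server-side storage.
# interact() stands in for a real HTTP call to the stateless endpoint.
def interact(state, request):
    # Toy dialog logic: count turns so the state visibly evolves.
    new_state = dict(state or {"turn": 0})
    new_state["turn"] += 1
    traces = [
        {"type": "speak", "payload": {"message": f"turn {new_state['turn']}"}}
    ]
    # The caller must hold on to new_state and send it next time.
    return traces, new_state

state = None  # no state -> Voiceflow assumes the start of the flow
traces, state = interact(state, {"type": "launch"})
traces, state = interact(state, {"type": "text", "payload": "hello"})
print(state)  # the state after two turns
```

Notice that the "memory" of the conversation lives entirely on your side: drop the `state` variable and the next call starts from the beginning of the flow.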
Here's a quick analogy.
- The stateful API is like having a normal conversation. It knows it's talking to `user 1`, so when they make a request, it will give the appropriate response based on all the prior context and what it knows about `user 1`.
- The stateless API is like talking to someone with amnesia - it doesn't know or care who exactly it is talking to. Every time `user 1` says something, they also hand over a sheet of paper about themselves and all the previous context. The API listens to what `user 1` says and reads the sheet of paper, then responds and hands back an updated sheet of paper with this most recent interaction included. (Don't worry about the API's mental state - it reads and does everything instantaneously, in a fraction of a second.)
The stateful API just happens to keep this sheet of paper (the `state`) in its head the whole time, because it keeps track of who it is talking to.