Step 1: Gather utterances and transcripts
The process for gathering utterances varies slightly between voice and chat agents. When building an interactive voice response (IVR) agent, start by setting up an utterance capture on the company’s customer service line.
In practice, customers would be greeted with something like this when calling in for support:
“Hi, welcome to Voiceflow Bank. I’m learning how to understand humans. In a short sentence, please tell me why you’re calling today.”
How long you’ll want to run your utterance capture depends on call volume. A good rule of thumb is to keep it open for up to a month, or until you’ve captured at least 5,000 utterances.
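The stopping rule above can be written as a small check. This is only a sketch: the 30-day limit is an assumed reading of “a month,” and the function name is hypothetical.

```python
from datetime import date

# Assumed values based on the rule of thumb above; tune them for your call volume.
MAX_CAPTURE_DAYS = 30
TARGET_UTTERANCES = 5_000


def should_stop_capture(start: date, today: date, utterance_count: int) -> bool:
    """Return True once either limit in the rule of thumb has been reached."""
    days_open = (today - start).days
    return days_open >= MAX_CAPTURE_DAYS or utterance_count >= TARGET_UTTERANCES
```

A busy line may hit the utterance target within days, while a quiet one will simply run until the time limit.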
If you’re building a chat agent, you’ll use your existing chat interface (if you have one) to gather data on the first utterances customers send to your agents. You can also set up a capture box where the user can write why they’re reaching out. This prompt might look something like this:
“In a short sentence, please tell me why you’re getting in contact today.”
The overall goal of this step is to gather as many use cases as possible. This is the data you’ll use to build out your intent structure for the rest of the agent.
Step 2: Build the intent model
Once you’ve captured enough data, you can start creating your intent model. This involves clustering semantically similar utterances into intents. From there, you’ll refine the model further by annotating entities, creating points of disambiguation, splitting some intents, and merging others together. Ideally, by the end of the process, your model should understand 80% of all your utterances at 80% confidence.
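One way to check the 80/80 target is to measure what share of test utterances the model classifies with at least 80% confidence. The sketch below assumes that reading of the target, and assumes your NLU tool can export a top-intent confidence score per utterance. The function name and sample scores are hypothetical.

```python
def coverage_at_confidence(confidences: list[float], threshold: float = 0.8) -> float:
    """Share of utterances whose top-intent confidence meets the threshold."""
    if not confidences:
        return 0.0
    understood = sum(1 for c in confidences if c >= threshold)
    return understood / len(confidences)


# Hypothetical top-intent confidence scores, one per test utterance.
scores = [0.95, 0.82, 0.40, 0.91, 0.88]
coverage = coverage_at_confidence(scores)
print(f"{coverage:.0%} of utterances at or above 80% confidence")
```

If the result falls below 0.8, the low-scoring utterances are a natural place to look for intents that need splitting, merging, or more training data.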
Depending on size, your initial model could take anywhere from one week to a month to build. For reference, it took me two weeks to complete the initial build of my last big model, which featured 100,000 utterances and 130 intents.
Step 3: Define the scope
There’s nothing worse than a two-week project that stretches out into a month. To avoid this, you’ll need to carefully map out the project scope in advance.
This starts with agreeing on an ideal outcome for each intent. Together with stakeholders, you can work backwards from each ideal outcome to scope out the workload. For example, is the final destination for a certain query an FAQ redirect? Is it a handover to an agent, or are there multiple integrations to consider?
Once you have an idea of the scope of each intent, you can then scope out the entire build. You can assign story points to each intent, like so:
- 5 points = 2 days
- 3 points = 1 day
- 2 points = half day
- 1 point = 2 hours
Story points create a level of abstraction and a quick translation of effort for each step. This makes estimation quicker and keeps it focused on the complexity of the task rather than on the minutiae of how long something will take. As a rule of thumb, no single task should take longer than two days. If one does, break it down into sub-tasks.
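To turn the point scale above into a rough total, you can sum the hours for each scored intent. This sketch assumes an 8-hour working day (so half a day is 4 hours). The intent names and scores are made up for illustration.

```python
# Point-to-effort mapping from the scale above, in hours.
# Assumption: a working day is 8 hours, so half a day is 4 hours.
HOURS_PER_POINT = {5: 16, 3: 8, 2: 4, 1: 2}


def estimate_hours(intent_points: dict[str, int]) -> int:
    """Sum the estimated hours for a set of intents scored in story points."""
    # A score outside the scale raises KeyError, which flags a task to re-scope.
    return sum(HOURS_PER_POINT[points] for points in intent_points.values())


# Hypothetical intents and scores, for illustration only.
backlog = {"check_balance": 3, "report_lost_card": 5, "opening_hours": 1}
total = estimate_hours(backlog)
print(f"{total} hours of estimated effort")
```

Rejecting scores that are not on the scale is deliberate here: anything bigger than 5 points is a sign the task should be broken into sub-tasks.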
Step 4: Plan your happy paths
Once you’ve built out your intent model and defined the scope of each intent, it’s time to build out your happy paths. These are a visual representation of your best-case scenarios, where your customer says what they want and you deliver it to them in the most efficient way possible.
Of course, happiness is highly subjective, so you’ll want to get sign-off from your stakeholders before building out the long tail of your journeys. This is the point to gather feedback on the design, and maybe even to test your flows with customers.
Step 5: Build out long tail design
After your happy paths are paved, you’ll need to consider the not-so-happy ones. Building out the long tail design of your agent means thinking through each edge case that could arise. That is, every error, unknown, or less-than-ideal outcome that could come up during a customer’s interaction with your virtual agent.
For instance, maybe you're asking a customer for their account number. The customer might respond with, “Where do I find my account number?” rather than providing the information you need. This is an edge case you’ll want to handle so you can get the user back on track after giving them that information.
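A minimal sketch of that account number edge case might look like the following. It uses simple keyword matching as a stand-in for real intent recognition, and it assumes 8-digit account numbers. The help text and function name are hypothetical.

```python
import re
from typing import Optional

# Hypothetical wording; the real copy is a design decision.
ACCOUNT_NUMBER_HELP = "You can find your account number at the top of your statement."
ASK_ACCOUNT_NUMBER = "What is your account number?"


def handle_account_number_turn(user_text: str) -> tuple[str, Optional[str]]:
    """Return (reply, account_number); account_number is None if still missing."""
    match = re.search(r"\b\d{8}\b", user_text)  # Assumption: 8-digit account numbers.
    if match:
        return "Thanks, I've got that.", match.group()
    lowered = user_text.lower()
    if "where" in lowered or "find" in lowered:
        # Answer the side question, then repeat the original prompt.
        return f"{ACCOUNT_NUMBER_HELP} {ASK_ACCOUNT_NUMBER}", None
    return f"Sorry, I didn't catch that. {ASK_ACCOUNT_NUMBER}", None
```

The key design point is that the side question never ends the flow: the reply answers it and then re-asks for the missing information.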
Other things you’ll want to focus on, which will already have been scoped out in Step 3, include:
- How you’ll use integrations
- Which variables and entities could be used to let customers skip steps
- What thresholds to put in place before handing a customer over to an agent
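As an illustration of a handover threshold, the sketch below hands the customer to a human after a set number of failed turns in a row. The limit of two and all names are assumptions, not recommendations from a particular platform.

```python
from dataclasses import dataclass

# Assumed limit: hand over after this many failed turns in a row.
MAX_CONSECUTIVE_FAILURES = 2


@dataclass
class Conversation:
    consecutive_failures: int = 0


def next_action(convo: Conversation, understood: bool) -> str:
    """Decide whether to continue, reprompt, or hand over to a human agent."""
    if understood:
        convo.consecutive_failures = 0  # A successful turn resets the counter.
        return "continue"
    convo.consecutive_failures += 1
    if convo.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
        return "handover"
    return "reprompt"
```

Resetting the counter on every understood turn means only back-to-back failures trigger a handover, which is one of several reasonable policies to agree with stakeholders.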
Much like in the previous step, you’ll want your stakeholders to weigh in with their feedback before you seal off this step and move on.
Step 6: QA and launch
Before your agent is ready to interact with customers, you’ll want to send it over to QA. Ideally, a QA engineer will complete this step. If that isn’t possible, you’ll need to create test cases yourself that ensure the agent and its flows work as expected. If anything goes awry, fix the bugs and retest before launch.
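Test cases for an agent can be as simple as a table of utterances and the intents you expect them to trigger. In the sketch below, `classify` is a hypothetical stand-in for a call to your real agent, and the intent names are made up.

```python
# Hypothetical stand-in for the agent under test: maps an utterance to an intent.
def classify(utterance: str) -> str:
    text = utterance.lower()
    if "balance" in text:
        return "check_balance"
    if "lost" in text and "card" in text:
        return "report_lost_card"
    return "fallback"


# Each test case pairs an utterance with the intent the design expects.
TEST_CASES = [
    ("What's my balance?", "check_balance"),
    ("I lost my card yesterday", "report_lost_card"),
    ("Tell me a joke", "fallback"),
]


def run_test_cases() -> list[str]:
    """Return a description of every failing case; an empty list means all passed."""
    failures = []
    for utterance, expected in TEST_CASES:
        actual = classify(utterance)
        if actual != expected:
            failures.append(f"{utterance!r}: expected {expected}, got {actual}")
    return failures
```

Keeping the cases in a plain list makes it easy to rerun the whole set after every bug fix, and to add new cases as real-world utterances come in.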
Of course, there’s only so much you can test internally before your agent is released into the wild. Building conversational AI systems is an iterative process, so your goal before launching should be to QA each possible path you’ve planned out so far, and then build on them with real-world data.
Step 7: Business as usual (BAU) monitoring
The long tail design and QA steps will get your virtual agent to a “good enough to launch” state, but that doesn’t mean your work is done.
BAU monitoring is an ongoing requirement for virtual agents. By monitoring the way customers interact with your agent, you can report on performance and track any bugs that weren’t caught during QA.
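A basic performance report might show how often conversations end in each outcome. This sketch assumes your platform can export one outcome label per conversation. The labels and function name here are hypothetical.

```python
from collections import Counter


def summarize(outcomes: list[str]) -> dict[str, float]:
    """Return each outcome's share of all logged conversations."""
    total = len(outcomes)
    if total == 0:
        return {}
    counts = Counter(outcomes)
    return {outcome: n / total for outcome, n in counts.items()}


# Hypothetical log: one outcome label per conversation.
log = ["resolved", "resolved", "handover", "fallback", "resolved"]
report = summarize(log)
for outcome, share in report.items():
    print(f"{outcome}: {share:.0%}")
```

Tracking these shares over time shows whether changes to the agent are moving conversations from fallback and handover toward resolution.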
This stage can also involve creating new intents and flows as they emerge in real life. Each new intent can be scoped separately, then sent to stakeholders to decide what should get built next.
Building an agent IRL takes time.
Now that we’ve reviewed each step in building a virtual agent, let’s look at how this process might play out in real life. Here’s an example of an IVR or simple chatbot build, with a few complex flows and agent transfer.
This example depicts about 90 days of effort across design, development, and project management. It’s worth noting that these projects don’t always follow a typical waterfall model. There’s room to play here. For example, you could choose to build out integrations during the happy path design.
And while I’m positing that there’s a certain timeline you can assign to building out a virtual agent, we all know the work will continue for as long as the agent exists. Just like each person has their own preferred idioms, favorite responses, and new slang to add to their vocabulary over time, your agent will get better—and more distinctive—with each interaction and iteration. Just don’t let it call your customers n00bs. It’s not nice.