The 3 phases of independent voice AI Agents

There’s been a lot of buzz around independent voice assistants.

From large funding rounds accompanying their development to increased interest and press, independent assistants are on the rise. I’m a firm believer in independent voice assistants and wanted to weigh in with my opinions on where and when they make sense.

Voice assistants (VAs) represent a new platform shift in computing. VAs are the first platform built around being an omnipresent operating system capable of being contextually proactive to user needs. Further, the interface VAs employ, a voice user interface, represents the first time in computing where computers need to learn how to speak to us, versus us to them.

The leading voice assistant platforms we all know, such as Amazon Alexa and Google Assistant, are attempting to build General Voice Assistants (GVAs), which are meant to act as a proactive operating system for everyday life. As GVAs have exploded in popularity, selling hundreds of millions of units, we’ve begun to see an increase in interest, capabilities, and number of Independent Voice Assistants (IVAs) available. IVAs are built and managed by companies for specific purposes, such as in-car assistance, lab assistance or content delivery.

With every new platform shift there’s rapid proliferation of new entrants in hardware and software to capitalize on the opportunity.

Think back to the hundreds of PC and mobile phone brands that popped up during the emergence of those platforms, only to disappear years later. Generally speaking, the rise and fall of these entrants during a platform shift follows a 3-stage cycle: expansion, duality, and consolidation.

This 3-stage cycle of expansion, duality and consolidation is caused by the growing network effects generated by platform app ecosystems on the software side, and by economies of scale on the hardware side. These two forces eventually stem the creation of new entrants and lead to consolidation into just a few major players.

Stage One: Expansion

The goal of every for-profit business is to sell more things, a goal which usually requires the best customer experience. In a truly free market, what’s best for business is what’s best for customers. If consumers truly and deeply cared about privacy, Facebook would be out of business. Consumers vote with their dollars, not words.

Voice assistants are no different and the assistant with the best customer experience wins. Today, independent assistants, in most cases, offer a superior customer experience and benefit both the customers and the brands developing them. This has led to an increase in the interest, demand, and resources available for IVA development.

GVA 3rd party apps are a non-factor

One of the major differences between GVAs and IVAs is the existence of 3rd party app ecosystems similar to what’s found on mobile. While there are thousands of 3rd party apps on GVAs like Alexa and Google Assistant, as well as a handful of standout apps and developers, these do not yet materially enhance or influence the experience of the average VA customer. This is largely due to poor discoverability and average app quality, both of which are hindering voice apps from becoming a standout feature of GVAs.

Fewer restrictions mean better experiences

A main drawback for brands developing a GVA app is the set of restrictions, both technical and policy-based, that each platform imposes. Because the 3rd party app developer does not own the platform, they are at the mercy of the decisions of the platform maker. This limits the innovative ideas that many brands have for their own voice app and can stifle the quality of the customer’s experience.

For consumers, it’s less confusing to have an assistant that acts more like a purpose-built app, because it’s easier to understand and guess what functionality it can and cannot handle. For example, when driving, it’s easy to match an IVA to the experience you’d expect from a car, such as giving directions and controlling the stereo. With GVAs, it’s less clear what the assistant can do because it can do so much, and its services are spread across thousands of apps of varying quality. This leads to a more frustrating customer experience compared to a more streamlined IVA experience.

Owning the customer relationship and data

Corporate executives want to limit the amount of data they give to Amazon and Google as much as possible. Many businesses that once gleefully sold on Amazon’s marketplace have lost business to Amazon’s in-house brands, which leveraged marketplace sales data to target and clone sellers. Many media companies feel similarly towards Google, where they fear their information is being mined and served by Google increasingly without attribution. When developing for Alexa or Google Assistant, brands are once again giving control of the customer relationship to the platform makers. By developing an IVA, brands can create a custom assistant persona to fully own the customer’s relationship with their brand, as well as their data.

Nascent economies of scale

Economies of scale refer to unit costs falling as unit production increases: the manufacturing equivalent of a bulk discount. As an example, it would cost a smaller company far more per unit to produce an iPhone than it costs Apple, which has hit scale (in both expertise and volume). Economies of scale matter to platform shifts because software needs hardware as a method of distribution. You could have the best OS in the world, but if you don’t have reasonably priced hardware, it doesn’t matter.
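
To put rough numbers on the bulk-discount idea, here is a minimal sketch in Python. It assumes the simplest possible cost model (a one-off fixed cost spread across every unit, plus a flat per-unit cost), and every figure in it is hypothetical, picked only to show the shape of the curve rather than any real device’s economics.

```python
def unit_cost(fixed_cost: float, variable_cost_per_unit: float, units: int) -> float:
    """Average cost per unit: fixed costs spread over volume, plus the per-unit cost."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + variable_cost_per_unit


# Hypothetical figures (assumptions for illustration, not real data).
FIXED_COST = 10_000_000.0  # e.g. tooling, R&D and certification
VARIABLE_COST = 50.0       # e.g. parts and assembly per device

for volume in (10_000, 1_000_000, 100_000_000):
    cost = unit_cost(FIXED_COST, VARIABLE_COST, volume)
    print(f"{volume:,} units -> ${cost:,.2f} per unit")
```

Under these made-up inputs, the per-unit cost falls from $1,050 at ten thousand units to $60 at a million and about $50.10 at a hundred million. The fixed cost all but disappears at scale, which is the gap a small entrant has to price against.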

When it comes to voice assistants, the only prerequisites (and I’m grossly oversimplifying) are a microphone and a data connection. It’s how a Fire TV Stick turns my TV into an Alexa device. This low barrier to entry encourages expansion in the number of new entrants, as companies are still able to compete on price and create their own hardware as a form of assistant distribution.

Leveraging brand for a privacy-focused IVA

Privacy is one of the chief concerns of consumers when buying a GVA. This is where IVAs with stronger, privacy-focused brands can take advantage and win some market share.

I believe voice assistant privacy is largely a brand perception problem – not a technical or moral problem. People feel uncomfortable with the increasingly Orwellian depiction of the future enforced by trillion dollar “super companies”.

People aren’t scared of voice assistants; they’re scared of the companies who make them.

A chief example of this is the Facebook Portal, which was slammed by consumers over privacy issues even though the technology is largely the same as other VAs. This creates a unique opportunity for reputable brands to compete with Alexa and Google for specific use cases through an IVA on the merits of “privacy”, which is really just brand purity. It’s safe to say we’d all be more comfortable with a BBC, Spotify or Ikea VA in our living room compared to a Facebook Portal.

This is a real short-term advantage IVAs will have over GVAs and it will continue to generate interest in IVAs. However, long term I do not believe privacy alone will be a big enough factor to merit the existence of many IVAs. After all, we’ve all been walking around with the ultimate listening devices in our pockets for a decade now - at least my VA has the courtesy to light up when it’s listening.

Today’s best-in-class voice experiences will be delivered through IVAs that are highly contextual and able to perfectly match a specific use case, such as being a lab assistant or controlling a car. While GVA 3rd party apps are a non-factor, there’s little opportunity cost for brands investing in an IVA over a GVA app, given poor discoverability and platform risk. These factors, paired with nascent economies of scale on the hardware side, lead to an expansion of the market as new entrants are incentivized to launch IVAs.

Stage Two: Duality

What happens when GVAs begin to ship a small number of “killer apps” for their ecosystems, such as ordering a coffee during your morning commute through voice? This is when the duality stage kicks in — the stage where consumers will demand having both a GVA and IVA to provide the optimal experience.

Why only GVAs will have app ecosystems

Imagine you’re choosing between an iPhone and BlackBerry in 2015. What are some of the primary reasons you’d choose an iPhone? I’d bet being able to use apps like Uber or Instagram would have been high on your list. When the iPhone first came out, however, the appeal of the App Store wasn’t as strong without killer apps. Consumers were judging the two devices purely on their hardware and cost, allowing BlackBerry to remain dominant for a few years. As the App Store grew, every new app added additional value to the iPhone, and with that additional value came additional customers, which further incentivized developers to keep making more apps for the iPhone over BlackBerry. This is a classic example of a network effect, where every additional unit sold of a product increases the product value for all existing customers. Think Instagram with 10 users vs 100,000 users: one isn’t useful for you at all, and the other is tremendously useful. The only change is the size of the network, which increases the product value.
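
The 10 vs 100,000 users comparison can be made concrete with a toy model. This is only a sketch in Python under a Metcalfe-style assumption of my own choosing: each user gets a fixed amount of value from every other user on the network. Real platforms are far messier, and the function names and the `value_per_connection` parameter are hypothetical.

```python
def value_per_user(users: int, value_per_connection: float = 1.0) -> float:
    """Value a single user gets, assumed proportional to the number of other users."""
    return value_per_connection * max(users - 1, 0)


def total_network_value(users: int, value_per_connection: float = 1.0) -> float:
    """Sum of every user's value; grows roughly with the square of the user count."""
    return users * value_per_user(users, value_per_connection)


for size in (10, 100_000):
    print(f"{size:,} users: value per user = {value_per_user(size):,.0f}, "
          f"total = {total_network_value(size):,.0f}")
```

In this model nothing about the product changes between the two runs. Only the number of users does, yet the value each person gets goes from 9 to 99,999. That is the flywheel an app ecosystem sets in motion.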

Ultimately, a platform like a smartphone or voice assistant is only as good as its app ecosystem, and subsequently its network effect.

When it comes to VAs as a platform, only GVAs will be able to build strong network effects through the creation of 3rd party app ecosystems. It’s unreasonable to imagine IVAs having the resources or time to build their own 3rd party app ecosystems when existing GVAs have such dominant head starts. App developers aim to maximize the return on their time and will only develop for the platforms that yield the highest return, and those are the leading GVAs, which have the most app customers. Even giants like Microsoft couldn’t build a competitive app ecosystem for their smartphones once the flywheels of the leaders, Apple and Google, were in motion.

Having two assistants

As the GVAs begin to produce killer apps, such as voice commerce, consumers will increasingly rely on their GVAs to access these apps. IVAs will continue to play a role as tech restrictions will still be present, and the IVA will remain the best experience for its particular function. However, there will be a growing cognitive load for the customer as they have to shift back and forth between two assistants on the same hardware. I view this as akin to when the iPhone was taking off and many professionals had 2 phones: an iPhone/Android for the app ecosystem, and a BlackBerry for specific work functions like email and conference calls. Duality is the awkward stage in the middle where consumers need both to have the best customer experience.

As GVA apps progress, consumers will increasingly prefer using a central GVA that has the apps they need. IVAs will still play an important role, but without robust app ecosystems of their own, they’ll need to be paired with a GVA to provide the best customer experience.

Stage Three: Consolidation

Consolidation starts with aligning independent goals in one place. The goal of a car company is to sell more cars, a music streaming company to sell more subscriptions, and a news company to reach more people and sell more ads. None of these business outcomes requires an IVA, and as GVAs grow in reach and prominence, evidence from past platform shifts suggests the best way for brands to achieve their goals will be to exist as 3rd party apps on GVAs. What brands want is not always what consumers want, and it is consumers who ultimately make the final decision. Many big brands do not want to sell on the Amazon marketplace, but they still do because it sells more items than only having their own standalone store. Even BlackBerry’s BBM service, which was core to its former dominance, eventually became an iPhone app after succumbing to the network effect of Apple’s App Store in an effort to keep the brand alive.

IVAs will continue to thrive when handling highly complex, security-mandated, and specific use cases. Examples that come to mind are hospitals, driving (paired with a GVA), government offices, and pharmaceutical labs. As the GVA app ecosystems grow, paired with an expansion of the functionalities available to GVA app developers, many IVAs without a specific consumer-benefitting purpose will provide an inferior customer experience. This is largely due to the cognitive load of having multiple assistants for similar purposes. If I gave you the choice of 2 phones each with 1 essential app, or 1 phone with those same apps but 40% slower to use, which would you choose? I’d bet most would choose the single phone with slower apps because the aggregate customer experience is better, and will only continue to get better with time as GVA functionality grows. IVAs may be better than a single GVA 3rd party app, but they aren’t better enough to merit maintaining an entirely separate assistant, much like having a second phone. Although brands would love to have their own IVA to own the customer experience, relationship and data, the network effect of an overwhelming GVA app ecosystem will push IVAs further into niche use cases.

Economies of scale kick in

As hardware that provides the best customer experience, such as smart displays like the Google Nest Hub, gets more complex, 3rd party entrants will increasingly be unable to compete on price for a similar device. The market-leading GVAs will continue to press forward with economies of scale, leaving smaller IVAs without competitive hardware for distribution of their assistants. Economies of scale paired with the network effects from GVA app ecosystems lead to market consolidation, where IVAs that aren’t highly specific or essential will be deprecated and replaced by 3rd party apps on GVAs.

In specific use cases, IVAs will continue to thrive if they provide the best customer experience. However, for the majority of users and use cases, the easiest, fastest, and most personalized experience will come from an interconnected GVA with millions of high-quality 3rd party apps surfaced through effective discovery.

In Conclusion

Independent voice assistants create tremendous value for the brands developing them and can create a differentiator in the market whilst GVAs are still finding their footing. Customers win with better, more contextual experiences, and brands win through protecting their data and owning the customer relationship. We’ll continue to see expansion in the number of IVAs whilst GVA app ecosystems remain non-material for consumers and the hardware barriers remain low enough that economies of scale are still nascent.

With the emergence of better discovery and app experiences from the GVAs, we’ll see the superior customer experiences shift to a duality model where hardware will have both an IVA and GVA. This is the awkward stage in the middle where consumers have had two laptops, two phones, and two voice assistants.

As the GVA app ecosystems hit their stride and become an invaluable feature covering millions of use cases, IVAs serving non-complex functionality to general consumers, such as music streaming or delivering the news, will become increasingly redundant. Having multiple assistants with overlapping functionality creates cognitive load and will have consumers preferring the sole use of a GVA and its greater functionality. It’s through this final stage we’ll see the consolidation of thousands of IVAs into a few leading GVAs with millions of 3rd party apps. IVAs will continue to exist and thrive in use cases which require complex interactions, hardware integrations, or security constraints. We’ve seen similar cycles with PCs and mobile, and I’d expect we’ll see a similar trend with voice assistants.

Side note on an alternate reality for voice assistants

If we weren’t living in an age of trillion dollar tech companies and vertical integration, I’d guess voice assistants would be more akin to browsers than operating systems. In this alternate reality, voice tech would be delivered through our web browsers, and websites would have a conversational interface component structured through HTML tags. This model would allow for thousands of independent assistants, all serving unique niches without a “winner takes most” market like we have with today’s voice assistants. In this reality, perhaps Google would have focused on their assistant being the “smartest” experience whilst Alexa focused on commerce, and smaller assistants took up the mantles of privacy and better experiences for minorities. However, this is just an alternate reality for now, and is likely to remain one.

While this does sound like a utopia for many VA enthusiasts, the major downside would have been the pace of innovation. The web is open and free but this freedom requires the adoption of shared standards, which dramatically slows the pace of innovation. Because today’s assistants are wholly owned by one entity, their evolution is dramatically faster.

Have comments, questions or thoughts? Email me at: braden@voiceflow.com

