It was April of 2015 and I had just returned to a burning summer in India from my startup stint in London. I wasn’t happy with the heat, but I’ve never been good at adjusting to a change of climate. I’m spoilt by the California weather, and my move to the chilly streets of Shoreditch the year before wasn’t easy either. But weather had little to do with my decision to sweat in the humidity of Mumbai.
I wanted to see how the burst of smartphone adoption had impacted the lives of those traditionally used to a life without dependency on technology. There were quite a few trends that stuck out, but one in particular: the breakneck growth of messaging, especially WhatsApp. It was this realisation that would drive my contribution to Haptik, a personal assistance service over chat.
But you’re probably not here to read about my journey. So, let’s come to the real reason why you’re here.
Chatbots are being touted as the future of apps, but there are also good examples of how and why they might never reach that destiny.
When Sundar Pichai launched Allo as the chat-based Google Assistant at Google I/O, he said, “I believe we are at a seminal moment. We as Google have evolved significantly over the past 10 years, and we believe that we’re poised to take a big leap forward in the next 10... We truly want to take the next step and be more assistive for our users. So today, we’re announcing the Google assistant.”
But Google was already late to the party!
Satya Nadella, CEO at Microsoft, has long held the opinion that AI-assisted “agents” like Cortana are going to change how we use the web.
David Marcus, Lead of Facebook Messenger, launched Facebook M last year saying, “It’s an exciting step towards enabling people on Messenger to get things done across a variety of things, so they can get more time to focus on what’s important in their lives.”
Earlier this year, Dag Kittlaus, the man behind Siri, launched Viv, almost as a Siri 2.0 with all the capabilities of an assistant. His demo video probably surfaced on the computer screen of every chatbot enthusiast out there.
But not everyone is a believer.
There are counterarguments on the success and feasibility of chat screens and conversational commerce. Connie Chan, Partner at a16z, had a long Twitter piece about why conversational commerce is not the way forward. The concept of walled gardens showcases why chat platforms don’t really solve the problem of apps. Dan Grover articulates what many want to say with regard to bots, design and why bots may not replace apps.
Imagining the future of apps and sharing novel ideas on the subject has become a trend among the strongly opinionated (smart) minds of the human race. In these tides of strong opinions, Indian companies are not well represented. My experience at Haptik helps me contribute insights into what I believe could be the future of our interaction with mobile phones.
But why are we talking about ‘chatbots’?
To answer this question we must look at the evolution of the preferred way of “getting things done”.
People engage in 100–200 transactions every day: sending text messages, replying to emails, setting reminders, placing lunch orders, and the list goes on. We love transacting, maybe because it makes us feel productive. So when Jobs put a smartphone in our hands, that was it. It made us feel powerful, and we naturally transitioned away from our old/dumb phones.
Then came the App Store, and it fed our fascination with trying new interfaces to get things done. Before we knew it, our urge to experiment had us overloaded with apps. Today there are 4.1M apps across Google Play and the App Store combined.
But this is not just a phone (hardware) memory problem. We actually capped out on human attention span.
Human memory is weak and requires frequent activation. To avoid being lost in the crowd, the initial idea was to create “hooks” for companies.
Hooks are what people instantly associate with a company/brand when they hear the name or see the logo.
Then we started capping out on these hooks and companies started fighting for domination. Growing ambitions led to a sort of convergence and a considerable overlap in the things that one app could get done.
As an effort to stay in the lead, startups moved with the strategy of making their services accessible from within other apps. They started assigning separate teams to build and maintain APIs (and SDKs in some cases) that other apps can consume to include their experience within new interfaces.
So “App-A” could also provide the features and services of “App-B” by using their API. Interestingly, “App-B” could do the same with “App-A”. Well, you guessed it. Now App-A and App-B are practically competing with each other instead of strengthening their individual value propositions.
These ambitions have given a whole new direction to the trajectory of software technology. A trajectory towards the non-existence of independent apps.
Apps in 2016 seem less like companies and more like a digital representation of tasks/transactions that a user engages in on a daily/weekly basis. We’re moving away from hooks of a company to an abstracted mental mapping of their value proposition.
This is where ChatBots come in
When it was clear that the solution was more than just storage space, companies saw a massive opportunity. The actual interface and underlying means of completing these tasks (apps) were being questioned. WeChat started experimenting with something powerful in their home market, China. They’ve done a beautiful job of taking over this herculean market in its totality. I can safely say that WeChat is an innate part of daily life in China. Their approach is well documented in this blog post by Dan Grover.
There is a lot to learn about human expectations of technology in his post. We must unlock the human emotion of feeling relaxed and help people free up mental space and time. What that means varies from person to person, based on their environment. For instance, access to “correct” train schedules is an exceptionally important problem in a country like India, far more so than in Japan, where trains are almost always on time.
I was actually excited when they (WeChat) did their big launch in India. Sadly, it wasn’t as powerful as it is for people in China.
I tried searching for a restaurant here in Mumbai, India called ‘Delicacy of China’ on WeChat but did not receive any relevant results. It didn’t even work for ‘Techcrunch’.
The western world, along with parts of Asia and Europe, still needs a culturally relevant solution to this problem. At this intersection of looking at apps as tasks and the need for a culturally neutral interface to get things done, the discussion of conversational commerce begins. It’s important to note that the chat screen is only one of the implementations. This should be looked at as an opportunity to create a universal interface into which these tasks (apps) converge.
But why the chat screen?
For an interface to be truly revolutionary, the most critical prerequisite is adoption. People need to be comfortable and willing to use it actively. The gravity-defying growth of WeChat, WhatsApp and Facebook Messenger made the world (those who were looking) realize that chat is a natural user interface. With a message bar, profile photo and online status, the affordances of a chat screen enable an almost flat learning curve. Chat is commercially acceptable not only to the technology friendly but also to the long tail of users. Messaging works well on low internet connectivity, and that’s critical to becoming relevant for those yet to witness smartphones and the internet.
The next wave of 3B smartphone users will have a natural transition to conversational commerce because of a flat learning curve. It’s just like the SMS interface they’ve always used, but capable of much more. This makes chat an extremely lucrative interface for the future.
But in the words of David Marcus, “It’s the first day of a new era.” Remember what we thought about companies transitioning from their traditional website to a mobile-first experience?
In the startup world, the party is always where the people are. Chat became destined to be more than just speaking to your friends. It was going to become a place to provide (all types of) services. Everything from games to legal advice is now available through a chatbot. I was once building a list of all the chat/AI-powered text-based services. After 53 companies, I stopped counting. But assuming you already know all this, let’s cut to the chase. With services come experiences and transactions, both of which require standardization.
If you go 10,000 ft above, you can see all this noise split up into 3 segments:
- Platform: Companies that simply want to own the layer on top of which any company/player can build experiences.
- Player: Companies that want to create smart, predictive and inclusive experiences by taking end-end ownership of a few use cases.
- Aggregator: Companies building an interface that is an aggregator of other apps (themselves aggregators in most cases) and fulfill by redirection.
Platforms (e.g. Facebook Messenger / WeChat / Viv):
What they do:
- Build the underlying chat layer
- Provide an open API that allows companies and enthusiasts to build chatbots
- Provide the elements that can be used for building consistent experiences
Pros:
- Have an existing mass audience using the product every day
- Building a bot is fairly straightforward after reading the documentation
- Chat with your friends and companies within one interface
Cons:
- It’s a massive menu like the app store, and discovery is a problem
- Overload of messaging (both friends and companies/transactions in one view)
- Quality of experience depends on the ability of a company to build a bot
Players (e.g. Haptik / Operator / Allo / ChatBot for X):
What they do:
- Build the entire experience (chat + elements + payments)
- Create a layer of intelligence to provide an inclusive and contextual experience
- Usually select a few use cases for deeper focus
Pros:
- Optimized for the best experience, with focus on a few use cases
- Big bets on Machine Learning and NLP (Natural Language Processing) to have predictive, smart and semantic conversations
- Can learn about a user’s preferences and increase offerings over time
Cons:
- Limited use cases
- Must build user base and distribution channels from the ground up
- Dependence on partner co-operation
Aggregators (e.g. Justdial / HelpChat / SnapDeal):
What they do:
- One app for a lot of use cases
- Create a browser-like view that redirects to partner sites for completion
- If chat exists, it remains as a layer for advanced support
Pros:
- Download one app instead of many (save phone memory)
- Comparison between service providers
- A simple DIY interface where there is no need to speak to anyone
Cons:
- Redirection to partner apps/websites to complete transactions
- Lack of consistency in the experience of discovery and completion
- Requires you to get it done yourself, instead of asking someone to get it done
- Difficult to give a personalized experience
Simply put, you’re either telling someone to get things done for you, or getting them done yourself.
Haptik’s Approach and Artificial Intelligence
When things started blowing out of proportion, Haptik had to sync up on the big bets. We used our first-mover advantage to make some data-driven decisions.
The conversation that eventually led to our decision started out with Aakrit (CEO) and me having breakfast one morning. The (food) menu was a never-ending list of things and it was impossible for me to decide my order. It struck me how similar this menu was to our App Store/Play Store. The app store is a massive menu of things I can do when I’m busy or bored.
Think about when you open up your phone. You’re either busy and want to get something done (call, text, reply, post, upload, share) or you’re bored and you want to entertain/educate/enhance yourself (games, photos, discovery, reading). The apps on your phone and more importantly your notifications say so much about you. They are the choices you made when you were given the menu.
But, menus are confusing and frustrating when you’re looking for something specific. So, we came up with a simple theory:
If you control the items on a menu, you control the choices one can make. If you control the choices, you can predict decisions. Knowing how people invest their time and which choices they make when given a fixed menu defines their mental (space) investments. This is the metadata of your so-called “interests” that can be used to provide hyper-contextual experiences.
This theory led to an experiment that eventually refined the services that Haptik currently focuses on.
But why not become a “platform” that can become the future App Store?
My friend and former Haptik team member, Raveesh Bhalla has a nice Batman-Inspired quote, “You either die an app, or live long enough to become a platform.” It sounds and feels amazing to fantasize about becoming a platform on which companies from around the world build tools and bots. Sadly, the entry barriers to create a global impact are more than just an idea. Here are some prerequisites, especially for chat:
- An existing highly engaged large user base (10M+ DAU)
- An intrinsic reason/use case that makes the platform provider essential on your phone
- Must be available and relevant to a global audience
- Has solved the paradoxical catch-22 problem of how the users come before the service providers
A good example is Facebook or WeChat. They’re able to now provide more than just social networking because the foundations of peer-to-peer sharing (text/photos) make them essential on every phone. Once they become a part of everyday lives for a critical mass of users, they can unlock value by providing more than just social value.
Becoming a platform is a change of state, a transition, rather than an idea to start up with. You can strategize to increase the chances of your company becoming a platform. But it’s not a weekend project that you can launch as a startup. Haptik, among many other companies, has the ambition of one day becoming a platform. But we could not have done justice to it if we had chosen that as our direction from the start.
This decision was easy. But to exist independently and truly provide the service of a personal assistant was tricky. We had to know a lot about each individual user to provide magical experiences. We could build a simple bot that asked you a bunch of questions and tried to decipher answers, but let’s face it, chatbots suck (right now).
Here are a few of my experiences while testing out the (supposedly) AI-powered chatbots on Luka & Facebook M.
- Most of them are “not really” intelligent
- They only look cool in demos
- Difficult to trust that a bot really understands you
- Takes too much time for back and forth
- Lacks the ability to have personalized/contextual conversations
Nicely summarized in this piece by TC.
A good personal assistant should be intelligent and have contextual understanding. It must be capable of having a semantic two-way conversation; a dialogue.
Making Chatbots Smarter
Companies like Google and Facebook have the luxury of a billion users sharing photos and messages on their platform. Without access to that kind of user data, new companies must get creative and find smarter ways to learn about their users.
A lot can be learned from the number of notifications a user gets from a particular person/company. The context of those notifications (which apps they come from, what time, etc.) and how a user interacts with them tell you what matters to them.
- Which apps do users have on their phone?
- Is there a trend of launching an app with respect to time of day and location?
- Which apps do users use the most?
- How many times do they check their phone?
- Do they have a lot of meetings or spend most of their time on Instagram?
- How much data do they use on the phone?
These questions and more provide the insights that smart companies can use to create contextually relevant experiences.
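To make a couple of those questions concrete, here is a minimal sketch of how app-launch events could be rolled up into "most used app" and "time of day it is usually opened". The event format, the `bucket` boundaries and the function names are my assumptions for illustration, not how Haptik or anyone else actually does it.

```python
from collections import Counter, defaultdict


def bucket(hour):
    """Map an hour (0-23) to a coarse time-of-day bucket (assumed boundaries)."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def launch_trends(launches):
    """launches: iterable of (app_name, hour_of_day) pairs (hypothetical format)."""
    totals = Counter()                 # how often each app is opened
    per_bucket = defaultdict(Counter)  # app -> launches per time-of-day bucket
    for app, hour in launches:
        totals[app] += 1
        per_bucket[app][bucket(hour)] += 1
    most_used = totals.most_common(1)[0][0] if totals else None
    peak = {app: counts.most_common(1)[0][0] for app, counts in per_bucket.items()}
    return {"most_used": most_used, "peak_bucket": peak}
```

Feeding it something like `[("mail", 9), ("mail", 10), ("cab", 18)]` gives you a tiny profile: which app dominates, and when each one tends to be opened. Location could be added as another key in the same way.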
What works for Haptik is human-enabled AI (Artificial Intelligence). Think of it this way: when you send a message on Haptik, our bot takes the first shot at trying to decipher what you mean. If it is unable to meet 99% accuracy, the bot will break (in the background) and ask a real human to help resolve the query. The request is then matched to a relevant human (assistant), who is chosen based on age, geography and expertise. So a user asking for recommendations for good Italian food in New Delhi is more likely to be answered by the bot. But a request for pet-friendly restaurants in Mumbai that have great veggie burgers will be taken up by an assistant in Mumbai. This assistant would have been categorized as a foodie when hired, and loves speaking to users about the best places to eat.
As the assistant answers, the bot learns and saves context for future users from that demographic. It also learns the meaning of what was being asked for and relates it to one of our pre-defined task categories. It’s extremely difficult for a bot to understand and respond to users from varied cultural backgrounds. This is where actual humans help our bot learn about these interactions. While this approach takes time, and we’re in the early stages, the beauty will be in scale.
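The flow described above (bot first, human fallback, bot learns from the human) can be sketched roughly like this. It is a toy under loud assumptions: the `Bot` only recalls exact messages it has seen, `Assistant`, `pick_assistant` and `handle` are hypothetical names, and matching uses only city and expertise (age is left out). It is not Haptik’s actual implementation.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.99  # the "99%" bar mentioned above


@dataclass
class Assistant:
    name: str
    city: str
    expertise: set


@dataclass
class Bot:
    # Hypothetical memory: normalised message -> answer learned from a human.
    learned: dict = field(default_factory=dict)

    def guess(self, message):
        """Return (answer, confidence). Toy logic: exact recall only."""
        key = message.strip().lower()
        if key in self.learned:
            return self.learned[key], 1.0
        return None, 0.0

    def learn(self, message, answer):
        self.learned[message.strip().lower()] = answer


def pick_assistant(assistants, city, topic):
    """Prefer an assistant in the same city with matching expertise."""
    def score(a):
        return (a.city == city) + (topic in a.expertise)
    return max(assistants, key=score)


def handle(bot, assistants, message, city, topic, human_answer):
    """Bot answers if confident; otherwise a human answers and the bot learns."""
    answer, confidence = bot.guess(message)
    if confidence >= CONFIDENCE_THRESHOLD:
        return "bot", answer
    assistant = pick_assistant(assistants, city, topic)
    reply = human_answer(assistant, message)  # stand-in for the human typing a reply
    bot.learn(message, reply)                 # saved for future users
    return assistant.name, reply
```

The first time a question comes in, `handle` returns the assistant’s name and reply; the second time the same question arrives, it returns `"bot"` with the learned answer. A real system would generalise across phrasings instead of recalling exact strings.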
Our bot will soon be able to understand you as a person based on your previous chats and respond in the chat lingo that you’re used to. So when you say “Ur taking to long” or “You takin so long,” the bot is able to interpret your impatience and come back with a status update on your request.
You set a daily wake up reminder on Haptik for 9am. Result: We know you’re up at 9am.
When you launch the app for this reminder, you’re most likely at your home. Result: We know the area that you live in.
You ask us about best routes/traffic between 2 points. Result: We know the areas you usually commute between.
If you’ve ordered lunch to a specific address a few times, it’s most likely your workplace. Result: We know the area in which you work.
With subtle information, we’re able to do some really cool stuff. We can arrange a phone call if you don’t snooze the wake-up reminder. We can notify you 30 minutes after you wake up with the fastest route to work or the option to book a cab. We can also let you pre-book your meals while en route to work, and maybe get a coffee delivered to your table before you get there.
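The "Result:" lines above are really just small inference rules. Here is a hedged sketch of them as code; the event dictionaries, field names and the `min_orders=3` cutoff for "a few times" are all assumptions I made up for the example.

```python
from collections import Counter


def infer_profile(events, min_orders=3):
    """Derive a rough user profile from a list of event dicts (hypothetical format)."""
    profile = {}
    lunch_areas = Counter()
    for e in events:
        if e["type"] == "wake_reminder":
            profile["wake_time"] = e["time"]       # you're up at this time
        elif e["type"] == "reminder_opened":
            profile["home_area"] = e["area"]       # most likely where you live
        elif e["type"] == "route_query":
            # areas you usually commute between
            profile.setdefault("commute", set()).add((e["from"], e["to"]))
        elif e["type"] == "lunch_order":
            lunch_areas[e["area"]] += 1
    if lunch_areas:
        area, count = lunch_areas.most_common(1)[0]
        if count >= min_orders:                    # ordered there "a few times"
            profile["work_area"] = area            # most likely your workplace
    return profile
```

Each rule is weak on its own, which is the point: none of them needs the user to fill in a form, and together they are enough to time a notification or suggest a cab.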
But that’s not all.
The next step in making chatbots smarter is... well, other chatbots.
We build and deploy chatbots that have learnt how users frame sentences, and run them as tests against the main consumer-facing chatbot. This is a continuous process that runs 24x7 to teach the bot how similar users would (most likely) ask about different use cases.
This is a result of something we call “Genius Mode” which learns about users and predicts next moves.
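To give a feel for the bots-testing-bots idea, here is a deliberately tiny sketch. The "tester" rewrites a sentence the way users might type it and checks whether the main bot still lands on the same intent. Everything here (`variants`, `main_bot`, `run_tests`, the swap table, the intent name) is hypothetical and far simpler than a learned model of how users frame sentences.

```python
def variants(sentence):
    """Generate 'chat lingo' rewrites of a sentence (assumed, hand-written swaps)."""
    swaps = {"you are": "ur", "you": "u", "too": "to", "taking": "takin"}
    out = {sentence}
    for old, new in swaps.items():
        # keep every earlier form and add the form with this swap applied
        out |= {s.replace(old, new) for s in list(out)}
    return sorted(out)


def main_bot(message):
    """Toy stand-in for the consumer-facing bot: keyword-based intent detection."""
    words = message.lower().split()
    if "taking" in words and "long" in words:
        return "status_update"
    return "unknown"


def run_tests(bot, cases):
    """Return every (variant, expected_intent) pair the bot gets wrong."""
    failures = []
    for sentence, expected in cases:
        for v in variants(sentence):
            if bot(v) != expected:
                failures.append((v, expected))
    return failures


failures = run_tests(main_bot, [("you are taking too long", "status_update")])
```

In this toy, the failures are exactly the variants that say "takin", which tells you what the main bot still needs to learn. Running a loop like this continuously is one way to read the "24x7" process described above.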
Where is all this going?
Every company has a subjective opinion on what they think the future of their industry is. Everyone optimizes to find their spot in the grand scheme of things. In my subjective opinion, all our (apps) efforts will converge into a singular objective of reducing the mental space and time required to get things done. Mobile distribution is capping out, and that calls for some urgent and important paradigm shifts.
As innovators and creators, we must not only optimize for engagement on our product but also share the responsibility of reducing the mental load on the users of our products. We should help humans create, and free them from being slaves to notifications.
The good news is that natural selection is guiding us in this direction. Human constraints of memory force the consolidation of apps into the limited space available on the homepage of sorts in the brain.
So, if you’re building for mobile users, here are a few things you should prepare for over the next 5 years:
- Mobile Footprint: Getting smartphones into the hands of the next 3B.
- Flat Learning Curves: Smartphones will get the internet into the hands of people who have not learned mobile interfaces before (long tail). User experiences and interfaces will have to be re-imagined to be friendly for these new users.
- Internet Accessibility: Reliable high-speed internet available for all. Think oxygen for future generations.
- Convergence of Apps into Tasks: Standalone apps that perform a fixed function and take up memory on your phone will not exist. Things will converge into something more centralized. The interface for this is still in question; chat is only one candidate.
- Always be Personalising: Machine Learning has just about reached a point at which every mobile phone user expects a personalized experience. (E.g. timing notifications based on urgency and relevance to users.)
- SmartPhones and SmartExperiences: Understanding of preferences and contextual presentation of data will be expected from every mobile experience. (E.g. predicting upcoming tasks and contextual reminders.)
The winner must imagine and execute on the interface inside which 7B users would like to get things done. It’s surely going to be a fun experimental ride.
Thanking Alisha and Vaibhav for proof-reading, Shreya for all the creatives and Laksh for traction. This would not be possible without you.