André is one of our Project Owners and thereby the link between customers and development teams. He talks about his take on the developments around OpenAI’s GPT-3 and why the technology could help us make human-machine interaction more personal and effective.
The Underlying Challenge
Disclaimer: I am not an AI/ML engineer, rather a doe-eyed user caught in the headlights of progress.
In the search for meaningful conversation between humans and machines, we are at a crossroads. It turns out that building a context-sensitive, sentiment-aware conversational agent that can execute tasks in the real world isn’t that easy. I think that the road we should eventually take comes down to asking ourselves a simple question: do I feel lucky? Well, do you, punk? Do you feel lucky enough to let a branded company agent, trained on parts of the open internet, converse with your customers? In a lot of ways, it comes down to how detailed we want the conversation to be without overwhelming the user, while also limiting the risk of spewing bullshit when asked a simple question. Take my virtual hand and join me on this journey to look at what we can do with conversational agents today, what else is on the market, and how we can innovate with access to the right tools.
Toto, I Have the Feeling We’re Not in Kansas Anymore…
The first road takes us into chatbot country, where we travel with the safety and predictability of a scripted conversation between man and machine. Although this gives our customers the confidence to turn a branded representative loose on the world, it comes with its own set of drawbacks. We’ve certainly built our fair share of chatbots and will testify to the fact that they can be high maintenance and don’t always fit the business model perfectly. Don’t get me wrong, chatbots are great! They solve a very real problem by easing the workload that first-line support is so inundated with, which could lead to some pretty big savings in both time and money since, you know, chatbots don’t need food or sleep. Or unions.
So, What’s Behind Door Number 2?
Oh, I said we’re at a crossroads. The second road takes us through the valley of virtual assistants. In this valley, all virtual assistants are equal, but some are more equal than others. At the starting blocks, we have the usual suspects: Google Assistant, Siri, and Alexa, to name a few high rollers. One of the big differences from chatbots is the ability to actively assist users with everyday tasks like setting reminders, managing calendars, and telling pretty dope yo mama jokes. The lack of a script means that the virtual assistant often ends up just suggesting what you could do next and has no concrete answer to certain questions, which can be extremely annoying. The interaction already feels a lot more natural than with a chatbot, but we’re still a long way off when it comes to meaningful conversations. We’re close, but not quite there.
How Can We Innovate in Such a Packed Space?
The elusive road less traveled, of course. What if we could combine the control offered by chatbots with the natural interaction of virtual assistants? First of all, we would need to come up with specific use cases for this. It’s nice to experiment and innovate, but it’s also nice to go to bed with a full stomach. We need to identify use cases that are not only new but also billable eventually. Bear with me on this one. It might sound like a bit of a reach, but not everyone is as smart as you. You feel very comfortable with any new tech and don’t RTFM like ever. You probably just throw it away altogether. What if your mom (lol) wanted to use a new super complicated coffee machine to make her super complicated coffee order at home? She would fiddle around with the buttons and call you, and you would ignore the call because who even calls anymore, right? She would get really mad and just settle for a standard black coffee.
How could this be improved? She could have asked for a better child, or there could be an intelligent assistant ready to answer her every question, not about the technical workings of the machine but about how to make her favorite beverage with it, in a way she would understand.
But wait, isn’t that what Google Assistant does already? Yes, it is. That’s why we need to take this to the next level. We could start thinking about the following:
- Context sensitivity and detecting what kind of user or persona is asking the question. This will be the difference between overwhelming your saint of a mom with technical detail about coffee machines and simply giving her the information she asked for in the first place.
- Technical competence. We cannot simply search the internet on the fly for how to brew the coffee in question because we would end up with nonsense most of the time, not to mention that there is only a small chance we would find details about that specific machine.
- Constraints. We simply cannot handle the raw power of such an assistant unchecked. It will make links in its network that simply don’t exist IRL and provide insane answers to even the most basic questions. The plan would be to build a generic infrastructure around it to ensure we understand verbal inputs as they are intended and that the output is relevant.
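To make the three points above a bit more tangible, here is a minimal Python sketch of what such a constraint layer could look like. Everything in it is an assumption for illustration: the manual excerpts, the persona names, the stopword list, and the `model` argument, which is just any function that takes a prompt and returns text (a stand-in for something like GPT-3, not a real API). The word-overlap check is a deliberately crude placeholder for "is this answer relevant?".

```python
import re

# Hypothetical manual excerpts; a real system would load the actual manual.
MANUAL = [
    "To brew a cappuccino, fill the milk container and press the cappuccino button.",
    "To descale the machine, run the descaling program with descaling solution.",
    "To brew an espresso, insert a capsule and press the small cup button.",
]

# Hypothetical personas mapped to a style instruction for the model.
PERSONAS = {
    "beginner": "Answer in one or two simple steps without technical terms.",
    "expert": "Answer with full technical detail.",
}

# Small illustrative stopword list so filler words do not count as matches.
STOPWORDS = {"a", "an", "and", "the", "to", "i", "do", "how", "what", "is", "with", "my"}

FALLBACK = "Sorry, I can only help with questions about this coffee machine."


def content_words(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def relevant_passages(question, manual=MANUAL, top_k=1):
    """Technical competence: pick the manual passages that best match the question."""
    q = content_words(question)
    scored = [(len(q & content_words(p)), p) for p in manual]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(question, persona, passages):
    """Context sensitivity: adapt the instruction to the persona asking."""
    style = PERSONAS.get(persona, PERSONAS["beginner"])
    context = "\n".join(passages)
    return f"{style}\nUse only this manual excerpt:\n{context}\nQuestion: {question}\nAnswer:"


def answer(question, persona, model):
    """Constraints: refuse off-topic input and discard off-topic output."""
    passages = relevant_passages(question)
    if not passages:
        return FALLBACK  # nothing in the manual matches, so do not ask the model
    reply = model(build_prompt(question, persona, passages))
    manual_words = set().union(*(content_words(p) for p in passages))
    if not content_words(reply) & manual_words:
        return FALLBACK  # the reply shares no words with the manual excerpt
    return reply
```

Calling `answer("How do I make a cappuccino?", "beginner", model)` would pass the cappuccino passage and the beginner instruction to whatever `model` you plug in, while a question about the weather never reaches the model at all. A real relevance check would need to be a lot smarter than counting shared words, but the shape of the wrapper stays the same.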
That could lead us to use something like GPT-3: feed it the coffee manual as input and use our generic constraints to ensure the responses stay relevant and useful. But what else could be possible? Glad you asked. Here are some things we came up with that we want to try out as soon as we have access to GPT-3:
- We make a lot of offers for innovation projects. Wouldn’t it be nice to have a smart assistant to summarize complex customer requirements?
- Make development great again by using our smart assistant to comment and document our code, leaving more time for the engineer to write the actual code.
- AI-powered command line. Have you ever been frustrated nearly to death by not knowing a command and spending 5 minutes searching for it online? Why not use an AI assistant to tell you what you think you need? Supply brief keywords and it will suggest a matching command. Need to download something? Why not wget, a.k.a. wget -r --tries=10 http://fly.srk.fer.hr/ -o log?
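For the command-line idea, one plausible approach is few-shot prompting: show the model a handful of request/command pairs and let it complete the next one. The Python sketch below only builds that prompt and tidies up the completion. The example pairs, the function names, and the `model` callable are assumptions for illustration, since we don't have GPT-3 access yet and this is not its actual API.

```python
# Hypothetical few-shot examples pairing a plain-language request with a command.
EXAMPLES = [
    ("list files including hidden ones", "ls -la"),
    ("show disk usage of the current folder", "du -sh ."),
    ("download a site recursively with 10 retries and log to a file",
     "wget -r --tries=10 http://fly.srk.fer.hr/ -o log"),
]


def build_command_prompt(request, examples=EXAMPLES):
    """Build a few-shot prompt that ends where the model should fill in a command."""
    lines = ["Translate each request into a single shell command.", ""]
    for text, command in examples:
        lines.append(f"Request: {text}")
        lines.append(f"Command: {command}")
        lines.append("")
    lines.append(f"Request: {request.strip()}")
    lines.append("Command:")
    return "\n".join(lines)


def suggest_command(request, model):
    """Ask the model (any prompt -> text callable) and keep the first non-empty line."""
    completion = model(build_command_prompt(request))
    for line in completion.splitlines():
        if line.strip():
            return line.strip()
    return ""  # the model returned nothing usable
```

The suggestion is only returned as text and never executed, which is deliberate: a human should still read the command before hitting enter.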
Our innovation is the limit to what we can do, and at Motius we have limitless innovation. Now, all we need is full access to GPT-3.