On a rainy Tuesday in San Francisco, Apple executives took the stage in a crowded auditorium to unveil the fifth-generation iPhone. The phone, which looked identical to the previous version, had a new feature that the audience was soon buzzing about: Siri, a virtual assistant.
Scott Forstall, then Apple’s head of software, pushed an iPhone button to summon Siri and prodded it with questions. At his request, Siri checked the time in Paris (“8:16 p.m.,” Siri replied), defined the word “mitosis” (“Cell division in which the nucleus divides into nuclei containing the same number of chromosomes,” it said) and pulled up a list of 14 highly rated Greek restaurants, five of them in Palo Alto, Calif.
“I’ve been in the A.I. field for a long time, and this still blows me away,” Mr. Forstall said.
That was 12 years ago. Since then, people have been far from blown away by Siri and competing assistants that are powered by artificial intelligence, like Amazon’s Alexa and Google Assistant. The technology has largely remained stagnant, and the talking assistants have become the butt of jokes, including in a 2018 “Saturday Night Live” sketch featuring a smart speaker for seniors.
The tech world is now gushing over a different kind of virtual assistant: chatbots. These A.I.-powered bots, such as ChatGPT and the new ChatGPT Plus from the San Francisco company OpenAI, can improvise answers to questions typed into a chat box with alacrity. People have used ChatGPT to handle complex tasks like coding software, drafting business proposals and writing fiction.
And ChatGPT, which uses A.I. to guess what word comes next, is rapidly improving. A few months ago, it couldn’t write a proper haiku; now it can do so with gusto. On Tuesday, OpenAI unveiled its next-generation A.I. engine, GPT-4, which powers ChatGPT Plus.
The excitement around chatbots illustrates how Siri, Alexa and other voice assistants — which once elicited similar enthusiasm — have squandered their lead in the A.I. race.
Over the past decade, the products hit roadblocks. Siri ran into technological hurdles, including clunky code that took weeks to update with basic features, said John Burkey, a former Apple engineer who worked on the assistant. Amazon and Google miscalculated how the voice assistants would be used, leading them to invest in uses of the technology that rarely paid off, former employees said. When those experiments failed, enthusiasm for the technology waned at the companies, they said.
Voice assistants are “dumb as a rock,” Satya Nadella, Microsoft’s chief executive, said in an interview this month with The Financial Times, declaring that newer A.I. would lead the way. Microsoft has worked closely with OpenAI, investing $13 billion in the start-up and incorporating its technology into the Bing search engine, as well as other products.
Apple declined to comment on Siri. Google said it was committed to providing a great virtual assistant to help people on their phones and inside their homes and cars; the company is separately testing a chatbot called Bard. Amazon said that it saw a 30 percent increase in customer engagement globally with Alexa in the last year and that it was optimistic about its mission to build world-class A.I.
The assistants and the chatbots are based on different flavors of A.I. Chatbots are powered by what are known as large language models, which are systems trained to recognize and generate text based on enormous data sets scraped off the web. They can then suggest words to complete a sentence.
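To make that concrete, here is a minimal, illustrative sketch of next-word prediction, using the small, openly available GPT-2 model from the Hugging Face transformers library as a stand-in for the far larger models behind chatbots like ChatGPT. The prompt and model choice are this article’s assumptions, not a description of how any of these companies’ systems are built.

```python
# A minimal sketch of next-word prediction with a small open language model
# (GPT-2 via the Hugging Face transformers library), used here only as a
# stand-in for the much larger models that power commercial chatbots.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# The model assigns a score to every token in its vocabulary for the next
# position in the sentence; print the five most likely continuations.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, 5)
for token_id in top.indices:
    print(repr(tokenizer.decode(int(token_id))))
```

Sampling repeatedly from those scores, one word at a time, is what lets a chatbot improvise whole paragraphs rather than pick from a fixed menu of replies.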
In contrast, Siri, Alexa and Google Assistant are essentially what are known as command-and-control systems. These can understand a finite list of questions and requests like “What’s the weather in New York City?” or “Turn on the bedroom lights.” If a user asks the virtual assistant to do something that is not in its code, the bot simply says it can’t help.
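The toy script below illustrates that command-and-control pattern under the same caveat: it is a simplified sketch for this article, not how Siri, Alexa or Google Assistant are actually implemented. A fixed set of request patterns is matched against the user’s words, and anything outside the list gets a canned refusal.

```python
# A toy command-and-control assistant: it recognizes only a fixed list of
# request patterns, and anything outside that list gets a canned reply.
# (Illustrative only; not the actual design of any commercial assistant.)
import re

COMMANDS = [
    (re.compile(r"what'?s the weather in (?P<city>[^?]+)\??$", re.I),
     lambda m: f"Here is the forecast for {m.group('city').strip()}."),
    (re.compile(r"turn (?P<state>on|off) the (?P<room>\w+) lights$", re.I),
     lambda m: f"Turning {m.group('state')} the {m.group('room')} lights."),
    (re.compile(r"set a timer for (?P<minutes>\d+) minutes$", re.I),
     lambda m: f"Timer set for {m.group('minutes')} minutes."),
]

def respond(utterance: str) -> str:
    for pattern, handler in COMMANDS:
        match = pattern.match(utterance.strip())
        if match:
            return handler(match)
    return "Sorry, I can't help with that."

print(respond("What's the weather in New York City?"))
print(respond("Turn on the bedroom lights"))
print(respond("Write me a haiku about spring"))  # outside the fixed list
```

Adding a new ability to a system like this means writing a new pattern and handler by hand, which is why the assistants’ repertoires grew so slowly.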
Siri also had a cumbersome design that made it time-consuming to add new features, said Mr. Burkey, who was given the job of improving Siri in 2014. Siri’s database contains a gigantic list of words, including the names of musical artists and locations like restaurants, in nearly two dozen languages.
That made it “one big snowball,” he said. If someone wanted to add a word to Siri’s database, he added, “it goes in one big pile.”
So seemingly simple updates, like adding some new phrases to the data set, would require rebuilding the entire database, which could take up to six weeks, Mr. Burkey said. Adding more complex features like new search tools could take nearly a year. That meant there was no path for Siri to become a creative assistant like ChatGPT, he said.
Alexa and Google Assistant relied on technology similar to Siri’s, but the companies struggled to generate meaningful revenue with the assistants, former managers at Amazon and Google said. (In contrast, Apple successfully used Siri to entice buyers to its iPhones.)
After Amazon released the Echo, a smart speaker powered by Alexa, in 2014, the company hoped the product would help it increase sales in its online store by enabling consumers to talk to Alexa to place orders, said a former Amazon leader involved with Alexa. But while people had fun playing with Alexa’s ability to answer questions about the weather and set alarms, few asked Alexa to order items, he added.
Amazon may have overinvested in making new kinds of hardware, like now-discontinued alarm clocks and microwaves that worked with Alexa, which sold at or below cost, the former executive said.
The company also underinvested in creating an ecosystem for people to easily expand Alexa’s abilities, in the way that Apple had done with its App Store, which helped stoke interest in the iPhone, the person said. While Amazon offered a “skills” store that let Alexa control third-party accessories like light switches, it was difficult for people to find and set up skills for the speakers — unlike the friction-free experience of downloading mobile apps from app stores.
“We never had that App Store moment for the assistants,” said Carolina Milanesi, a consumer technology analyst for the research firm Creative Strategies who was a consultant for Amazon.
Late last year, the Amazon division working on Alexa was a major target of the company’s 18,000 layoffs, and a number of top Alexa executives have left the company.
Kinley Pearsall, an Amazon spokeswoman, said Alexa was much more than a voice assistant, and “we’re as optimistic about that mission as ever.”
Amazon’s misfires with Alexa may have led Google astray, said a former manager who worked on Google Assistant. Google engineers spent years experimenting with their assistant, trying to mimic what Alexa could do, including designing smart speakers and voice-controlled tablet screens to control home accessories like thermostats and light switches. The company later integrated ads into those home products, which did not become a major source of revenue.
Over time, Google realized that most people used the voice assistant only for a limited number of simple tasks, such as starting timers and playing music, the former manager said. In 2020, when Prabhakar Raghavan, a Google executive, took over Google Assistant, his group refocused the virtual companion as a marquee feature for Android smartphones.
In January, when Google’s parent company laid off 12,000 employees, the team working on operating systems for home devices lost 16 percent of its engineers.
Many of the big tech companies are now racing to come up with responses to ChatGPT. At Apple’s headquarters last month, the company held its annual A.I. summit, an internal event for employees to learn about its large language model and other A.I. tools, two people who were briefed on the program said. Many engineers, including members of the Siri team, have been testing language-generating concepts every week, the people said.
On Tuesday, Google also said it would soon release generative A.I. tools to help businesses, governments and software developers build applications with embedded chatbots, and incorporate the underlying technology into their systems.
In the future, the technologies of chatbots and voice assistants will converge, A.I. experts said. That means people will be able to control chatbots with speech, and those who use Apple, Amazon and Google products will be able to ask the virtual assistants to help them with their jobs, not just tasks like checking the weather.
“These products never worked in the past because we never had human-level dialogue capabilities,” said Aravind Srinivas, a founder of Perplexity, an A.I. start-up that offers a chatbot-powered search engine. “Now we do.”
Cade Metz contributed reporting.