“What’s surprising about these large language models is how much they know about how the world works simply from reading all the stuff that they can find,” says Chris Manning, a professor at Stanford who specializes in AI and language.
But GPT and its ilk are essentially very talented statistical parrots. They learn how to re-create the patterns of words and grammar that are found in language. That means they can blurt out nonsense, wildly inaccurate facts, and hateful language scraped from the darker corners of the web.
Amnon Shashua, a professor of computer science at the Hebrew University of Jerusalem, is the cofounder of another startup building a bilingual AI model based on this approach. He knows a thing or two about commercializing AI, having sold his last company, Mobileye, which pioneered using AI to help cars spot things on the road, to Intel in 2017 for $15.3 billion.
Shashua’s new company, AI21, which came out of stealth last week, has developed an AI algorithm, called Jurassic-1, that demonstrates striking language skills in both English and Hebrew.
In demos, Jurassic-1 can generate paragraphs of text on a given subject, dream up catchy headlines for blog posts, write simple bits of computer code, and more. Shashua says the model is more sophisticated than GPT-3, and he believes that future versions of Jurassic may be able to build a kind of common-sense understanding of the world from the information it gathers.
Other efforts to re-create GPT-3 reflect the world’s—and the internet’s—diversity of languages. In April, researchers at Huawei, the Chinese tech giant, published details of a GPT-like Chinese language model called PanGu-alpha (written as PanGu-α). In May, Naver, a South Korean search giant, said it had developed its own language model, called HyperCLOVA, that “speaks” Korean.
Jie Tang, a professor at Tsinghua University, leads a team at the Beijing Academy of Artificial Intelligence that developed another Chinese language model called Wudao (meaning “enlightenment”) with help from government and industry.
The Wudao model is considerably larger than any other, meaning that its simulated neural network is spread across more cloud computers. Increasing the size of the neural network was key to making GPT-2 and -3 more capable. Wudao can also work with both images and text, and Tang has founded a company to commercialize it. “We believe that this can be a cornerstone of all AI,” Tang says.
Such enthusiasm seems warranted by the capabilities of these new AI programs, but the race to commercialize such language models may also move more quickly than efforts to add guardrails or limit misuses.
Perhaps the most pressing worry about AI language models is how they might be misused. Because the models can churn out convincing text on a subject, some people worry that they could easily be used to generate bogus reviews, spam, or fake news.
“I would be surprised if disinformation operators don’t at least invest serious energy experimenting with these models,” says Micah Musser, a research analyst at Georgetown University who has studied the potential for language models to spread misinformation.
Musser says research suggests that it won’t be possible to use AI to catch disinformation generated by AI. There’s unlikely to be enough information in a tweet for a machine to judge whether it was written by a machine.
More problematic kinds of bias may be lurking inside these gigantic language models, too. Research has shown that language models trained on Chinese internet content will reflect the censorship that shaped that content. The programs also inevitably capture and reproduce subtle and overt biases around race, gender, and age in the language they consume, including hateful statements and ideas.
Similarly, these big language models may fail in surprising or unexpected ways, adds Percy Liang, another computer science professor at Stanford and the lead researcher at a new center dedicated to studying the potential of powerful, general-purpose AI models like GPT-3.