OpenAI's GPT-3: a language model that doesn't need finetuning

DT #42 — June 21, 2020

OpenAI announced GPT-3, the next generation of its language model. As we’re used to by now, it’s another order of magnitude bigger than previous models, at 175 billion parameters—compared to 1.5 billion for GPT-2 and 17 billion for Microsoft’s Turing NLG (DT #33). It’s not the model’s size that’s interesting, though, but what this enables. From the abstract of the 74-page paper by Brown et al. (2020) detailing GPT-3:

Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. … For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model.

This is super cool! Where GPT-2 could only complete a passage from a given input in a natural-sounding way, GPT-3 can now do several tasks just from being shown examples. Instead of fine-tuning the model for specific tasks like translation, question-answering, or generating podcast episode titles that do not exist (👀), the model can do everything out of the box. For example, if you feed it several questions and answers prefixed with “Q:” and “A:” respectively, followed by a new question and “A:”, it’ll continue the passage by answering the question—without ever having to update its weights! Other example include parsing unstructured text data into tables, improving English-language text, and even turning natural language into Bash terminal commands (but can it do git?).

OpenAI rolled out its previous model in stages, starting with a 117-million parameter version (“117M”) in February 2019 (DT #8), followed by 345M in May of that year (DT #13), 774M in September with a six-month follow up blog post (DT #22), and finally the full 1.5-billion parameter version in November (DT #27). The lab is doing the same for GPT-3, which is also the first model that it’s making commercially available in the form of an API. Just a few vetted organizations have had access to the API so far. Ashlee Vance for Bloomberg:

To date, Casetext has been using the technology to improve its legal research search service, MessageBird has tapped it for customer service, and education software maker Quizlet has used it to make study materials.

Janelle Shane als has access to GPT-3, and she has used the API to make some “spookily good Twitter bots” on her AI Weirdness blog.

I’m glad OpenAI staging the release of their API this way again, since valid criticism has already started popping up: Anima Anandkumar pointed out on Twitter that the GPT-2 has “produced shockingly racist and sexist paragraphs without any cherry picking.” (Also see this follow-up discussion with OpenAI policy director Jack Clark.) These type of bias problems have to be worked out before the model can responsibly be released beyond a few trusted partners, which OpenAI CEO Sam Altman also acknowledged this in Vance’s piece:

As time goes on, more organizations will gain access, and then the API will be public. “I don’t know exactly how long that will take,” Altman said. “We would rather be on the too-slow than the too-fast side. We will mistakes here, and we will learn.”

As the OpenAI API gets released more broadly and integrated into more products, I’ll keep following its progress.

Dynamically Typed

OpenAI's GPT-3: a language model that doesn't need finetuning

ML Research

ML Research