REALM

DT #47 — August 30, 2020

Although increasingly enormous do-it-all language models like T5 and GPT-3 (DT #42, #44) have been getting a lot of attention (haha) lately, smaller and more parameter-efficient models are still improving a lot as well. An interesting recent one is REALM by Guu et al. (2020) at Google AI, which, unlike these larger models, separates the encoding of language from the encoding of knowledge. Instead of implicitly storing information about the world in the language model’s weights, it introduces a neural retriever that learns to find relevant snippets of text from Wikipedia, which are then fed into the language model as context alongside the original query. As a result, it achieves a score of 40.4 on Natural Questions with just 300 million parameters, compared to T5’s score of 36.6 with 11 billion parameters: about 10% better results with roughly 35x fewer parameters.
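
To make the retrieve-then-read idea a bit more concrete, here is a minimal toy sketch in Python. It is not REALM's actual implementation. It assumes that queries and documents have already been embedded as small vectors, scores each document by inner product with the query, turns the scores into a retrieval distribution with a softmax, and hands the top-scoring snippets to a reader alongside the query. All function names, the `[SEP]` joining convention, and the per-document answer probabilities are hypothetical and only there for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieval_distribution(query_vec, doc_vecs):
    """Softmax over inner-product relevance scores, i.e. a toy p(z | x)."""
    scores = [dot(query_vec, d) for d in doc_vecs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Indices of the k most probable documents, best first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def build_reader_inputs(query, docs, indices):
    """Pair the query with each retrieved snippet (hypothetical format)."""
    return [f"{query} [SEP] {docs[i]}" for i in indices]


def marginal_answer_prob(retrieval_probs, answer_probs_given_doc, indices):
    """Truncated sum over retrieved docs of p(y | z, x) * p(z | x)."""
    return sum(retrieval_probs[i] * answer_probs_given_doc[i] for i in indices)


if __name__ == "__main__":
    # Hand-made toy embeddings and snippets; a real system would learn these.
    docs = ["snippet A", "snippet B", "snippet C"]
    doc_vecs = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
    query, query_vec = "example question", [1.0, 0.0]

    probs = retrieval_distribution(query_vec, doc_vecs)
    best = top_k(probs, k=2)
    print(build_reader_inputs(query, docs, best))

    # Made-up reader outputs: chance that each snippet yields the right answer.
    answer_probs = [0.8, 0.1, 0.4]
    print(marginal_answer_prob(probs, answer_probs, best))
```

The design point this tries to capture is that the knowledge lives in the document collection rather than in the reader's weights, so the reader itself can stay comparatively small.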

ML Research

This section of Dynamically Typed covers recent models, datasets, and tools for machine learning research.
