Rich Sutton wrote The Bitter Lesson, an essay arguing that building human knowledge into AI systems is ultimately less effective than general methods that leverage computation. Sutton is a research scientist at DeepMind who, among other things, co-wrote the
reinforcement learning textbook that I used in a course this semester. He’s a big name in the field.
In his March 13th essay, he asserts:
The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation [like search and learning] are ultimately the most effective, and by a large margin. … We have to learn the bitter lesson that building in how we think we think does not work in the long run.
He cites examples ranging from the systems that beat humans at chess and Go to current neural approaches to computer vision and natural language processing, showing that expertly crafted systems were all eventually beaten by brute-force search and learning algorithms once computers became fast enough. Sutton thinks we should therefore focus on general methods, and that explicitly modelling systems after how we think our minds work is not a good long-term strategy.
Sutton’s essay sparked a lot of debate in the community; I’ve highlighted a few of the most interesting responses to the essay below.
In a Twitter thread, Oxford’s Shimon Whiteson points out that state-of-the-art deep learning-based AI approaches all still incorporate human knowledge in how they’re structured: CNNs use convolutions and LSTMs use recurrence, both big “human knowledge” innovations. He continues:
So the history of AI is not the story of the failure to incorporate human knowledge. On the contrary, it is the story of the success of doing so, achieved through an entirely conventional research strategy: try many things and discard the 99% that fail.
The 1% that remain are as crucial to the success of modern AI as the massive computational resources on which it also relies.
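To make Whiteson’s point a bit more concrete: the convolution in a CNN is exactly this kind of baked-in human knowledge. Here’s a quick sketch in PyTorch (the 32×32 RGB input and 16 output channels are arbitrary numbers I picked for illustration, not anything from the posts) comparing a convolutional layer to a fully connected layer over the same input:

```python
import torch.nn as nn

# A 3x3 convolution: its weights are shared across every spatial position,
# encoding the human insight that local patterns (edges, textures) matter
# regardless of where in the image they appear.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

# A fully connected layer mapping the same 32x32 RGB input to an output of
# the same size, with no such assumption built in.
dense = nn.Linear(3 * 32 * 32, 16 * 32 * 32)

def count(m):
    return sum(p.numel() for p in m.parameters())

print(f"conv parameters:  {count(conv):,}")   # 448
print(f"dense parameters: {count(dense):,}")  # about 50 million
```

That five-orders-of-magnitude gap in parameters is the convolutional prior at work, and the prior is a human design decision about images, not something a learning algorithm discovered on its own.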
Other responses include A Better Lesson, where Rodney Brooks also asserts that the way humans wrangle increasing compute power is responsible for AI’s success, and A Meta Lesson, where Andy Kitchen argues that success lies in the combination of the two. Finally, in The Wrong Classroom, Katherine Bailey says that using all these well-defined problems as metrics for “AI” will naturally favor Sutton’s search- and learning-based systems, which may accomplish the given tasks, but do so not by being intelligent but “in virtue of having behaved intelligently”:
With computer chess, was the goal just to win at chess, or was chess chosen as a metric because it was believed that in order to master it you had to [think] the way humans did it? … Metrics for AI systems have to be well-defined, and my suspicion is that this makes them almost by definition solvable without something we humans would ever track as “intelligence.” But what does this matter? Sometimes the metric and the end goal are aligned, such as in the case of computer vision and speech recognition… But when they’re not, such as when the true end goal is something vague like “solving intelligence,” there may be many lessons learned but at least some AI researchers will simply be in the wrong classroom.
It’s been interesting seeing this debate unfold over the past few weeks, and I think I mostly agree with Brooks and Bailey. I’d love to know what you all think, so here are all the posts I mentioned above one more time if you’d like to read them: