#50: Microsoft exclusively licenses OpenAI's GPT-3, Amsterdam's AI registry, and cloud nowcasting progress
Hey everyone, welcome to the 50th (🎉) edition of Dynamically Typed!
One of the things I’m most proud of with this newsletter is that it’s allowed me to build a big corpus of my own previous writing to refer back to. A while ago, I wrote a Python script to download DT issues into a repository of markdown files. Now that this repository has grown into 50 issues going back almost two years, I find myself searching through it several times a week and rediscovering tools, articles, or papers. It’s also super useful for finding connections like the one between GPT-3 and DeepSpeed from the first story below. (And I’m using it for another new project coming soon — stay tuned!)
Anyway, enough of that: let’s get into the last two weeks of productized AI, ML research, climate AI, and cool stuff.
In today’s issue, I wrote a Productized AI story on Microsoft’s exclusive license of OpenAI’s GPT-3 model and what this may tell us about their future relationship; I also have a link to Amsterdam’s and Helsinki’s new AI algorithm registries. For ML research, I’ve got quick links to the new Papers with Code feature on arXiv, Black in AI’s new academic program, a tool called TensorSensor, and a new Transformers paper that’s making the rounds on Twitter. I found some recent cloud nowcasting work for climate AI. And, finally, for cool stuff I have links to a short story about AI scooters, and NVIDIA’s new Imaginaire library.
Productized Artificial Intelligence 🔌
OpenAI is exclusively licensing GPT-3 to Microsoft. What does this mean for their future relationship?
GPT-3 is OpenAI’s latest gargantuan language model (see DT #42) that’s uniquely capable of performing many different “text-in, text-out” tasks — demos range from imitating famous writers to generating code (#44) — without needing to be fine-tuned: its crazy scale makes it a few-shot learner.
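To make "few-shot" a bit more concrete: instead of fine-tuning, the task is demonstrated inside the input text itself. Here is a minimal sketch of how such a prompt might be assembled. The helper name, the "Input:/Output:" layout, and the translation examples are illustrative assumptions, not OpenAI's API or a format it prescribes, and no model is actually called.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: an instruction, worked examples, then the new input.

    Hypothetical helper for illustration; the "Input:/Output:" layout is an
    assumption, not a format prescribed by OpenAI.
    """
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The prompt ends with an open "Output:" for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(prompt)
```

Swapping the instruction and examples changes the task without touching any model weights, which is the point of the "text-in, text-out" framing.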
In July 2019, OpenAI announced it got a $1 billion investment from Microsoft. Back then, this raised some eyebrows in the (academic) machine learning community, which can sometimes be a bit allergic to the commercialization of AI (#19). The exact terms of the investment were never disclosed, but some key elements of the deal were. Tom Simonite for WIRED:
Most interesting bit of the OpenAI announcement: “we intend to license some of our pre-AGI technologies, with Microsoft becoming our preferred partner.”
Now, a year and a bit later, that’s exactly what happened. From the OpenAI blog:
In addition to offering GPT-3 and future models via the OpenAI API, and as part of a multiyear partnership announced last year, OpenAI has agreed to license GPT-3 to Microsoft for their own products and services.
What does that mean? Nick Statt for The Verge:
A Microsoft spokesperson tells The Verge that its exclusive license gives it unique access to the underlying code of GPT-3, which contains technical advancements it hopes to integrate into its products and services.
In their blog post, Microsoft pitches this as a way to “expand [their] Azure-powered AI platform in a way that democratizes AI technology,” to which the community again reacted negatively: if you want to democratize AI, why not just open-source GPT-3’s code and training data?* I agree that “democratizing” is a bit of a stretch, but I think there’s a much more interesting discussion to be had here than the one on a self-congratulatory word choice in a corporate press release. Perhaps ironically, that discussion also starts from overanalyzing another few words in that very same press release.
According to Microsoft’s blog post about the licensing deal, GPT-3 “is trained on Azure’s AI supercomputer.” I wonder if that means OpenAI is now using Microsoft’s open-source DeepSpeed library (#34) to train its GPT models. DeepSpeed is a library for distributed training of enormous ML models that has specific features to support training large Transformers; Microsoft Research claimed in May that it’s capable of training models with up to 170 billion parameters (#40). GPT-3 is a 175-billion-parameter Transformer that was released in June, just one month later. That seems unlikely to be a coincidence, and Microsoft’s latest DeepSpeed update (#49) even includes some experimental work using the GPT-3 architecture.
So this suggests that the partnership goes beyond just the exchange of Microsoft’s money and compute for OpenAI’s trained models and ML brand strength (an exchange of cloud for clout, if you will) that we previously expected. Are the companies actually also deeply collaborating on ML and systems engineering research? I’d love to find out.
If so, this could be an early indication that Microsoft — who I’m sure is at least a little bit envious of Google’s ownership of DeepMind — will eventually want to acquire OpenAI. And it could be a great fit. Looking at Microsoft’s recent acquisition history, it has so far let GitHub (which it acquired two years ago) continue to operate largely autonomously. This makes it an attractive potential parent company for OpenAI: the lab probably wouldn’t have to give up too much of its independence under Microsoft’s stewardship. So unless OpenAI actually invents and monetizes some form of artificial general intelligence (AGI) in the next five to ten years — which I don’t think they will — I wouldn’t be surprised if they end up becoming Microsoft’s DeepMind.
* One big reason for not open-sourcing GPT-3’s code and data is security; see my coverage of OpenAI’s staged release strategy for GPT-2 (#8, #13, #22, #27).
Quick productized AI links 🔌
- 🗂 Amsterdam (where I live!) and Helsinki (where I don’t live) have launched their “AI algorithm registries.” These are actually a pretty cool idea: whenever a municipality “utilizes algorithmic systems as part of [their] city services,” these systems must be cataloged in the city’s algorithm registry. Amsterdam’s registry currently has three entries: (1) license plate-recognizing automated parking control cars, (2) a pilot for algorithm-assisted fraud surveillance for holiday home rentals, and (3) a natural language processing system for categorizing reports of trash in public space. These registries may become a good source of productized AI links for me, but more importantly, this is a great step for building transparency, trust and accountability into these systems.
Machine Learning Research 🎛
- 🖼 An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale is a paper under review for ICLR 2021 that’s been making the rounds on Twitter. I found Yannic Kilcher’s explainer video (which starts with a lovely rant about “double-blind” peer review) a good introduction to the model, which could be the start of Transformers overtaking convolutional models at the very largest scales of computer vision.
- 👩🏾‍💻 Building on their previous three years of graduate school application mentoring programs, Black in AI has launched an Academic Positions program to support Black junior researchers getting started in “careers in academia, industry, and policy.” The launch blog post includes details about the program, tips on how academics and organizations can support it, and lots of additional resources. This is a great link to amplify within your ML network!
- ⚡️ TensorSensor is a Python package that “clarifies” (visualizes) the dimensions of tensors in NumPy, TensorFlow or PyTorch. I recently had to reproduce a paper that wrote down its math in a simplified form that ignored the out-channel dimension of convolutional filters, and spent a lot of time trying to get all my matrices to line up correctly with that extra dimension. This tool would’ve made that a lot easier! Also check out Terence Parr’s introduction to TensorSensor.
- 🔗 Cool new arXiv.org feature: Papers with Code-discovered implementations are now linked right on a paper’s abstract page. I’ve always found it quite easy to find any available implementations with a few quick Google and GitHub searches, but integrations like this are great for discoverability.
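Back on the TensorSensor item: the out-channel dimension of convolutional filters is easy to lose track of when a paper's math leaves it out. Below is a minimal pure-Python sketch of the shape bookkeeping involved. It assumes PyTorch-style layouts, (N, C_in, H, W) for inputs and (C_out, C_in, kH, kW) for filters, with stride 1 and no padding. `conv2d_output_shape` is a hypothetical helper written for this example and is not part of TensorSensor.

```python
def conv2d_output_shape(input_shape, filter_shape):
    """Return the output shape of a stride-1, unpadded 2D convolution.

    Assumes PyTorch-style layouts: input (N, C_in, H, W) and
    filters (C_out, C_in, kH, kW). Hypothetical helper for illustration.
    """
    n, c_in, h, w = input_shape
    c_out, filter_c_in, k_h, k_w = filter_shape
    if c_in != filter_c_in:
        raise ValueError(
            f"input has {c_in} channels but filters expect {filter_c_in}"
        )
    if k_h > h or k_w > w:
        raise ValueError("filter is larger than the input")
    # Each of the C_out filters produces one output channel.
    return (n, c_out, h - k_h + 1, w - k_w + 1)


print(conv2d_output_shape((8, 3, 32, 32), (16, 3, 5, 5)))  # (8, 16, 28, 28)
```

The filter bank carries two channel dimensions: C_in has to match the input, while C_out sets the channel count of the output. That second one is the dimension the simplified math dropped.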
Artificial Intelligence for the Climate Crisis 🌍
- ⛅️ Some recent progress on nowcasting (predicting over the next few hours) the locations of clouds: Berthomier et al. (2020) compared the effectiveness of several deep learning models for the task, and Jack Kelly of Open Climate Fix open-sourced a Python notebook that approaches the same problem using optical flow. This work is a step towards improving predictions of solar panels’ power output, an important task for operators as an increasing fraction of the electricity supply on their grids transitions to solar.
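For a rough feel of flow-based nowcasting: estimate how the cloud field moved between two frames, then assume that motion continues. The sketch below swaps dense optical flow for a much simpler stand-in, a single whole-image shift found by brute-force search, so it is not the method from either of the works linked above. The toy 5x5 grid, the wrap-around edges, and all function names are made up for illustration.

```python
def shift_frame(frame, dy, dx):
    """Shift a 2D grid by (dy, dx), wrapping at the edges for simplicity."""
    rows, cols = len(frame), len(frame[0])
    return [
        [frame[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
        for r in range(rows)
    ]


def estimate_shift(previous, current, max_shift=2):
    """Find the single (dy, dx) that best maps `previous` onto `current`.

    Brute-force search minimising the sum of squared differences.
    """
    best_shift, best_error = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            candidate = shift_frame(previous, dy, dx)
            error = sum(
                (a - b) ** 2
                for row_a, row_b in zip(candidate, current)
                for a, b in zip(row_a, row_b)
            )
            if error < best_error:
                best_shift, best_error = (dy, dx), error
    return best_shift


def nowcast(previous, current, max_shift=2):
    """Predict the next frame by repeating the most recent estimated motion."""
    dy, dx = estimate_shift(previous, current, max_shift)
    return shift_frame(current, dy, dx)


# A toy 5x5 "cloud mask": one cloudy cell moving one column to the right per frame.
frame_0 = [[0] * 5 for _ in range(5)]
frame_0[2][1] = 1
frame_1 = shift_frame(frame_0, 0, 1)
prediction = nowcast(frame_0, frame_1)
print(estimate_shift(frame_0, frame_1))  # (0, 1)
print(prediction[2])                     # [0, 0, 0, 1, 0]
```

Real cloud fields deform, appear, and dissipate rather than sliding rigidly, which is where per-pixel optical flow and the deep learning models compared in the paper come in.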
Cool Things ✨
- 🛴 I came across this short story by Janelle Shane when it premiered as a New York Times “Op-Ed From the Future” last year, but forgot to share it at the time. I rediscovered and reread it this week, and I still think it’s delightful: We Shouldn’t Bother the Feral Scooters of Central Park.
- 🎨 Imaginaire is NVIDIA’s universal library for image and video synthesis, including algorithms such as SPADE (GauGAN), pix2pixHD, MUNIT, FUNIT, COCO-FUNIT, vid2vid, few-shot vid2vid. Check out this demo video to see what it’s capable of, from summer-to-winter transformations to automatically animating motion into pictures.
Thanks for reading! As usual, you can let me know what you thought of today’s issue using the buttons below or by replying to this email. If you’re new here, check out the Dynamically Typed archives or subscribe below to get new issues in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It’s by far the best thing you can do to help me grow this newsletter. 🧀