#50: Microsoft exclusively licenses OpenAI's GPT-3, Amsterdam's AI registry, and cloud nowcasting progress
Hey everyone, welcome to the 50th edition of Dynamically Typed!
One of the things I'm most proud of with this newsletter is that it's allowed me to build a big corpus of my own previous writing to refer back to. A while ago, I wrote a Python script to download DT issues into a repository of markdown files. Now that this repository has grown to 50 issues going back almost two years, I find myself searching through it several times a week and rediscovering tools, articles, or papers. It's also super useful for finding connections like the one between GPT-3 and DeepSpeed in the first story below. (And I'm using it for another new project coming soon; stay tuned!)
Anyway, enough of that: let's get into the last two weeks of productized AI, ML research, climate AI, and cool stuff.
In today's issue, I wrote a Productized AI story on Microsoft's exclusive license of OpenAI's GPT-3 model and what this may tell us about the two companies' future relationship; I also have a link to Amsterdam's and Helsinki's new AI algorithm registries. For ML research, I've got quick links to the new Papers with Code feature on arXiv, Black in AI's new academic program, a tool called TensorSensor, and a new Transformers paper that's making the rounds on Twitter. I found some recent cloud nowcasting work for climate AI. And finally, for cool stuff, I have links to a short story about AI scooters and NVIDIA's new Imaginaire library.
Productized Artificial Intelligence
OpenAI is exclusively licensing GPT-3 to Microsoft. What does this mean for their future relationship?
GPT-3 is OpenAI's latest gargantuan language model (see DT #42) that's uniquely capable of performing many different "text-in, text-out" tasks, with demos ranging from imitating famous writers to generating code (#44), without needing to be fine-tuned: its crazy scale makes it a few-shot learner.
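To make "few-shot" concrete: instead of fine-tuning the model on a task, you show it a couple of examples inside the prompt and let it complete the pattern. Here's a minimal sketch of what that looked like with the 2020-era OpenAI API Python client; the translation task, prompt, and parameter values are my own illustration, not from OpenAI:

```python
# Few-shot prompting: a couple of English->French examples in the prompt,
# and GPT-3 completes the pattern for the final line. Requires an API key.
import openai

openai.api_key = "sk-..."  # your OpenAI API key

prompt = """English: Where is the train station?
French: Où est la gare ?

English: I would like two coffees, please.
French: Je voudrais deux cafés, s'il vous plaît.

English: How much does this cost?
French:"""

response = openai.Completion.create(
    engine="davinci",   # the original GPT-3 base model
    prompt=prompt,
    max_tokens=32,
    temperature=0.3,
    stop=["\n"],        # stop at the end of the translated line
)
print(response.choices[0].text.strip())
```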
In July 2019, OpenAI announced that it had received a $1 billion investment from Microsoft. Back then, this raised some eyebrows in the (academic) machine learning community, which can sometimes be a bit allergic to the commercialization of AI (#19). The exact terms of the investment were never disclosed, but some key elements of the deal were. Tom Simonite for WIRED:
Most interesting bit of the OpenAI announcement: "we intend to license some of our pre-AGI technologies, with Microsoft becoming our preferred partner."
Now, a year and a bit later, that's exactly what happened. From the OpenAI blog:
In addition to offering GPT-3 and future models via the OpenAI API, and as part of a multiyear partnership announced last year, OpenAI has agreed to license GPT-3 to Microsoft for their own products and services.
What does that mean? Nick Statt for The Verge:
A Microsoft spokesperson tells The Verge that its exclusive license gives it unique access to the underlying code of GPT-3, which contains technical advancements it hopes to integrate into its products and services.
In their blog post, Microsoft pitches this as a way to "expand [their] Azure-powered AI platform in a way that democratizes AI technology," to which the community again reacted negatively: if you want to democratize AI, why not just open-source GPT-3's code and training data?* I agree that "democratizing" is a bit of a stretch, but I think there's a much more interesting discussion to be had here than the one about a self-congratulatory word choice in a corporate press release. Perhaps ironically, that discussion also starts from overanalyzing another few words in that very same press release.
According to Microsoft's blog post about the licensing deal, GPT-3 "is trained on Azure's AI supercomputer." I wonder if that means OpenAI is now using Microsoft's open-source DeepSpeed library (#34) to train its GPT models. DeepSpeed is a library for distributed training of enormous ML models that has specific features to support training large Transformers; Microsoft Research claimed in May that it's capable of training models with up to 170 billion parameters (#40). GPT-3 is a 175-billion-parameter Transformer that was released in June, just one month later. That seems unlikely to be a coincidence, and Microsoft's latest DeepSpeed update (#49) even includes some experimental work using the GPT-3 architecture.
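To give a sense of what training with DeepSpeed looks like, here's a minimal sketch of wrapping a PyTorch model in DeepSpeed's training engine. This is my own illustration of the library's basic API, not OpenAI's actual setup: the model, config values, and loss are stand-in toys.

```python
# Minimal DeepSpeed sketch: the engine returned by deepspeed.initialize()
# handles distributed data parallelism, mixed precision, and ZeRO memory
# optimizations, configured through a plain config dict.
# Launch with the deepspeed CLI, e.g.: deepspeed train.py
import torch
import deepspeed

model = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=512, nhead=8),
    num_layers=6,
)

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},          # mixed-precision training
    "zero_optimization": {"stage": 1},  # partition optimizer state
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# The engine replaces the usual zero_grad()/backward()/step() dance:
src = torch.randn(10, 32, 512).to(model_engine.device).half()  # toy batch
loss = model_engine(src).float().pow(2).mean()                 # toy loss
model_engine.backward(loss)
model_engine.step()
```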
So this suggests that the partnership goes beyond just the exchange of Microsoft's money and compute for OpenAI's trained models and ML brand strength (an exchange of cloud for clout, if you will) that we previously expected. Are the companies actually also deeply collaborating on ML and systems engineering research? I'd love to find out.
If so, this could be an early indication that Microsoft, which I'm sure is at least a little bit envious of Google's ownership of DeepMind, will eventually want to acquire OpenAI. And it could be a great fit. Looking at Microsoft's recent acquisition history, it has so far let GitHub (which it acquired two years ago) continue to operate largely autonomously. This makes it an attractive potential parent company for OpenAI: the lab probably wouldn't have to give up too much of its independence under Microsoft's stewardship. So unless OpenAI actually invents and monetizes some form of artificial general intelligence (AGI) in the next five to ten years (which I don't think it will), I wouldn't be surprised if it ends up becoming Microsoft's DeepMind.
*One big reason for not open-sourcing GPT-3's code and data is security; see my coverage of OpenAI's staged release strategy for GPT-2 (#8, #13, #22, #27).
Quick productized AI links
- Amsterdam (where I live!) and Helsinki (where I don't live) have launched their "AI algorithm registries." These are actually a pretty cool idea: whenever a municipality "utilizes algorithmic systems as part of [its] city services," those systems must be cataloged in the city's algorithm registry. Amsterdam's registry currently has three entries: (1) license plate-recognizing automated parking control cars, (2) a pilot for algorithm-assisted fraud surveillance for holiday home rentals, and (3) a natural language processing system for categorizing reports of trash in public space. These registries may become a good source of productized AI links for me, but more importantly, this is a great step toward building transparency, trust, and accountability into these systems.
Machine Learning Research
- 🖼 An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale is a paper under review for ICLR 2021 that's been making the rounds on Twitter. I found Yannic Kilcher's explainer video (which starts with a lovely rant about "double-blind" peer review) a good introduction to the model, which could be the start of Transformers overtaking convolutional networks at the very largest scales of computer vision.
- 👩🏾‍💻 Building on their previous three years of graduate school application mentoring programs, Black in AI has launched an Academic Positions program to support Black junior researchers getting started in "careers in academia, industry, and policy." The launch blog post includes details about the program, tips on how academics and organizations can support it, and lots of additional resources. This is a great link to amplify within your ML network!
- ⚡️ TensorSensor is a Python package that "clarifies" (visualizes) the dimensions of tensors in numpy, TensorFlow, or PyTorch. I recently had to reproduce a paper that wrote down its math in a simplified form that ignored the out-channel dimension of convolutional filters, and I spent a lot of time trying to get all my matrices to line up correctly with that extra dimension. This tool would've made that a lot easier (see the sketch after this list)! Also check out Terence Parr's introduction to TensorSensor.
- Cool new arXiv.org feature: implementations discovered by Papers with Code are now linked right on a paper's abstract page. I've always found it quite easy to find any available implementations with a few quick Google and GitHub searches, but integrations like this are great for discoverability.
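As promised above, here's a minimal TensorSensor sketch; the shapes are deliberately incompatible, and the whole thing is my own toy example rather than anything from the package's docs:

```python
# Wrap failing tensor code in tsensor.clarify(): the raised exception gets
# annotated with the shape of every operand in the offending expression.
# Install with: pip install tensor-sensor
import numpy as np
import tsensor

W = np.random.rand(64, 32)
x = np.random.rand(16)  # wrong length on purpose; should be 32

with tsensor.clarify():
    y = W @ x  # raises, and tsensor points out (64, 32) @ (16,) can't line up
```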
Artificial Intelligence for the Climate Crisis
- ⛅ Some recent progress on nowcasting the locations of clouds (predicting where they'll be over the next few hours): Berthomier et al. (2020) compared the effectiveness of several deep learning models for the task, and Jack Kelly of Open Climate Fix open-sourced a Python notebook that approaches the same problem using optical flow (see the sketch below). This work is a step toward improving predictions of solar panels' power output, an important task for grid operators as an increasing fraction of the electricity supply on their grids transitions to solar.
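Here's a minimal sketch of the optical-flow idea, my own simplification rather than Jack Kelly's actual notebook: estimate per-pixel motion between two consecutive satellite frames, then warp the latest frame forward under the assumption that the clouds keep moving the same way.

```python
# Optical-flow cloud nowcasting sketch: estimate motion between the two most
# recent satellite frames, then advect the latest frame one step forward.
import cv2
import numpy as np

def nowcast_next_frame(prev_frame: np.ndarray, curr_frame: np.ndarray) -> np.ndarray:
    """Extrapolate the next frame from two 8-bit grayscale satellite frames."""
    # Dense per-pixel motion field from prev_frame to curr_frame.
    flow = cv2.calcOpticalFlowFarneback(
        prev_frame, curr_frame, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
    )
    h, w = curr_frame.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Assume constant motion: the pixel at (x, y) in the next frame was at
    # (x - flow_x, y - flow_y) in the current one (backward warping).
    map_x = (grid_x - flow[..., 0]).astype(np.float32)
    map_y = (grid_y - flow[..., 1]).astype(np.float32)
    return cv2.remap(curr_frame, map_x, map_y, interpolation=cv2.INTER_LINEAR)

# Toy usage with random "frames" (real inputs would be satellite imagery):
prev = (np.random.rand(128, 128) * 255).astype(np.uint8)
curr = (np.random.rand(128, 128) * 255).astype(np.uint8)
pred = nowcast_next_frame(prev, curr)
```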
Cool Things ✨
- 🛴 I came across this short story by Janelle Shane when it premiered as a New York Times "Op-Ed From the Future" last year, but forgot to share it at the time. I rediscovered and reread it this week, and I still think it's delightful: We Shouldn't Bother the Feral Scooters of Central Park.
- 🎨 Imaginaire is NVIDIA's universal library for image and video synthesis, including algorithms such as SPADE (GauGAN), pix2pixHD, MUNIT, FUNIT, COCO-FUNIT, vid2vid, and few-shot vid2vid. Check out this demo video to see what it's capable of, from summer-to-winter transformations to automatically animating motion into pictures.
Thanks for reading! As usual, you can let me know what you thought of today's issue using the buttons below or by replying to this email. If you're new here, check out the Dynamically Typed archives or subscribe below to get a new issue in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It's by far the best thing you can do to help me grow this newsletter. 📧