Dynamically Typed

Google AI's ethics crisis

Google AI is in the middle of an ethics crisis. Timnit Gebru, the AI ethics researcher behind Gender Shades (see DT #42), Datasheets for Datasets (#41), and much more, got pushed out of the company after a series of conflicts. Karen Hao for MIT Technology Review:

A series of tweets, leaked emails, and media articles showed that Gebru’s exit was the culmination of a conflict over [a critical] paper she co-authored. Jeff Dean, the head of Google AI, told colleagues in an internal email (which he has since put online) that the paper “didn’t meet our bar for publication” and that Gebru had said she would resign unless Google met a number of conditions, which it was unwilling to meet. Gebru tweeted that she had asked to negotiate “a last date” for her employment after she got back from vacation. She was cut off from her corporate email account before her return.

See Casey Newton’s coverage on his Platformer newsletter for both Gebru’s and Jeff Dean’s emails (and here for his extended statement). This story unfolded over the past week and is probably far from over, but from everything I’ve read so far — which is a lot, hence this email hitting your inbox a bit later than usual — I think Google management made the wrong call here. Their statement on the matter focuses on missing references in Gebru’s paper, but as Google Brain Montreal researcher Nicolas Le Roux points out:

… [The] easiest way to discriminate is to make stringent rules, then to decide when and for whom to enforce them. My submissions were always checked for disclosure of sensitive material, never for the quality of the literature review.

This is echoed by a top comment on HackerNews. From Gebru’s email, it sounds like frustrations had been building up for some time, and that the lack of transparency surrounding the internal rejection of this paper was simply the final straw. I think it would’ve been more productive for management to start a dialog with Gebru here — forcing a retraction, “accepting her resignation” immediately and then cutting off her email only served to escalate the situation.

Gebru’s research on the biases of large (compute-intensive) vision and language models is much harder to do without the resources of a large company like Google. This is a problem that academic ethics researchers often run into; OpenAI’s Jack Clark, who gave feedback on Gebru’s paper, has also pointed this out. I always found it admirable that Google AI, as a research organization, had the intellectual space for voices like Gebru’s to critically investigate these things. It’s a shame that it was not able to sustain that environment.

In the end, besides the ethical issues, I think Google’s handling of this situation was also a big strategic misstep. 1500 Googlers and 2100 others have signed an open letter supporting Gebru. Researchers from UC Berkeley and the University of Washington said this will have “a chilling effect” on the field. Apple and Twitter are publicly poaching Google’s AI ethics researchers. Even mainstream outlets like The Washington Post and The New York Times have picked up the story. In the week leading up to NeurIPS and the Black in AI workshop there, is this a better outcome for Google AI than letting an internal researcher submit a conference paper critical of large language models?

Photoshop's Neural Filters

Light direction is one of many new AI-powered features in Photoshop; in the middle picture, the light source is on the left; in the right picture, it’s moved to the right.

Adobe’s latest Photoshop release is jam-packed with AI-powered features. The pitch, by product manager Pam Clark:

You already rely on artificial intelligence features in Photoshop to speed your work every day like Select Subject, Object Selection Tool, Content-Aware Fill, Curvature Pen Tool, many of the font features, and more. Our goal is to systematically replace time-intensive steps with smart, automated technology wherever possible. With the addition of these five major new breakthroughs, you can free yourself from the mundane, non-creative tasks and focus on what matters most – your creativity.

Adobe is branding the most exciting of these new features as Neural Filters: neural-network-powered image manipulations that are parameterized by sliders in the Photoshop UI. Some of them automate tasks that were previously very labor-intensive, while others enable changes that were previously impossible. Here are a few of both, with a rough sketch of the slider idea after the list:

  • Style transfer: apply one photo’s style to another, like the classic “make this look like a Picasso / Van Gogh / Monet.”
  • Smart portraits: subtly change a photo subject’s age, expression, gaze direction, pose, hair thickness, etc.
  • Colorize: infer colors for black-and-white photos based on their contents.
  • JPEG Artifacts Removal: smooth out the blocky artifacts that occur on patches of JPEG-compressed photos.
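
Adobe hasn’t published implementation details for these filters, but conceptually, a slider-parameterized neural filter can be as simple as blending a model’s output back into the original image by the slider’s strength. Here’s a minimal sketch of that idea; the `model` callable is a hypothetical stand-in for something like a colorization or style-transfer network, not anything from Adobe:

```python
import numpy as np

def apply_neural_filter(image, model, strength=0.5):
    """Blend a neural filter's output with the original image.

    `image` is an HxWx3 float array in [0, 1]; `model` is any callable
    mapping an image to a filtered image of the same shape (a hypothetical
    stand-in for e.g. a colorization or style-transfer network).
    `strength` corresponds to the slider position in the Photoshop UI.
    """
    filtered = model(image)
    blended = (1.0 - strength) * image + strength * filtered
    return np.clip(blended, 0.0, 1.0)
```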

These all run on-device and came out of a collaboration between Adobe Research and NVIDIA, implying they’re best suited to machines with beefy GPUs — not surprising. However, the blog post is a little vague about the specifics here (“performance is particularly fast on desktops and notebooks with graphics acceleration”), so I wonder whether Neural Filters is also optimized for any other AI accelerator chips that Adobe can’t mention yet. In particular, Apple recently showed off its new A14 chips, which feature a much faster Neural Engine. These chips launched in the latest iPhones and iPads but will also be in a new line of non-Intel “Apple Silicon” Macs, rumored to be announced next month — what are the chances that Apple will boast about the performance of Neural Filters on the Neural Engine during the presentation? I’d say pretty high. (Maybe worthy of a Ricky, even?)

Anyway, this Photoshop release is exactly the kind of productized AI that I started DT to cover: advanced machine learning models — that only a few years ago were just cool demos at conferences — wrapped up in intuitive UIs that fit into users’ existing workflows. It’s now just as easy to tweak the intensity of a smile or the direction of a gaze in a portrait photo as it is to manipulate its hue or brightness. That’s pretty amazing.

OpenAI and Microsoft: GPT-3 and beyond

OpenAI is exclusively licensing GPT-3 to Microsoft. What does this mean for their future relationship?

GPT-3 is OpenAI’s latest gargantuan language model (see DT #42) that’s uniquely capable of performing many different “text-in, text-out” tasks — demos range from imitating famous writers to generating code (#44) — without needing to be fine-tuned: its crazy scale makes it a few-shot learner.
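
To make “few-shot” concrete: instead of fine-tuning the model on a task, you prepend a handful of examples to the prompt and let GPT-3 continue the pattern. A hedged sketch using the OpenAI Python client as it looked around this time (the `davinci` engine name and parameter values are assumptions that may differ per account):

```python
import openai  # pip install openai

openai.api_key = "YOUR_API_KEY"

# Few-shot prompt: a couple of worked examples, then the input we care about.
prompt = (
    "English: Hello, how are you?\nFrench: Bonjour, comment ça va ?\n\n"
    "English: Where is the train station?\nFrench: Où est la gare ?\n\n"
    "English: I would like a coffee, please.\nFrench:"
)

response = openai.Completion.create(
    engine="davinci",   # GPT-3 via the API; engine names may vary
    prompt=prompt,
    max_tokens=40,
    temperature=0.3,
    stop="\n",
)
print(response.choices[0].text.strip())
```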

In July 2019, OpenAI announced it got a $1 billion investment from Microsoft. Back then, this raised some eyebrows in the (academic) machine learning community, which can sometimes be a bit allergic to the commercialization of AI (#19). The exact terms of the investment were never disclosed, but some key elements of the deal were. Tom Simonite for WIRED:

Most interesting bit of the OpenAI announcement: “we intend to license some of our pre-AGI technologies, with Microsoft becoming our preferred partner.”

Now, a year and a bit later, that’s exactly what happened. From the OpenAI blog:

In addition to offering GPT-3 and future models via the OpenAI API, and as part of a multiyear partnership announced last year, OpenAI has agreed to license GPT-3 to Microsoft for their own products and services.

What does that mean? Nick Statt for The Verge:

A Microsoft spokesperson tells The Verge that its exclusive license gives it unique access to the underlying code of GPT-3, which contains technical advancements it hopes to integrate into its products and services.

In their blog post, Microsoft pitches this as a way to “expand [their] Azure-powered AI platform in a way that democratizes AI technology,” to which the community again reacted negatively: if you want to democratize AI, why not just open-source GPT-3’s code and training data?* I agree that “democratizing” is a bit of a stretch, but I think there’s a much more interesting discussion to be had here than the one on a self-congratulatory word choice in a corporate press release. Perhaps ironically, that discussion also starts from overanalyzing another few words in that very same press release.

According to Microsoft’s blog post about the licensing deal, GPT-3 “is trained on Azure’s AI supercomputer.” I wonder if that means OpenAI is now using Microsoft’s open-source DeepSpeed library (#34) to train its GPT models. DeepSpeed is a library for distributed training of enormous ML models that has specific features to support training large Transformers; Microsoft Research claimed in May that it’s capable of training models with up to 170 billion parameters (#40). GPT-3 is a 175-billion-parameter Transformer that was released in June, just one month later. That seems unlikely to be a coincidence, and Microsoft’s latest DeepSpeed update (#49) even includes some experimental work using the GPT-3 architecture.
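
For a sense of what training with DeepSpeed looks like in practice, here’s a minimal sketch of wrapping a PyTorch model with `deepspeed.initialize` and a ZeRO config. This is just an illustration of the library’s public API on a toy model; it’s not how OpenAI actually trained GPT-3, and the exact way the config is passed differs between DeepSpeed versions:

```python
import torch
import deepspeed  # pip install deepspeed

# Toy model standing in for a large Transformer.
model = torch.nn.Transformer(d_model=512, nhead=8)

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO partitions optimizer state across GPUs
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model, optimizer, and distributed setup.
# (Newer versions accept the config dict directly; older ones read a JSON
# file passed through the `deepspeed` launcher's command-line args.)
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Training step (data loading omitted):
#   loss = compute_loss(model_engine(batch))
#   model_engine.backward(loss)
#   model_engine.step()
```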

So this suggests that the partnership goes beyond just the exchange of Microsoft’s money and compute for OpenAI’s trained models and ML brand strength (an exchange of cloud for clout, if you will) that we previously expected. Are the companies actually also deeply collaborating on ML and systems engineering research? I’d love to find out.

If so, this could be an early indication that Microsoft — who I’m sure is at least a little bit envious of Google’s ownership of DeepMind — will eventually want to acquire OpenAI. And it could be a great fit. Looking at Microsoft’s recent acquisition history, it has so far let GitHub (which it acquired two years ago) continue to operate largely autonomously. This makes it an attractive potential parent company for OpenAI: the lab probably wouldn’t have to give up too much of its independence under Microsoft’s stewardship. So unless OpenAI actually invents and monetizes some form of artificial general intelligence (AGI) in the next five to ten years — which I don’t think they will — I wouldn’t be surprised if they end up becoming Microsoft’s DeepMind.

* One big reason for not open-sourcing GPT-3’s code and data is security; see my coverage of OpenAI’s staged release strategy for GPT-2 (#8, #13, #22, #27).

Autonomous trucks will be the first big self-driving market

Autonomous trucking is where I think self-driving vehicle technology will have its first big impact, well before e.g. the taxi or ride-sharing industries. Long-distance highway truck driving — with hubs at city borders where human drivers take over — is a much simpler problem to solve than inner-city taxi driving. Beyond the obvious lower complexity of not having to deal with traffic lights, small streets and pedestrians, a specific highway route between two high-value hubs can also be mapped in high detail much more economically than an ever-changing city center could be. And, of course, self-driving trucks won’t have the 11-hour-per-day driving safety limit imposed on human drivers. Taken together, this all makes for quite an attractive pitch.

In recent news, Jennifer Smith at the Wall Street Journal reported that startup Ike Robotics has reservations for its first 1,000 heavy-duty autonomous trucks, from “transport operators Ryder System Inc., NFI Industries Inc. and the U.S. supply-chain arm of German logistics giant Deutsche Post AG.”

Tapping into big carriers’ logistics networks and operational expertise means Ike can focus on the technology piece—systems engineering, safety and technical challenges such as computer vision—said Chief Executive Alden Woodrow.

“They are going to help us make sure we build the right product, and we are going to help them prepare to adopt it and be successful,” said Mr. Woodrow, who worked on self-driving trucks at Uber Technologies Inc. before co-founding Ike in 2018.

Unlike rival startups, Ike wants to be a software-as-a-service provider of self-driving tech for existing logistics operators, rather than becoming an operator itself. It’ll be interesting to see how well this business model works out when competitors start offering a similar service — the biggest question is how easy or hard it’ll be for an operator to swap one self-driving SaaS out for another. If it’s easy, that’ll make for a very competitive space.

(On the disruption side: there are nearly 3 million truck drivers in the United States alone, so widespread automation here can be quite impactful. Until today, I thought trucking was the biggest profession in most US states because of this 2015 NPR article, but apparently that was based on wrongly interpreted statistics; the most common job is in retail — no surprise there. Nonetheless, trucking is currently a major profession. A decade from now it may no longer be.)

The deepfake detection rat race

Microsoft is launching Video Authenticator, an app that helps organizations “involved in the democratic process” detect deepfakes — videos that make people look like they’re saying things they’ve never said by superimposing automatically generated voice tracks and face movements over real videos. Deepfakes are usually made using generative adversarial networks (GANs) like those in Samsung AI’s neural avatars project (see DT #15) and in the popular open-source DeepFaceLab app.

Because of all the obvious ways in which deepfakes can be abused, this has been a popular research area for technology platform companies: a bit over a year ago, Facebook launched their deepfake detection challenge and Google contributed to TU Munich’s FaceForensics benchmark (#23). Microsoft has now productized these research efforts with Video Authenticator. The app checks photos and videos for the “subtle fading or greyscale elements” that may occur at a deepfake’s blending boundary — where the fake facial movements mix in with the real background media — and gives users a confidence score for whether a face is manipulated. This happens in real-time and frame-by-frame for videos, which I imagine will be particularly useful for detecting subtle fakery, like a mostly-real video with a few small tweaks that change its message.
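
Microsoft hasn’t published Video Authenticator’s internals, but the general frame-by-frame recipe it describes is straightforward: detect faces in each frame, crop the regions where blending-boundary artifacts would show up, and score each crop with a real-vs-fake classifier. A rough sketch under those assumptions; both `face_detector` and `fake_classifier` below are hypothetical stand-ins, not Microsoft’s models:

```python
import cv2  # pip install opencv-python

def score_video(path, face_detector, fake_classifier):
    """Yield a per-frame manipulation confidence for a video file.

    `face_detector(frame)` should return a list of (x, y, w, h) face boxes;
    `fake_classifier(crop)` should return a probability that the face crop
    is manipulated. Both are hypothetical stand-ins for trained models.
    """
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for (x, y, w, h) in face_detector(frame):
            crop = frame[y:y + h, x:x + w]
            # A detector like Video Authenticator would key on blending-
            # boundary artifacts (subtle fading or greyscale elements) here.
            yield fake_classifier(crop)
    cap.release()
```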

Video Authenticator initially won’t be made publicly available. Instead, Microsoft is privately distributing it to news outlets, political campaigns, and media companies through the AI Foundation’s Reality Defender 2020 program, “which will guide organizations through the limitations and ethical considerations inherent in any deepfake detection technology.” This makes sense, since deepfakes represent a typical cat-and-mouse AI security game — new models will surely be trained specifically to fool Video Authenticator, which this limited release approach attempts to slow down.

I’d be interested to learn how organizations integrate Video Authenticator into their existing workflows for validating the veracity of newsworthy videos. I haven’t really come across any examples of big-name news organizations getting fooled by deepfakes yet, but I imagine such fakery is much more common on social media, where videos aren’t vetted by journalists before being shared.