#15: Neural avatars, AI on the edge, and Apple's new Create ML app
Edge intelligence is the idea of doing machine learning locally on a device instead of on a server. Take the task of searching images by the objects in them, like finding all your photos of dogs. If you use Google Photos, your pictures are uploaded to the cloud, where Google’s servers process them and tag them with labels like “dog” or “cat.” If you use Apple’s Photos app, on the other hand, your phone does this processing and tagging itself (usually at night, while it’s plugged in). Apple’s implementation is an example of edge intelligence (sketched in code below), and it has several advantages:
- The whole process can work offline, without an internet connection.
- Apple never gets to see your (unencrypted) images, which is great for privacy.
- Apple saves on server costs, offloading the expensive machine learning computations and electricity bill to your iPhone.
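To make this concrete, here’s a minimal sketch of what on-device image tagging looks like in code. Apple’s actual models are proprietary, so this uses a pretrained MobileNetV2 from torchvision (a small network designed for mobile-class hardware) as a stand-in; the file name is hypothetical:

```python
# Minimal sketch of edge intelligence: tagging a photo entirely on-device,
# with no server round-trip. MobileNetV2 stands in for Apple's proprietary
# models; "dog.jpg" is a hypothetical local file.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.mobilenet_v2(pretrained=True)  # fetched once; inference then runs offline
model.eval()

def tag_photo(path):
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)  # batch of one
    with torch.no_grad():
        logits = model(image)
    return logits.argmax(dim=1).item()  # index into the 1,000 ImageNet classes

print(tag_photo("dog.jpg"))  # the photo and the prediction never leave the device
```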
This week, there were quite a few announcements around edge intelligence: on the product side, Apple launched new versions of Create ML and Core ML at WWDC, and Nvidia launched its EGX Platform; on the research side, there was the MicroNet Challenge, SpArSe, and EfficientNet.
Other news includes Samsung’s neural avatars, Wadhwani AI’s pest detection project, and an amazing tool to colorize black-and-white photos. Finally, I’ve launched a brand new version of my personal website! Let’s dive in.
Productized Artificial Intelligence 🔌
The new Create ML app. (Apple)
Apple launched a new version of Create ML at WWDC, its worldwide developers conference:
The new, user-friendly Create ML app lets you build, train, and deploy machine learning models with no machine learning expertise required. Create ML lets you view model creation workflows in real time. You can even train models from Apple with your custom data, with no need for a dedicated server.
Create ML has built-in support for training machine learning models for object detection, activity and sound classification, and recommender systems. App developers can take these trained models and deploy them inside their apps using Apple’s Core ML APIs, which can now also train models on-device, enabling personalized machine learning models for iOS app users. More (plus a quick conversion sketch after the links):
- Apple Developer site: Training Models with Create ML
- Khari Johnson for VentureBeat: Apple debuts Core ML 3 with on-device machine learning
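Create ML itself is a macOS app and Swift framework, but one adjacent piece of the Core ML workflow is easy to sketch in Python: coremltools, Apple’s open-source conversion library, turns an existing trained model into an .mlmodel file that an app can ship. A minimal sketch, assuming a saved Keras image classifier; the file and class names are hypothetical:

```python
# Sketch: converting a trained Keras model to Core ML format so an iOS app
# can run it on-device. The model file and class labels are hypothetical.
import coremltools

mlmodel = coremltools.converters.keras.convert(
    "pet_classifier.h5",          # hypothetical saved Keras model
    input_names="image",
    image_input_names="image",    # treat the input as an image, not a raw array
    class_labels=["cat", "dog"],  # hypothetical labels
)
mlmodel.short_description = "On-device pet classifier"
mlmodel.save("PetClassifier.mlmodel")  # drag this file into an Xcode project
```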
Related, Nvidia launched its own edge intelligence platform, EGX. These new low-cost, low-power chips can be deployed “from factory floors to store aisles” to analyze data without having to send it to a central server farm:
EGX starts with the tiny NVIDIA® Jetson Nano™, which in a few watts can provide one-half trillion operations per second (TOPS) of processing for tasks such as image recognition. And it spans all the way to a full rack of NVIDIA T4 servers, delivering more than 10,000 TOPS to serve hundreds of users for real-time speech recognition and other complex AI experiences.
I’m super excited about the future that chips like this will enable: for example, imagine hundreds of low-cost video cameras deployed throughout an area to detect anomalies (fires, leaks, break-ins, etc.), with the advantages of preserving privacy (the raw video never has to be centrally stored) and needing only a weak network connection (the device can just send an “I detected this!” signal instead of streaming full video for server-side analysis). The sketch after the links below shows this pattern. More:
- Dean Takahashi for VentureBeat: Nvidia EGX takes AI computing to the edge of the network
- Nvidia product page: Real-Time AI at the Edge
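Here’s a toy sketch of that “send events, not video” pattern. Everything in it (the detector, the threshold, the endpoint, and the JSON schema) is hypothetical, standing in for a model running on something like a Jetson Nano:

```python
# Toy sketch of the "send an event, not the video" pattern for edge devices.
# detect_anomaly() stands in for an on-device model; the endpoint URL and
# JSON schema are made up for illustration.
import json
import time
import urllib.request

ENDPOINT = "https://example.com/events"  # hypothetical central server

def detect_anomaly(frame):
    """Placeholder for an on-device model; returns (label, confidence)."""
    return None, 0.0

def monitor(camera):
    for frame in camera:  # raw frames stay on the device
        label, confidence = detect_anomaly(frame)
        if label is not None and confidence > 0.9:
            event = {"label": label, "confidence": confidence, "time": time.time()}
            request = urllib.request.Request(
                ENDPOINT,
                data=json.dumps(event).encode(),  # a few bytes, not a video stream
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(request)
```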
Artificial Intelligence for the Climate Crisis 🌍
_Rajesh Jain taking a photo of a pest trap. (Photo/caption: Google.org)_
Indian organization Wadhwani AI is using machine learning to identify crop pests from photos. According to senior director Rajesh Jain, they’ve already developed and tested algorithms for two major types of pests in India. The organization has now received a grant from Google.org’s AI Impact Challenge (see DT #13 for other winners) to help tackle the challenges it faces in deploying its tech more broadly:
We have the technology, but scaling it and making it accessible to farmers will be difficult. There are differences in literacy, diversity, cultural differences, climate differences. We need to scale our solution to address all the challenges.
How does this relate to the climate crisis? Increases in temperature and CO2 levels have been shown to cause population booms in several types of harmful insects. Projects like this, which mitigate the impact of climate change on society, will become increasingly important as the earth continues to warm. Read more about Wadhwani AI’s work in Jen Harvey’s post for The Keyword: How AI could tackle a problem shared by a billion people.
Machine Learning Technology 🎛
Neural avatars; the face shape of the Target is projected onto the Source frame to create the Result. (Zakharov et al.)
Samsung AI researchers have developed few-shot neural avatar technology, which can animate a face from just one or a few reference video frames, driving it with the pose and expressions of another face. It represents the next step in using generative adversarial networks (GANs) to generate realistic-looking faces (see DT #4), this time combined with meta-learning on a large video dataset.
Zakharov et al. say their work is meant for telepresence applications, but their paper has raised some concerns. Many people fear that this technology can be used to more easily create deepfakes, realistic-looking videos of people saying and doing things they never said or did. In the description of their demo video, the authors contend that the democratization of things like video recording, audio editing and special effects had an overall positive impact on the world, and that this will also be the case for neural avatar technology. I do agree with them; the pros and cons are similar to the debate around open-sourcing OpenAI’s GPT-2 (see DT #8). (Notably, however, there’s no ethics section in the actual paper.) More:
- Devin Coldewey for TechCrunch: Mona Lisa frown: Machine learning brings old paintings and photos to life
- Zakharov et al. on arXiv: Few-Shot Adversarial Learning of Realistic Neural Talking Head Models
- Related, here’s another impressive new GAN from DeepMind, this one focused on generating more diverse outputs. On arXiv: Generating Diverse High-Fidelity Images with VQ-VAE-2
Edge intelligence research has heated up in the past few weeks. Here are a few recent announcements around making neural networks much smaller:
There will be a MicroNet Challenge at NeurIPS 2019, where contestants compete on three tasks: ImageNet classification, CIFAR-100 classification, and WikiText-103 language modeling. The goal is to reach a baseline level of quality on each task using as few bytes of model storage and as few math operations per prediction as possible (the kind of accounting sketched below). This is an important step in pushing these commonly used neural networks to the edge. More: MicroNet Challenge.
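To get a feel for the storage half of that score, here’s a rough Python sketch of counting a model’s parameters and their size in bytes, using torchvision’s ResNet-50 as a stand-in; the challenge’s official scoring rules are more involved than this:

```python
# Rough sketch of the storage half of a MicroNet-style score: how many
# bytes does a model take to store? Uses torchvision's ResNet-50 as an
# example; the challenge's official scoring rules are more involved.
from torchvision import models

model = models.resnet50(pretrained=False)
num_params = sum(p.numel() for p in model.parameters())
size_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters, {size_bytes / 1e6:.1f} MB at float32")
# Pruning weights or quantizing them below 32 bits shrinks this number,
# which is exactly what entrants will compete on.
```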
Google AI came up with a new CNN scaling method based on a “compound coefficient,” which researchers used to create EfficientNets. These networks “surpass state-of-the-art accuracy with up to 10x better efficiency (smaller and faster).” ICML 2019 paper on arXiv by Tan and Le: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks.
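The core idea fits in a few lines: a single coefficient φ scales network depth, width, and input resolution together, using constants α, β, γ that the paper found by grid search. A minimal sketch of that rule (the released EfficientNet models round these multipliers into concrete architectures):

```python
# Sketch of EfficientNet's compound scaling rule (Tan & Le, 2019): one
# coefficient phi scales depth, width, and input resolution together,
# using the alpha/beta/gamma constants the paper found by grid search
# (chosen so that alpha * beta^2 * gamma^2 is roughly 2).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    depth_mult = ALPHA ** phi       # more layers
    width_mult = BETA ** phi        # more channels per layer
    resolution_mult = GAMMA ** phi  # larger input images
    return depth_mult, width_mult, resolution_mult

for phi in range(4):  # roughly EfficientNet-B0 through B3
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```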
Arm Research’s SpArSe looks like it could compete in this challenge: the architecture search method can design convolutional neural networks (CNNs) “which generalize well, while also being small enough to fit onto memory-limited [microcontrollers].” Paper on arXiv by Fedorov et al.: SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers.
Quick ML resource links ⚡️ (see all 16)
- Microsoft’s TensorWatch embeds real-time visualizations for deep learning and reinforcement learning models in Jupyter notebooks, plus a bunch more. Link: microsoft/tensorwatch
- Google’s TensorNetwork is an open-source library for efficient tensor calculations. Link: google/tensornetwork
- Google launched Dataset Search, a companion to Google Scholar for finding datasets along with their metadata. Links: blog post, Dataset Search
Cool Things ✨
‘Migrant Mother’ by Dorothea Lange (left), colorized by a baseline algorithm (center) and by DeOldify (right). (Jason Antic, fast.ai)
DeOldify is Jason Antic’s tool for colorizing black-and-white photos. It’s based on a self-attention generative adversarial network (SAGAN) built with fast.ai on top of PyTorch, impressively trained on just “a Linux box in a dining room … using only a single consumer GPU.” Together with collaborators Jeremy Howard and Uri Manor, Antic has now also extended the method to colorizing videos, using a different type of GAN to reduce flicker. More (plus a quick usage sketch after the links):
- Antic, Howard, and Manor’s deep dive for the fast.ai blog: Decrappification, DeOldification, and Super Resolution
- Try DeOldify yourself online in this Google Colab notebook: ImageColorizerColab.ipynb
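If you’d rather run it locally, usage is only a few lines. A sketch based on the project’s Colab notebook, assuming you’ve cloned the repo and downloaded the pretrained weights; the image path is hypothetical:

```python
# Sketch of colorizing a photo with DeOldify, based on the project's Colab
# notebook. Assumes the repo is cloned and the pretrained weights are
# downloaded (see github.com/jantic/DeOldify); the image path is hypothetical.
from deoldify.visualize import get_image_colorizer

colorizer = get_image_colorizer(artistic=True)  # the "artistic" model gives more vibrant results
colorizer.plot_transformed_image(
    "test_images/migrant_mother.jpg",  # hypothetical local path
    render_factor=35,  # higher values give more saturated color at more compute cost
)
```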
My Stuff 😁
My brand-new website!
I’ve launched a new version of my personal website! I built the first version four years ago back in high school, in the form of a custom content management system built on Python (2 😅), jinja2, and Google App Engine. It became pretty cumbersome to keep up to date, so I decided to tear it down and start from scratch.
I built leonoverweel.com 2.0 in Hugo, which means it’s now statically generated and all my content is stored in Markdown files. I can now add or change these simple text files on GitHub from anywhere (🙌 Working Copy), and once I commit and push changes, they get built and deployed in a few seconds by Netlify. I’ll write up a more detailed post about the site soon, but for now you can check it out at the link below!
My brand-new website: ✨ leonoverweel.com ✨
Thanks for reading! As usual, you can let me know what you thought of today’s issue using the buttons below or by replying to this email. If you’re new here, check out the Dynamically Typed archives or subscribe below to get a new issue in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It’s by far the best thing you can do to help me grow this newsletter. 😁