#40: Pinterest's ML for board organization, GAN-aided pixel art, and Bayesian optimization gets the Distill treatment
Hey everyone, welcome to Dynamically Typed #40! Today in productized AI I’m covering Pinterest’s new ML-based organization feature, and I have links to a meeting notes transcription tool and 3D photos on Facebook and the iPhone SE. On the ML research side, I wrote about Distill’s new Bayesian optimization paper, and I have links to several new model visualization and training tools. For climate AI there’s Google’s commitment to stop making AI tools for oil and gas extraction, as well as an update on Open Climate Fix. Finally, for cool things I found pixel-me, a site that generates 8-bit style portraits based on photos, and a cool demo of self-learning digital squids. Let’s dive in!
Productized Artificial Intelligence 🔌
Pinterest’s UX flow for ML-based grouping within boards. (Pinterest Engineering Blog.)
Pinterest has added new AI-powered functionality for grouping images and other pins on a board. The social media platform is mostly centered around finding images and collecting (pinning) them on boards. After working on a board for a while, though, some users may pin so much that they no longer see the forest for the trees. That’s where this new feature comes in:
For example, maybe a Pinner is new to cooking but has been saving hundreds of recipe Pins. With this new tool, Pinterest may suggest board sections like “veggie meals” and “appetizers” to help the Pinner organize their board into a more actionable meal plan.
Here’s how it works:
- When a user views a board that has a potential grouping, a suggestion pops up showing the suggested group and a few sample pins.
- If the user taps it, the suggestion expands into a view with all the suggested pins, where she can deselect any pins she does not want to add to the group. (Which I’m sure is very valuable training data!)
- The user can edit the name for the section, and then it gets added to her board.
Coming up with potential groupings is a three-step process. First, a graph convolutional network called PinSage computes an embedding based on text associated with the pin, visual features extracted from the image, and the graph structure. Then the Ward clustering algorithm (chosen because it does not require a predefined number of clusters) generates potential groups. Finally, a filtered count of common annotations for pins in the group decides the proposed group name.
Pinterest has really been on a roll lately with adding AI-powered features to its apps, including visual search (DT #23) and AR try-on for shopping (DT #33). This post by Dana Yakoobinsky and Dafang He on the company’s engineering blog has the full details on their implementation of this latest feature, as well as some future plans to expand it.
Quick productized AI links 🔌
- 🦦 Otter.ai auto-generates “rich notes for meetings, interviews, lectures, and other important conversations.” This looks like a fun product, and apparently it’s being integrated into Zoom. (Unrelated: the otter emoji may be the purest emoji in existence.)
- 📱 Ben Sandofsky of iOS camera app Halide wrote a deep dive on how the new iPhone SE, which has only one rear-facing camera, uses single-image monocular depth estimation to do fake background blur in portrait mode photos. I have some experience with this exact computer vision task, and the results achieved here by Apple—on-device!—look very impressive to me.
- 📸 Related: Facebook’s 3D photos feature now simulates depth for any image, using techniques very similar to what the iPhone SE is doing.
Machine Learning Research 🎛
Bayesian optimization of finding gold along a line, using the probability of improvement (PI) acquisition function. (Agnihotri and Batra, 2020.)
Apoorv Agnihotri and Nipun Batra wrote an article Exploring Bayesian Optimization for Distill. This technique is used in hyperparameter optimization, where evaluating any one point—like the combination of a learning rate, a weight decay factor, and a data augmentation setting—is expensive: you need to train your entire model to know how well the hyperparameters performed.
This is where Bayesian optimization comes in. It centers around answering the question “Based on what we know so far, what point should we evaluate next?” The process uses acquisition functions to trade off exploitation (looking at points in the hyperparameter space that we think are likely to be good) with exploration (looking at points we’re very uncertain about). Given an appropriate acquisition function and priors, it can help find a good point in the space in surprisingly few iterations.
Bayesian optimization was one of the tougher subjects to wrap my head around in graduate school, so I was very excited to see it get the Distill treatment. Agnihotri and Batra explain the process through an analogy of picking the best places to dig for gold which, incidentally, was also one of its first real-world applications in the 1950s! You can read the full explainer here; also check out DragonFly and BoTorch, two tools for automated Bayesian optimization from my ML resources list.
Quick ML research + resource links 🎛 (see all 63 resources)
- ⚡️ Google’s People + AI Research (PAIR) group has a set of open-source tools and platforms “that make ML models more understandable, trustworthy, and fair,” including several model visualization and feature attribution projects.
- 🏎 Microsoft released the second version of DeepSpeed and its Zero Redundancy Optimizer (ZeRO-2, see DT #34). These improvements enable training models that are an order of magnitude larger and faster than previously possible: up to 170 billion parameters, at up to 10x previous state-of-the-art speeds. It’s open-source on GitHub at microsoft/DeepSpeed.
- 📣 Google also proposed a new optimization technique: speeding up neural network training with data echoing. It’s quite simple: while one bottlenecked part of the training pipeline is getting the next input ready, the current input gets “echoed” through the rest of the model graph, reducing training time while preserving predictive performance. This is cool work, and hopefully it’ll get upstreamed to TensorFlow for everyone to use.
Artificial Intelligence for the Climate Crisis 🌍
Just some quick links today.
Quick climate AI links 🌍
- 🛢 After tech companies came under fire a few months ago for their work with oil and gas companies, leading to the #techwontdrill it pledge (see DT #33), Google has now committed that it will no longer enter into new agreements to “build custom AI/ML algorithms to facilitate upstream extraction in the oil and gas industry.” This information comes from a new Greenpeace report which I’ll cover in more detail in a future edition of DT.
- 🌞 Flo Wirtz of Open Climate Fix wrote an update on their solar PV nowcasting (how much electricity will be produced in the next few hours?) and mapping (where are all solar panels located?) projects. Both projects hit quite a few proof points, including data acquisition and external validation; they also have a new co-funding award from the European Space Agency.
Cool Things ✨
Pixel-me at graduation.
Japanese developer Sato released pixel-me.tokyo, a website that generates an 8-bit style portrait based on user-submitted photos. It works both on human faces and pets. Sato also previously made AI Gahaku (“AI master painter”), which creates portraits in the style of classical painters. He built both projects using pix2pix, a conditional generative adversarial network (cGAN) that’s commonly used for AI art projects. Google developer advocate Kaz Sato (“similar names, but we are not the same person”) wrote up a nice piece for the Google Cloud blog about Sato’s process the projects.
Quick cool things links ✨
- 🦑 Job Talle used spiking neural networks and neuroevolution to create digital squids that learn how to swim. Check out the demo on his site; let it “warp” to about generation #1000 and then watch how the different squids learned (and failed to learn) to swim in different ways. I always think demos like this would be super cool to stylize and project on my wall as an ever-changing piece of art. One day.
Thanks for reading! As usual, you can let me know what you thought of today’s issue using the buttons below or by replying to this email. If you’re new here, check out the Dynamically Typed archives or subscribe below to get a new issues in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It’s by far the best thing you can do to help me grow this newsletter. 🌬