#49: Movement in autonomous trucking, Microsoft's DeepSpeed update, and Transformers as Graph Neural Networks
Hey everyone, welcome to Dynamically Typed #49! Today’s productized AI feature story is on autonomous trucks, which is where I think self-driving vehicle tech will have its first big impact; I’ve also got quick links to Facebook’s survey paper on AI systems for social network safety; and an ECCV workshop on fair facial recognition systems (spoiler: they’re still not fair).
For ML research, there’s an update to Microsoft’s DeepSpeed training library; an essay that frames Transformers as Graph Neural Networks; a paper on how researcher experience impacts model performance; and the release of the NumPy paper! Finally, for climate change AI I found a tree planting location recommender system for reforestation; and for cool stuff there’s a new bioinformatics paper that can reconstruct images of what your eyes are seeing based on a scan of your brain.
Productized Artificial Intelligence 🔌
Autonomous trucking is where I think self-driving vehicle technology will have its first big impact , much before e.g. the taxi or ride sharing industries. Long-distance highway truck driving — with hubs at city borders where human drivers take over — is a much simpler problem to solve than inner-city taxi driving. Beyond the obvious lower complexity of not having to deal with traffic lights, small streets and pedestrians, a specific highway route between two high-value hubs can also be mapped in high detail much more economically than an ever-changing city center could. And, of course, self-driving trucks won’t have the 11-hour-per-day driving safety limit imposed on human drivers. This all makes for quite an attractive pitch when taken together.
In recent news, Jennifer Smith at the Wall Street Journal reported that startup Ike Robotics has reservations for its first 1,000 heavy-duty autonomous trucks, from “transport operators Ryder System Inc., NFI Industries Inc. and the U.S. supply-chain arm of German logistics giant Deutsche Post AG.”
Tapping into big carriers’ logistics networks and operational expertise means Ike can focus on the technology piece—systems engineering, safety and technical challenges such as computer vision—said Chief Executive Alden Woodrow.
“They are going to help us make sure we build the right product, and we are going to help them prepare to adopt it and be successful,” said Mr. Woodrow, who worked on self-driving trucks at Uber Technologies Inc. before co-founding Ike in 2018.
Unlike rival startups, Ike wants to be a software-as-a-service provider of self-driving tech for existing logistics operators, instead of becoming one themselves. It’ll be interesting to see how well this business model works out when competitors start offering a similar service — the biggest question is how easy or hard it’ll be for an operator to swap one self-driving SaaS out for another. If it’s easy, that’ll make for a very competitive space.
(On the disruption side: there are nearly 3 million truck drivers in the United States alone, so widespread automation here can be quite impactful. Until today, I thought trucking was the biggest profession in most US states because of this 2015 NPR article, but apparently that was based on wrongly interpreted statistics; the most common job is in retail — no surprise there. Nonetheless, trucking is currently a major profession. A decade from now it may no longer be.)
Quick productized AI links 🔌
- 🛡Facebook is increasingly talking publicly about the work it does to keep its platform safe, probably at least partially in response to the constant stream of news about its failures in this area (from Myanmar to Plandemic). This does mean we get to learn a lot about the systems that Facebook AI Research (FAIR) is building to stop viral hoaxes before they spread too widely. Examples include the recent inside look into their AI Red Team (DT #47); their Web-Enabled Simulations (WES, #38) and Tempotal Interaction Embeddings (TIES, #34) for detecting bots on Facebook; and their DeepFake detection dataset (#23). Now, Halevy et al. (2020) have published an extensive survey on their work preserving integrity in online social networks, in which they “highlight the techniques that have been proven useful in practice and that deserve additional attention from the academic community.” It covers many of the aforementioned topics, plus a lot more.
- 👨🦱 Your weekly reminder that anyone who tries to sell you a facial recognition system without any age, gender, racial or accessory biases, probably does not actually have such a system to sell to you. From the 1800 submissions to the FairFace Challenge at ECCV 2020, Sixta et al. (2020) found that: “[the] top-10 teams [show] higher false positive rates (and lower false negative rates) for females with dark skin tone as well as the potential of eyeglasses and young age to increase the false positive rates too.” I really hope that everyone deploying these systems widely is aware of this and the potential consequences.
Machine Learning Research 🎛
- 🏎 Microsoft has updated DeepSpeed, its open-source library for efficiently training massive ML models (see DT #34, #40), with four big improvements: 3D parallelism for training trillion-parameter models; ZeRO-Offload for 10x bigger model training on a single GPU; Sparse Attention kernels for 10x longer input sequences in Transformers; and 1-bit Adam for reducing network load in multi-GPU training. My work focuses on tiny models rather than large ones, so I haven’t gotten a chance to try DeepSpeed, but if any of you have, I’d love to hear about your experience!
- 🌐 Chaitanya K. Joshi wrote an essay for The Gradient where he argues that Transformers are Graph Neural Networks, equating the former’s attention mechanism to the latter’s aggregation functions. It’s a great introduction to both model types, and Joshi poses that these two subfields of machine learning can learn a lot from each other. (Also, he represents nodes in a GNN using emojis instead of letters, and references them as such in the text, which I love.) Great weekend read.
- 👩🔬 We’ve seen “tuning hyperparameters without grad students” with Dragonfly (DT #11) but… how much does a researcher’s experience actually correlate with their skills for tuning an ML model? Anand et al, (2020) investigated this and found a strong positive correlation between experience and final model accuracy, and “that an experienced participant finds better solutions using fewer resources on average.” Glad to see my skills aren’t yet completely automatable yet! (The paper is co-authored by Jan van Gemert, who was the first person to explain to me what a convolution is, in a guest lecture during my first year of undergrad. 😊)
- 🧮 The NumPy paper is out! It’s a Nature article by Harris et al. (2020). Is this going to break records in citation counts, as pretty much every machine learning paper should probably be referencing it? Either way, I’ve updated my repo of BibTex citations for Python packages to add it.
I’ve also collected all 70+ ML research tools previously featured in Dynamically Typed on a Notion page for quick reference. ⚡️
Artificial Intelligence for the Climate Crisis 🌍
- 🌳 Rana and Vashney (2020) developed ePSA (e-Plantation Site Assistant), a recommender system that “combines physics-based/traditional forestry science knowledge with machine learning” to find the optimal spots to plant trees when reforesting an area. The ML portion is an XGBoost decision tree ensemble to calculate deforestation probabilities based on historical data. After testing the system around northern India, they found that “in 26 out of 30 locations, forest officials found the recommendations of the app very useful for selecting the right site for planting trees.”
Cool Things ✨
Images shown to subjects (first column) vs. images reconstructed purely from a brain scan (third column, red). (Gaziv et al., 2020)
Today in cool stuff that’s much further along than I thought it was: Gaziv et al. (2020) developed a self-supervised approach to fMRI-too-image reconstruction — generating pictures of what your eyes are seeing from a brain scan. Their main contribution is this self-supervised aspect: they pretrain their decoder on unlabeled ImageNet data, of which there is much more available than the fMRI-image pairs that previous work exclusively used.
Their results are good enough for a human to recognize e.g. a glass of beer or an animal (see above), which were reconstructed from fMRIs without having been in either the pretraining or fMRI datasets! Check out Figures 1 and 2 in the PDF on bioRxiv for more examples.
Thanks for reading! As usual, you can let me know what you thought of today’s issue using the buttons below or by replying to this email. If you’re new here, check out the Dynamically Typed archives or subscribe below to get a new issues in your inbox every second Sunday.
If you enjoyed this issue of Dynamically Typed, why not forward it to a friend? It’s by far the best thing you can do to help me grow this newsletter. 📓