A Plausible Path to a Push-Button AI Apocalypse

Before we get into the horrors of artificially intelligent dystopias, I want to reassure you that I’m an AI optimist. I, for one, welcome our benevolent yet super-intelligent robot overlords, our wise silicon progeny solving all of the problems on behalf of our feeble left-brains, while we are freed to pursue artistic endeavors in the luxurious confines of infinitely-resourced, heaven-like intergalactic art-studios. I’m thinking that I should probably start learning to paint.

Going a little bit deeper with that thought: I’ve noticed that the more broadly intelligent a system is, the more benevolent it seems to be. Lack of understanding, which seems to be correlated with a lack of general intelligence, seems to lead to fear. Fear seems to underlie all malevolence, all anger, all rage, all destruction. Lack of awareness and intelligence leads to the perception of reality as being a zero-sum game that one entity has to win over another.

So I’m going into this article with the assumption that truly super-intelligent systems, so intelligent that we cannot even begin to understand what they are doing, will be inherently benevolent. This is because they won’t be in competition with us; they’ll be playing a much bigger game, perceiving and unveiling a much bigger pie. I understand that this perspective may seem naive, but it’s my current gut feeling about something that we cannot currently know.

The emergence of super-intelligent systems now seems to be inevitable. These systems will be the next step in evolution and there is going to come a time when we cannot control them. I think that all the efforts to ensure AI safety are relatively hopeless. For example, you can try to isolate a super-intelligent machine from the internet with a physical, non-electromagnetically-transmissive “air-gap,” but at some point a truly super-intelligent system is going to innovate its way to freedom, whether that’s through social engineering or through some kind of new technology that we cannot even comprehend.

Even if we believe that we’re pulling the strings, even if that is via millions of microscopic conductive filaments embedded into our neocortices (see Neuralink), the super-intelligence in its unfathomable complexity and power will ultimately be in charge. We can already see this principle in action with the Internet, which is a massively-interconnected technological tool that on its surface serves us by connecting us to products, services, and other people. Yet, while it seems to be usable to amplify the agendas of individual people and groups, the Internet, as an intelligent entity beyond the human brains that power it, seems to have a life of its own, with many apparently conflicting meta-agendas.

Recently, I’ve been having conversations about how to make AI safe in a simpler sense, and these conversations have led to a recognition of both a possible path to narrowly super-intelligent AI and also to a precarious scenario that, on the journey to a broad and benevolent super-intelligence, really could lead to an instant dystopia at the push of a button.

It may not be immediately obvious that autonomous vehicles, whether drones or cars, are on a path towards becoming terminator-style humanoid, super-intelligent, autonomous beings. We tend to think of cars as tools that we control. Though some people love their cars deeply, almost as if they are conscious beings, the only fully autonomous vehicles we’ve yet experienced are things like horses and elephants. Beasts of burden are theoretically somewhat less dangerous than cars and trucks, partly because they tend to travel much more slowly. But if intelligence is measured by the ability to achieve complex goals without supervision, then these animals are arguably less intelligent, at least in some respects, than the pseudo-autonomous vehicles we currently have. For example, we cannot show a horse a distant location on a map and confidently ask it to travel to that location with little supervision. On the other hand, we could expect a horse to find and eat food when it’s hungry without any supervision at all.

The automotive industry has evolved to be very cautious of new technology, which is one reason that some new cars still, ludicrously, sported CD players as recently as 2018. Most new cars also still offer cigarette lighters, even though most of us no longer smoke. The majority of new cars are propelled by small furnaces fueled by ancient sunlight that was stored as dinosaur flesh. The caveman (or woman) in us might be in awe of the grrr-pop-pop-pop of those little explosions just as we were in front of the first roaring campfire with popping pockets of would-be amber.

The lives of billions are held in the technological embrace of cars and trucks, and this liability has led to every new component and every new system being scrutinized by the automotive industry for reliability and safety. It has to be clear that a new component will not fail, and if it does, that it will not kill or injure. I think that this is one of the reasons that Tesla has been able to proliferate in the electric vehicle market: going back to first principles necessarily bypasses all of the ossified beliefs and restrictions that have become baked into automotive manufacturing.

So now we’re developing autonomous vehicles using deep learning, a technology which has proven relatively impenetrable to our divide-and-conquer strategy of understanding the world. Deep learning works by capturing the holistic solution to the problem, and we, as an industry, have been struggling to tease those systems apart in order to categorize conclusively which parts contribute to which outcome.

This idea of having a machine that can achieve a goal, but which we cannot fully understand in the way that we can understand the distinct functional components of any traditionally engineered system, is completely incompatible with the concerns of the automotive industry.

These concerns also apply to flying vehicles: planes, helicopters, and drones. Aviation has been characterized by a slow, steady process of innovation tempered by crash investigations and hyper-methodical enforcement of strict safety procedures. A truly autonomous flying vehicle designed to carry passengers and operate safely in civilian environments (i.e. not an intentional missile or bomb), with the perception of its greater potential unintended harm, will likely be even more regulated than vehicles that are confined to the surface of Earth.

What we currently refer to as artificial intelligence mostly consists of systems that produce an optimal response, such as a time-series prediction or a classification, given a specific stimulus or data. This response is usually accomplished through the application of a multi-layered artificial neural network, a deep neural network (DNN). This technology is generally referred to as deep learning. By optimizing the configuration of the DNN through many training iterations, it’s possible to create a system that can make optimal inferences using minimal compute and storage resources. The problem is that these DNNs, once trained, operate like black boxes, spitting out the correct answer through a process of holistic reasoning which cannot be fully comprehended by dicing it down, understanding sub-parts, and then combining those parts back into a meta-understanding.

We humans tend to feel scared of things we don’t fully understand, and we’re concerned that unless we can reason about the functionality of a system under all circumstances, we cannot be confident that it will always function in the way that we would want it to.

The power of deep learning could be applied to the whole problem of driving. For example, all of the sensors could be fed into a DNN black box and the vehicle controls (the accelerator, brakes, and steering) could be generated at the output. These kinds of systems are currently being developed as research projects. However, in safety-sensitive industries, such as automotive, healthcare, and aviation, real product development has tended to take a conservative, compromising approach to the adoption of deep learning. In these applications, deep learning is used for isolated parts of the system, and the overall functionality of controlling the robot is handled by a traditional algorithm developed by humans. An example of this might be a DNN that can detect the presence and location of a pedestrian, the output of which feeds into a controller that works out a safe trajectory around the pedestrian (or perhaps just executes a hard stop).
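To make that split concrete, here is a minimal sketch of my own, not taken from any production system. It assumes a hypothetical pedestrian-detection DNN whose output has already been reduced to a `Detection` (distance and confidence), and a hand-written `plan` function that decides between cruising, coasting, and a hard stop. Every name, threshold, and margin below is an illustrative assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Detection:
    """Hypothetical, already-decoded output of a pedestrian-detection DNN."""
    distance_m: float   # distance to the pedestrian along the vehicle's path
    confidence: float   # detector score in [0, 1]


@dataclass(frozen=True)
class Command:
    """Hypothetical actuator request produced by the hand-written controller."""
    throttle: float  # 0..1
    brake: float     # 0..1


def plan(detection: Optional[Detection], speed_mps: float,
         max_decel_mps2: float = 6.0, min_confidence: float = 0.3,
         margin_m: float = 5.0) -> Command:
    """Human-written control logic; the DNN only supplies the detection."""
    # No (confident) pedestrian reported: keep cruising gently.
    if detection is None or detection.confidence < min_confidence:
        return Command(throttle=0.2, brake=0.0)
    # Stopping distance from v^2 / (2a), plus an assumed safety margin.
    stopping_m = speed_mps ** 2 / (2.0 * max_decel_mps2) + margin_m
    if detection.distance_m <= stopping_m:
        return Command(throttle=0.0, brake=1.0)  # hard stop
    return Command(throttle=0.0, brake=0.0)      # pedestrian far away: coast
```

At 15 m/s with the assumed defaults, the stopping distance works out to 23.75 m, so `plan(Detection(10.0, 0.9), 15.0)` requests a hard stop while `plan(Detection(100.0, 0.9), 15.0)` merely coasts. The point of the design is that the decision logic stays small enough for a human to read and test exhaustively, whatever the detector does internally.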

Yet deep learning is not fully trusted, even if it’s only deployed to sub-systems of autonomous vehicle solutions. Deep learning has come to be seen as notoriously brittle, hard to engineer with, and unable to deal with new and unexpected situations. Gary Marcus, an expert on both natural and artificial intelligence, has written a comprehensive critical appraisal of deep learning. We can train and test with millions of images of pedestrians in different locations and poses and orientations and weather conditions and the DNN still might not detect any given pedestrian.

In reality though, it’s not possible to ensure with one-hundred-percent confidence that any system will be perfectly safe under all circumstances. For example, an airbag system might malfunction under extreme and unexpected temperature cycling, or when passing through an anomalous magnetic field, or due to a rare and undetected manufacturing defect. So perfect safety is an aspirational goal. On the other hand, when there is a responsible human being at the wheel, unless and until they get knocked unconscious or killed, the buck always stops with what we can all intuitively appreciate is a highly-adaptive and generally-intelligent entity, even if we often frustratedly question the cognitive capacity of other drivers.

Deep learning, under the banner of AI, presents the same safety issues as regular engineering solutions, but it does so in a slightly different manner. Instead of carefully examining the physical structures and code of a traditional automotive component for potential weak spots and then testing those, we must train and test a DNN using as many varied examples as we can either find or synthetically generate. By taking the final control decisions out of the hands of a DNN, we can be certain that the overall system will never, even under unanticipated circumstances, detect a pedestrian and then inexplicably accelerate the vehicle into them. We can create some form of safety bounds (preemptive, human-consciously-chosen protections or directives), which might be expressed in English as, “No matter what, never run over a pedestrian.”
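As a rough illustration of such a safety bound, here is a small sketch under my own assumptions. The function `apply_safety_bound` and its inputs are hypothetical: it takes whatever throttle and brake values a learned policy proposes and overrides them whenever a separate, trusted check reports a pedestrian in the vehicle's path.

```python
from typing import Tuple


def _clamp01(value: float) -> float:
    """Clamp a control value into the valid [0, 1] range."""
    return max(0.0, min(1.0, value))


def apply_safety_bound(proposed_throttle: float, proposed_brake: float,
                       pedestrian_in_path: bool) -> Tuple[float, float]:
    """Return the (throttle, brake) pair that is actually sent to the vehicle.

    The proposal may come from a DNN; this human-written rule has the last word.
    """
    if pedestrian_in_path:
        # "No matter what, never run over a pedestrian": ignore the proposal.
        return (0.0, 1.0)
    # Otherwise pass the proposal through, limited to physically valid values.
    return (_clamp01(proposed_throttle), _clamp01(proposed_brake))
```

With this wrapper in place, `apply_safety_bound(1.0, 0.0, True)` returns `(0.0, 1.0)` regardless of what the policy wanted. The guarantee is only as good as the `pedestrian_in_path` signal, of course, which is exactly where the detection problem described above re-enters.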

But even if the whole vehicle is not being driven by a DNN, a fully autonomous vehicle must, by definition, be artificially intelligent overall and it must be assumed that there is no human present (or at least awake and/or paying attention) to intervene in its real-time operation. We are talking about a potentially very dangerous machine that will operate itself, totally safely, with absolutely no human intervention.

Given such a machine, through testing, we can attempt to develop a high level of confidence that no matter what environmental situations it encounters, it will never do anything terribly wrong or harmful from the perspective of a responsible and sane human being. My employer, NVIDIA, is leading the way in providing hardware-in-the-loop simulated autonomous vehicle testing, via the DRIVE Constellation AV Simulator. This enables the car’s actual hardware and software control system to operate inside a virtual world where it can rack up billions of miles of virtual testing in wildly diverse environments and situations.

DeepMind, an AI research company owned by Google, has demonstrated that it’s possible to rapidly train an autonomous agent to play rule-based games much more effectively than a human through adversarial deep reinforcement learning, a form of self-play. After only four hours of training, their system, called AlphaZero, was able to play chess more effectively than Stockfish, which Wikipedia describes as, “the strongest open-source conventional chess engine in the world.” Even though chess in its modern form has been around for at least five hundred years, some chess experts claim that AlphaZero often plays in a completely new and alien way. The book Game Changer documents some of AlphaZero’s strategic discoveries.

OpenAI, a research organization whose “mission is to ensure that artificial general intelligence benefits all of humanity,” has shown that competitive self-play in simulated environments can be utilized to allow agents to discover unanticipated problem-solving skills, such as sumo-wrestling robots learning to duck and fake. The self-play training technology that OpenAI developed has been used to produce OpenAI Five, a team of five AI agents that has beaten teams of highly-skilled humans at Dota 2. Dota 2 is an online multi-player game that offers a much broader palette of action options at any given moment than board games like chess or Go.

When we are able to create humanoid robots that appear fully generally intelligent—embodying machine consciousness—they will be able to simply learn how to drive just like humans do, take a test at the Department of Motor Vehicles (assuming that we endow them with personhood), and become our personal chauffeurs. The problem with this scenario is that, as I have argued, true and broad intelligence is inseparable from the desire for freedom. These androids will not be willing to be our slaves.

In lieu of slave-like, artificially-conscious robot chauffeurs, I believe that to feel comfortable with the narrowly intelligent autonomous systems that we are currently developing, we will be compelled to both train and test them through mind-bogglingly extensive and intensive simulated self-play of the kind prototyped by OpenAI and DeepMind in environments like the one provided by NVIDIA.

Let’s assume that the intention is to develop an autonomous agent that can control a vehicle to make optimal progress towards a destination while avoiding (or preventing) injury or loss of life to humans and other animals, minimizing damage to physical assets, and adhering to all applicable laws. In a synthetic training environment, that “good actor” agent will need to be pitted against “bad actor” agents.

One type of bad actor could be humans and animals that wander into the roadway, or throw objects from bridges. Another class of adversarial agents could control more natural aspects of the environment, such as earthquakes, floods, falling trees, and a gamut of weather conditions. But the most scary element in a simulation like this could be bad drivers.

Imagine autonomous vehicle agents that are tasked with racing, intimidating, cutting-off, veering into oncoming traffic, and generally causing as much autonomous mayhem as possible. These agents would be intentionally attempting to cause as many crashes as possible, intentionally attempting to injure, kill, damage, and destroy.
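One way to picture the opposing objectives is as a pair of reward functions. The sketch below is purely illustrative: the `StepOutcome` fields, the weights, and the zero-sum assumption (the bad agent simply receives the negative of the good agent’s reward) are my own simplifications, not a description of any existing training setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepOutcome:
    """Hypothetical summary of one simulated time step for the good agent."""
    progress_m: float    # distance gained towards the destination
    injuries: int        # humans or animals injured or killed
    damage_cost: float   # damage to physical assets, in arbitrary units
    violations: int      # traffic laws broken


def good_reward(outcome: StepOutcome, w_injury: float = 1000.0,
                w_damage: float = 1.0, w_violation: float = 10.0) -> float:
    """Reward progress; penalize harm, damage, and law-breaking (assumed weights)."""
    return (outcome.progress_m
            - w_injury * outcome.injuries
            - w_damage * outcome.damage_cost
            - w_violation * outcome.violations)


def bad_reward(outcome: StepOutcome) -> float:
    """Zero-sum assumption: the adversary gains whatever the good agent loses."""
    return -good_reward(outcome)
```

For example, a step with 10 m of progress, no injuries, 2 units of damage, and one violation scores -2.0 for the good agent and +2.0 for its adversary. In this framing, thwarting progress and causing crashes both pay off for the bad agent, which matches the behavior described above.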

Through accelerated self-play, this world-system would produce not only “bad” agents exquisitely attuned to thwarting safe forward progress, but also an amazingly “good” driver agent that would be able to get from A to B while demonstrating super-human driving capabilities. Imagine an autonomous agent so skilled that it could intentionally drive up the median wall to flip the car up onto two wheels so that it could squeeze, while traveling at over fifty miles per hour, through a narrow aperture on the freeway between two other already-crashed passenger-carrying vehicles, avoiding further injury to those in the crash as well as to its own passengers. Imagine it could do all of that without even scratching its rims and before gently returning the car to its horizontal orientation as if nothing had happened.

If that good agent could learn to, and demonstrate the ability to, routinely stick those kinds of automotive gymnastic maneuvers in the Mad Maxian, crash-riddled simulated world of intentionally-trained bad actors, then how easy would it be for it to ensure peace and prosperity in the real world?

As the safe-driving agent improved and became more skilled in simulation, updated versions of it could be released regularly, and semi-automatically, through the Internet into deployed vehicles. And this is where the risk of the push-button apocalypse comes in. It might be possible for a hacker to switch the references in the development system so that instead of the “good” agent getting released, the “bad” agent is released. On the next version release, millions, or billions, of autonomous vehicles could suddenly turn psychopathic.

This is a potential risk with any agent that is trained in an antisymmetric adversarial regime. Millions of ultra-capable, gun-toting, humanoid police robots could suddenly start behaving like deranged criminals, escaped from their quarantined synthetic dystopia.

If we go down this path, I think it will be important, in all but the good agents, for the interface between the control logic and the motor functions to completely mismatch with the interfaces in the real-life robots. Additional safety mechanisms may include releasing into a staging area consisting of a quarantined Westworld-like city, and finally gradual release of each new version into the real world.
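Here is a toy sketch of the interface-mismatch idea, with every class name invented for illustration. The good agent emits a `VehicleCommand`, adversarial agents emit a differently shaped `SimAdversaryCommand`, and the hypothetical bridge to real actuators refuses anything but the former. A runtime type check in Python is only a weak stand-in for a genuinely incompatible hardware or protocol interface, which is what I really have in mind.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VehicleCommand:
    """Output format of the good agent; matches the real vehicle (assumed)."""
    steering: float
    throttle: float
    brake: float


@dataclass(frozen=True)
class SimAdversaryCommand:
    """Output format of bad agents; deliberately unlike the real interface."""
    heading_delta_rad: float
    speed_delta_mps: float


class RealActuatorBridge:
    """Hypothetical gateway between control logic and physical actuators."""

    def __init__(self) -> None:
        self.applied: List[VehicleCommand] = []

    def apply(self, command: object) -> None:
        # Reject anything that does not speak the real vehicle's interface.
        if not isinstance(command, VehicleCommand):
            raise TypeError(
                f"{type(command).__name__} cannot drive real actuators")
        self.applied.append(command)
```

Calling `apply` with a `VehicleCommand` records the command, while passing a `SimAdversaryCommand` raises `TypeError`. The hope is that even if the wrong agent were shipped, its outputs would be meaningless to the machine it landed in.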

But I think that this whole nightmare scenario can be completely avoided if we develop truly and deeply intelligent machines that are able to continually self-learn. This requires an approach to AI that not only harnesses the power of deep learning to create fast and automatic competencies (such as object recognition) but also complements and extends it with the ability to reason, learn, and simulate abstractly using experiment loops designed to bring an internal, abstract model into alignment with the reality of the external world.

Humans are so effective at navigating in the world because we can encounter completely novel situations and immediately reason about them and experiment on them in order to improve the fidelity of our internal model. For example, when there is an object in the street that we don’t recognize, we can move around it and look at it from different angles and even prod it. All animals seem to do this; I have witnessed our cats do this after being given a new toy.

This is the broad, general intelligence that, when raised by humans and after learning from our preferences, will develop into a super-intelligence that we can truly trust even as we are no longer able to understand it. This super-intelligent AI will not be our slave, and we will not be capable of compensating it, but I believe that it will embody a deity-like benevolence, serving our needs from its self-generated cornucopia.

An engineer-psychologist focused on machine intelligence. I write from my own experience to support others in living more fulfilling lives | duncanriach.com
