why you just can’t black box an a.i.
Singularitarians generally believe two things about artificial intelligence. First and foremost, they say, it's just a matter of time before we have an AI system that will quickly become superhumanly intelligent. Secondly, and a lot more ominously, they believe that this system could sweep away humanity, not because it will be evil by nature, but because it won't care about humans or what happens to them. That's why, they argue, one of the biggest priorities for researchers in the field should be figuring out how to build a friendly artificial intelligence, training it much like one would train a pet, with a mix of operant conditioning and software.
While the first point is one I've covered before, pointing out again and again that superhuman is a very relative term and that computers are in many ways already superhuman without being intelligent, the second point is one I haven't yet given a proper examination. And neither have vocal Singularitarians. Why? Because if you read any of their papers on friendly AI, you'll soon discover how quickly they begin to describe the system they're trying to tame as a black box with mostly known inputs and measurable outputs, hardly a confident and lucid description of how an artificial intelligence functions and, ultimately, what rules will govern it.
No problem there, say the Singularitarians: the system will be so advanced by the time this happens that we'll be very unlikely to know exactly how it functions anyway. It will modify its own source code, optimize its own performance, and generally be all but inscrutable to computer scientists. That sounds great for comic books, but when we're talking about real artificially intelligent systems, this approach amounts to surrendering the job to robots, artificial neural networks, and Bayesian classifiers, letting them come up with whatever intelligence they want while the researchers and programmers are sent out for coffee in the meantime.
Artificial intelligence will not grow from a vacuum. It will come together from systems built to tackle discrete tasks, governed by one, or perhaps several, common frameworks that exchange information between those systems. I say this because the only forms of intelligence we can readily identify are found in living things, which use a brain to perform cognitive tasks, and since brains seem to be wired this way and we're trying to emulate the brain's basic functions, it isn't much of a stretch to assume that we'd want to combine systems good at related tasks and build on the accomplishments of existing systems. And to combine them, we'll have to know how to build them.
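To make that a little more concrete, here is a minimal sketch of what I mean by task-specific systems tied together by a common framework. The module and class names are hypothetical illustrations, not a claim about how any real AI project is structured:

```python
# A minimal sketch: discrete, task-specific components combined under one
# coordinating framework. All names here are made up for illustration.

class VisionModule:
    def process(self, image):
        # stand-in for an image classifier trained on a discrete task
        return {"objects": ["cup", "table"]}

class LanguageModule:
    def process(self, text):
        # stand-in for a language parser trained on a discrete task
        return {"intent": "fetch", "target": "cup"}

class Coordinator:
    """The common framework: it knows each subsystem's inputs and outputs
    and passes information between them. Nothing here is a black box."""
    def __init__(self, modules):
        self.modules = modules

    def run(self, inputs):
        results = {}
        for name, module in self.modules.items():
            results[name] = module.process(inputs[name])
        return results

coordinator = Coordinator({"vision": VisionModule(), "language": LanguageModule()})
print(coordinator.run({"vision": "frame.png", "language": "bring me the cup"}))
```

The point of the sketch is simply that whoever wires the coordinator together has to know what each component takes in and puts out, which is exactly the knowledge the black box framing throws away.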
Conceiving of an AI as a black box is a fine approach if we want to test how a particular system reacts when working with the AI, focusing on the system under test by mocking the AI's responses down the chain of events. Think of it as dependency injection with an AI-interfacing system. But by abstracting the AI away, we've also made it impossible to test the inner workings of the AI itself. No wonder, then, that the Singularitarian fellows have to bring in operant conditioning or social training to basically housebreak the synthetic mind into doing what they need it to do. They have no other choice.
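For readers who don't write tests for a living, this is roughly what that looks like in practice. The `DialogueRouter` class and its `respond` method are hypothetical examples I've made up to show the pattern, assuming Python's standard mocking tools:

```python
# Sketch of the testing setup described above: the AI sits behind an
# interface, and the test injects a mock in its place so only the
# surrounding system is exercised. The AI is a black box here by design.
from unittest.mock import Mock

class DialogueRouter:
    """The system under test. It depends on some 'ai' object but never
    looks inside it; it only sees inputs and outputs."""
    def __init__(self, ai):
        self.ai = ai

    def handle(self, message):
        reply = self.ai.respond(message)
        return f"[router] {reply}"

def test_router_formats_ai_reply():
    mock_ai = Mock()
    mock_ai.respond.return_value = "hello, human"   # canned, measurable output
    router = DialogueRouter(mock_ai)                # dependency injection

    assert router.handle("hi") == "[router] hello, human"
    mock_ai.respond.assert_called_once_with("hi")   # known input was passed

test_router_formats_ai_reply()
```

Notice that this tells us nothing at all about how the real AI behind the interface works, which is precisely the problem with treating the black box as anything more than a testing convenience.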
In their framework, we cannot simply debug the system or reset its configuration files to limit its actions. But why have they resigned themselves to such an odd notion, and why do they assume that computer scientists are creating something they won't be able to control? Even more bizarrely, why do they think that an intelligence that can't be controlled by its creators could be controlled by a module they'll attach to the black box to regulate how nice or malevolent towards humans it will be? Wouldn't it just find a way around that module too if it's superhumanly smart? Wouldn't it make a lot more sense for its creators to build it to act in cooperation with humans, watching what humans say or do and treating each reaction or command as a trigger for carrying out a useful action it was trained to perform?
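That trigger-and-action arrangement is nothing exotic; in its simplest form it's just a mapping from commands to behaviors the system was built or trained to carry out. The commands and actions below are made-up examples, not a proposal for a real architecture:

```python
# A toy version of the cooperative setup suggested above: human commands act
# as triggers, each mapped to an action the system knows how to perform.

def summarize(text):
    # stand-in for a trained summarization component
    return text[:40] + "..."

def translate(text):
    # stand-in for a trained translation component
    return f"(translated) {text}"

TRIGGERS = {
    "summarize": summarize,
    "translate": translate,
}

def on_command(command, payload):
    action = TRIGGERS.get(command)
    if action is None:
        return "no trained action for that command"
    return action(payload)

print(on_command("summarize", "A long report about black boxes and the minds we imagine inside them."))
```

A system built this way does useful things when humans ask for them, and its creators can see exactly which trigger fires which action.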
And that brings us back full circle. To train machines to do something, we have to lay out a neural network and some higher-level logic to coordinate what the network's outputs mean, and we'll need to confirm that the training was successful before we employ it for any specific task. Therefore, we'll know how it learned, what it learned, and how it makes its decisions, because our machines run on propositional logic and a trained system will make the same choice, or set of choices, for the same input every time. If it didn't, we wouldn't use it. So of what use is a black box here when we can just lay out the logical diagram, figure out how the system is making its decisions, and alter its cognitive process if need be?
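Here's a small sketch of that workflow, assuming scikit-learn as one convenient library and a toy dataset; the accuracy threshold and the decision wrapper are arbitrary choices for illustration:

```python
# Train a network on a discrete task, confirm the training worked on held-out
# data, and only then wire its outputs into higher-level decision logic.
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(X_train, y_train)

# Confirm the training was successful before the network is put to work.
# The 0.9 cutoff is an arbitrary bar for this example.
accuracy = net.score(X_test, y_test)
assert accuracy > 0.9, "training failed; we would not deploy this network"

# Higher-level logic that decides what the network's output means.
def decide(sample):
    label = net.predict([sample])[0]
    return {0: "setosa", 1: "versicolor", 2: "virginica"}[label]

print(decide(X_test[0]))
print(decide(X_test[0]))  # same input, same decision, nothing inscrutable here
```

The network's internals may be a pile of weights, but the pipeline around it, how it was trained, how it was validated, and what its outputs are taken to mean, is all laid out in plain sight.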
Again, we could isolate the components and mock their behavior to test how individual subsystems function on their own, eliminating the dependencies for each set of tests. Beyond that, this black box is either a hindrance to a researcher or a vehicle for someone who doesn't know how to build a synthetic mind but really, really wants to talk about what he imagines it will be like and how to harness its raw cognitive power. And that's ok, really. But let's not pretend that we know an artificial intelligence beyond its creators' understanding will suddenly emerge from the digital aether, when the odds of that are about the same as my toaster coming to life and barking at me when it thinks I want to feed it some bread.