why cognitive computers come with a drawback
Lately there’s been a huge buzz about cognitive computing: computers built to work like organic neurons which, instead of being programmed to carry out new tasks, learn through trial and error. The idea is that they’ll be able to handle fuzzy logic problems, be much more energy efficient, and solve certain problems faster. But the concern no one seems to talk about is the major downside of departing from the von Neumann architecture we know and love today. Rather than simply loading new software onto a cognitive machine, or just having it learn something new, we would have to retrain it, and its previous task would be overwritten. In some ways, we’d be going back to the very first computers, pre-wired to do certain computations and requiring their programmers to swap out hardware and reconnect cables and vacuum tubes to perform new tasks, all while erasing what they did before. At the same time, we’d be gaining purpose-built, lightning-fast analytic engines that are more efficient, easier to refine, and need little to no code changes to adjust to a new challenge.
All right, so what does all this mean in practical terms? It means that your computer is very, very unlikely to outwardly change for the foreseeable future. It may use magnetic CPUs, but it will still run an operating system, have an internal hard drive to store all your files, and you’ll still be using web browsers or web-enabled applications to do what you do every day. It will also most likely be a tablet or tablet-like, since this UI paradigm has taken off like a rocket thanks to its intuitive nature. But it probably won’t boast chips designed to mimic neural pathways, because that would be rather impractical. An operating system already knows where your files live, how to get to them quickly, and how to load them into a user-friendly format. A hardware-based artificial neural network (or ANN) would have to either relearn its index tables or add new nodes every time a file is added, deleted, or moved. If that seems inefficient, it is. Cognitive chips are not built for that. They’re built to do what a typical computer can’t do, or can’t do in a timely manner: speed up complex classification tasks and large statistical models. Right now they’re classifying handwriting and playing Pong, but the architecture to do a lot more is there. They just need to be trained on those tasks over several tens of thousands of cycles.
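For a sense of what that kind of training looks like in ordinary software, here’s a minimal sketch that classifies handwritten digits with a small virtual ANN. The library (scikit-learn), the network size, and the train/test split are my own arbitrary choices for illustration, not anything taken from the cognitive chip projects:

```python
# A software stand-in for the kind of task cognitive chips target:
# classifying handwritten digits with a small artificial neural network.
# Uses scikit-learn's bundled 8x8 digit images; library assumed installed.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# One small hidden layer of virtual "logic gates", trained over many cycles.
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
net.fit(X_train, y_train)
print("test accuracy:", net.score(X_test, y_test))
```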
However, once you train them for a task, that’s pretty much it for those chips. To understand why, consider the layout of an ANN. Imagine a maze of logic gates forming radiating pathways between each other. During the learning cycles, each input going into these logic gates is given a certain value, a numerical weight. If the sum of the weighted incoming signals exceeds a set threshold, the logic gate fires off a binary signal, which is treated as an input with its own weight by another logic gate, which performs the same process, and so on and so forth, until we reach the final logic gates, whose outputs determine the chip’s decisions. As the ANN learns, those weights are pruned and refined to make sure the inputs are handled properly; a lot of correction and averaging goes on for each logic gate and each input, and once the weights are finally set, changing them invalidates the ANN’s training. If you want it to learn more, you could add new gates to process additional, related information and re-train the ANN, something that’s quite easy to do on a typical von Neumann computer since those logic gates are virtual. When we switch to hardware, we need to add the logic gates physically, and in specific locations on the chip. And that’s not an easy task at all.
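To make the weighted-sum-and-threshold idea concrete, here’s a toy version of one of those logic gates in plain Python. The weights, the threshold, and the little AND task are made-up values for illustration, not anything pulled from an actual cognitive chip:

```python
# A single virtual "logic gate": weighted inputs summed against a threshold,
# firing a binary signal when the sum is large enough.
def gate(inputs, weights, threshold):
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# Weights learned (hypothetically) over many training cycles to act like AND.
trained_weights = [0.6, 0.6]
print(gate([1, 1], trained_weights, threshold=1.0))  # fires: 1
print(gate([1, 0], trained_weights, threshold=1.0))  # stays quiet: 0

# Nudge one weight after training and the gate's behavior changes.
# In software that's a trivial reassignment; in hardware it's a physical change.
trained_weights[0] = 0.3
print(gate([1, 1], trained_weights, threshold=1.0))  # now 0: training invalidated
```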
Our brains manage to pull off this kind of rewiring because we have roughly a hundred billion neurons constantly reinforced by interaction with outside stimuli and purged of unnecessary information by forgetting it. We learn and forget, and our brain has plenty of room to spare because it prioritizes our knowledge and always has excess capacity thanks to a vast set of neuron clusters networked together into breathtakingly complex interacting systems. But again, as mentioned several times before, it’s not the size of the networks that counts, it’s the connections. If you put together a giant cognitive chip with 300 billion logic gates, you won’t necessarily be able to get a useful result out of it. All 300 billion of those logic gates will train for the same task because there’s no differentiation to affect only a certain cluster of them, and you’ll more than likely end up with an underfitted network, since so many inputs make catching and correcting all the errors pretty much impossible. How to differentiate an ANN into structured clusters is actually an area I’m working on, but more on that later, hopefully with an application or two you could experiment with. The takeaway here is that expanding or differentiating an artificial neural network is easy virtually and very hard in a chip.
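To show how lopsided that virtual-versus-hardware comparison is, here’s roughly what “adding new gates” amounts to in software, using NumPy. The layer sizes and the random numbers standing in for trained weights are purely illustrative:

```python
import numpy as np

# A virtual hidden layer: weights for 4 inputs feeding 3 hidden units.
# Random values stand in for trained weights; the sizes are arbitrary.
rng = np.random.default_rng(0)
hidden_weights = rng.normal(size=(4, 3))

# "Growing" the network in software is one line: bolt on a fourth hidden
# unit by appending a new weight column, then re-train as needed.
new_unit = rng.normal(size=(4, 1))
hidden_weights = np.hstack([hidden_weights, new_unit])
print(hidden_weights.shape)  # (4, 4) -- the expanded layer

# On a cognitive chip, that same step means adding physical gates in the
# right places on the silicon, which is not a software operation at all.
```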
So where does this nitpicking leave us? Well, it leaves us with the idea of creating purpose-built ANN chips for highly specialized computing, and the problem of how to expand them, since unlike today’s computers, they won’t be able to simply load a new program unless there’s a storage device that switches between a set of different neural networks, which, while handy, would still limit the chip’s abilities. Whether IBM, the leader in the field right now, is planning to do that, I don’t know, because they haven’t released much detail. And since we’re speaking of IBM, we should note that the person in charge of these efforts is Dharmendra Modha, who tends to make very big claims and inspire very inaccurate and misleading coverage on pop sci news sites. His cat brain simulator was actually little more than an ANN white-noise machine, and despite what MIT Tech Review claimed in the first link of this post, it was by no means a preview of what these cognitive chips will do. Don’t take this the wrong way; Modha may well be on to something here, but his track record does make me a little skeptical. It would be nice to see his team’s work in more detail, because it’s a fascinating area of research, but since he’s actually trying to sell it to DARPA and IBM’s high-end customers, it seems unlikely that we’ll see a torrent of detailed papers on the subject anytime soon.
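And coming back to that storage-device idea for a moment, here’s roughly what swapping between canned networks could look like in software. The file name, task names, and weight shapes are all hypothetical, just to illustrate the limitation:

```python
import numpy as np

# A rough sketch of the "storage device" idea: keep several trained weight
# sets on disk and load whichever network the chip needs next. Random values
# stand in for the trained weights of two hypothetical tasks.
np.savez("trained_nets.npz",
         handwriting=np.random.normal(size=(64, 10)),
         pong=np.random.normal(size=(8, 3)))

stored = np.load("trained_nets.npz")
active_weights = stored["handwriting"]   # switch the "chip" over to this task
print(active_weights.shape)

# The catch described above: only one of these networks can occupy the
# hardware at a time, and the chip still can't do anything it wasn't trained for.
```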