Archives For artificial neural networks

designer v. developer

After some huffing, puffing, and fussing around with GitHub pages and wikis, I can finally bring you the promised first installment of my play-along-at-home AI project. There’s no code to review just yet, but there is a high level explanation of how it will be implemented. It’s nothing fancy, but that’s kind of the point. Simple, easy to follow modules are easier to deal with and debug, so that’s the direction in which I’m headed; they’ll be snapped on top of cloud service interfaces which will provide the on-demand computing resources required as the project ramps up. There are also explanations for some of the choices I’m making when there are several decent implementation options for a particular feature set. Some of those choices more or less come down to personal preference, while others have long-view reasons to definitely pick one option over the others.

In the next update, there will be database designs and SQL which may look a little like overkill for a framework to run some ANNs, particularly when there are hundred line Python scripts that run them without a hiccup. But remember that for what’s being built, ANNs are just one component, so the overhead comes from managing where the data goes and securing it, because if I’ve learned anything about security, it’s that if it’s not baked in from the start but layered on top after all the functionality has been completed, you end up with only one layer of defense that may be easily pierced by exploiting a vulnerability out of your control. Inputs may not get sanitized with proper care, your framework package for CSRF prevention might not have been updated, and without a security model to put up some roadblocks between a hacker and your data, you may as well not have bothered. Likewise, there’s going to be a fair amount of code and resources to define the ANNs’ inputs and outputs so we can actually harness them to do useful things.

curious bot

Defense contractor Raytheon is working on robots that can talk to each other by tweaking how machine learning is commonly done. Out with top-down algorithmic instructions, in with neural network collaboration and delegation across numerous machines. Personally, I think this is not just a great idea but a fantastic one, so much so that I ended up writing my thesis on it and had some designs and code lying around for a proof of concept. Sadly, it’s been a few years and I got side-tracked by work, my eventual cross-country move, and other pedestrian concerns. But all that time, this idea just kept nagging me, and so after reading about Raytheon’s thoughts on networked robotics, I decided to dust off my old project and build it anew with modern tools in a series of posts, laying out not just the core concepts but the details of the implementation. Yes, there’s going to be a lot of in-depth discussion about code, but I’ll do my best to keep it easy to follow and discuss, whether you’re a seasoned professional coder, or just byte-curious.

All right, all right, that’s enough with all the groaning, I design and write software for a living, not pack comedy clubs in West Hollywood. And before you write any software, you have to lay out a few basic goals for what you want it to do. First and foremost, this project should be flexible and easily expandable, because all we know is that we’re going to have neural networks running machines with inputs and outputs, and we want to tie them to a certain terminology we can invoke when calling them. Secondly, it should be easily scalable and ready for the cloud, where all it takes to ramp it up is tweaking a few settings on the administration screen. Thirdly, it should be capable of accepting and executing custom rules for making sure the digital representations of the robots in the system are valid on the fly. And finally, it should allow for custom interfaces to different machines inhabiting the real world, or at least get close enough to providing a generic way to talk to real world entities. Sounds pretty ambitious, I know, but hey, if you’re going to be dealing with artificial intelligence, why not try to see just how far you can take an idea?

Before we proceed though, I’d like to tackle the obvious question of why one would want to dive into a project like that on a skeptical pop sci blog. Well, for the last few years, artificial intelligence has figured in popular science news as some sort of dark magic able to create utopias and ruin economies by making nearly half of all jobs obsolete in mere decades, courtesy of writers who can’t fact check the claims they quote and use to build elaborate scenarios of the future. But even if you don’t dive into the code and experiment with it yourself, you’ll get a good idea of what AI actually is and isn’t. Then, the next time you read some clickbait laying out preposterous claims about how robots will take over the world and enslave us as we remain oblivious to it, you could recall that AI isn’t a digital ghost from sci-fi comic books, waiting to turn on a humanity it comes to resent like the hateful supercomputer of I Have No Mouth, and I Must Scream, but something you’ve seen diagrammed and rendered in code you can run on your very own computer on an odd little pop sci blog, and feel accordingly unimpressed with the cheap sensationalism. So with that in mind, here’s your chance to stop worrying and learn to understand your future machine overlords.

Here’s how this project is going to work. Each new post in this series is going to point to a GitHub wiki entry with code and details, keeping the code and in-depth analysis in the same place, while the posts here give the high level overview. This way, if you prefer to stick to very high level basic overviews, that’s what you get to see first, because as I’ve been told by so many bloggers who specialize in popular science and technology, big blocks of math and code are guaranteed to scare off an audience. But if the details intrigue you and you want a better look under the hood, it’s only a link away, and even if it looks scary at first, I really would encourage you to click through and see how much you can follow along. Meanwhile, you’ll still get your dose of skeptical and scientific content in between, so don’t think Weird Things is about to turn into a comp sci blog for the whole time this project is underway. After all, after long days of dealing with code and architectural designs, even someone who can’t imagine doing anything else will need a break from talking about computers and writing even more code for public review…

android mind

For those who are convinced that one day we can upload our minds to a computer and emulate the artificial immortality of Ultron in the finest traditions of comic book science, there are a number of planned experiments which claim to have the potential to digitally reanimate brains from very thorough maps of their neuron connections. They’re based on Ray Kurzweil’s theory of the mind: we are simply the sum total of the neural network in our brain, and if we can capture it, we can build a viable digital analog that should think, act, and sound like us. Basically, the general plot of last year’s Johnny Depp flop Transcendence wasn’t built around something a room of studio writers dreamed up over a very productive lunch, but on a very real idea which some people are taking seriously enough to use it to plan the fate of their bodies and minds after death. Those who are dying are now finding some comfort in the idea that they can be brought back to life should any of these experiments succeed, and reunite with the loved ones they’re leaving behind.

In both industry and academia, it can be really easy to forget that the bleeding edge technology you study and promote can have a very real effect on very real people’s lives. Cancer patients, those with debilitating injuries that will drastically shorten their lives, and people whose genetics conspired to make their bodies fail them, are starting to make decisions based on the promises spread by the media on behalf of self-styled tech prophets. For years, I’ve been writing posts and articles explaining exactly why many of these promises are poorly formed ideas that lack the requisite understanding of the problem they claim to solve. And that is still very much the case, as neuroscientist Michael Hendricks felt compelled to detail for MIT in response to the New York Times feature on whole brain emulation. His argument is a solid one, based on an actual attempt to emulate the brain of an organism we understand inside and out, one we have mapped from its skin down to the individual codon: the humble nematode worm.

Essentially, Hendricks says that to digitally emulate the brain of a nematode, we need to realize that its mind still has thousands of constant, ongoing chemical reactions in addition to the flows of electrical pulses through its neurons. We don’t know how to model them and the exact effect they have on the worm’s cognition, and even with the entire immaculately accurate connectome at hand, he’s still missing a great deal of information on how to start emulating its brain. But why should we have all the information, you ask, can’t we just build a proper artificial neural network reflecting the nematode connectome and fire it up? After all, if we know how the information will navigate its brain and what all the neurons do, couldn’t we have something up and running? To add on to Hendricks’ argument that the structure of the brain itself is only a part of what makes individuals who they are and how they work, allow me to add that this is simply not how a digital neural network is supposed to function, despite being constantly compared to our neurons.

Artificial neural networks are mechanisms to implement a mathematical formula for learning an unfamiliar task in the language of propositional logic. In essence, you define the problem space and the expected outcomes, then allow the network to weigh the inputs and guess its way to an acceptable solution. You can say that’s how our brains work too, but you’d be wrong. There are parts of our brain that deal with high level logic, like the prefrontal cortex which helps you make decisions about what to do in certain situations, that is, deal with executive functions. But unlike artificial neural networks, there are countless chemical reactions involved, reactions which warp how the information is being processed. Being hungry, sleepy, tired, aroused, sick, happy, and so on, and so forth, can make the same set of connections produce different outputs from very similar inputs. Ever agreed to help a friend with something until one day, fed up with being constantly pestered for help, you started a fight and ended the friendship? Humans do that. Social animals can do that. Computers never could.
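To make the contrast concrete, here’s a minimal sketch of the kind of artificial neuron described above: fixed weights, a squashing function, and nothing else, so identical inputs always yield identical outputs, with no hunger or mood to warp the result. The weights and bias are made up purely for illustration, not taken from any real trained network.

```python
import math

def sigmoid(x):
    # squash a weighted sum into the (0, 1) range
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    # a single artificial neuron: weigh the inputs, add a bias, squash
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# hypothetical weights; a real network would learn these during training
weights = [0.8, -0.4, 0.3]
bias = -0.1

# the same inputs produce the same output, every single time
print(neuron([1.0, 0.5, 0.2], weights, bias))
print(neuron([1.0, 0.5, 0.2], weights, bias))
```

That determinism is the whole point of the comparison: nothing short of changing the weights themselves can make this neuron answer differently.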

You see, your connectome doesn’t implement propositional calculus, it’s a constantly changing infrastructure for exchanging basic functionality, deeply affected by training, injury, your overall health, your memories, and the complex flow of neurotransmitters floating between neurons. If you brought me a connectome, even for a tiny nematode, and told me to set up an artificial neural network that captures these relationships, I’m sure it would be possible to draw up something in a bit of custom code, but what exactly would the result be? How do I encode plasticity? How do we define each neuron’s statistical weight if we’re missing the chemical reactions affecting it? Is there a variation in the neurotransmitters we’d have to simulate as well, and if so, what would it be and to which neurotransmitters would it apply? It’s like trying to rebuild a city with only the road map, no buildings, people, cars, trucks, or businesses included, then expecting artificial traffic patterns to recreate all the dynamics of the city whose road map you digitized, with pretty much no room for entropy because entropy could easily break down the simulation over time. You would be both running the neural network and training it at once, something it’s really not meant to do.

The bottom line here is that synthetic minds, even once capable of hot-swapping newly trained networks in place of existing ones, are not going to be the same as organic ones. What a great many transhumanists refuse to accept is that the substrate in which computing is done (and they will define what the mind does as computing) is actually quite important, because it allows information to flow at different rates and in different ways than another substrate would. We can put something from a connectome into a computer, but what comes out will not be what we put into it; it will be something new, something different, because we put just a part of it into a machine and naively expected the code to make up for all the gaps. And that’s the best case scenario, with a nematode and its 302 neurons. Humans have 86 billion. Even if we don’t need the majority of these neurons to be emulated, the point is that whatever problems you’ll have with a virtual nematode brain, they will be more than eight orders of magnitude worse in a virtual human one, as added size and complexity create new problems. In short, whole brain emulation as a means to digital immortality may work in comic books, but definitely not in the real world.

If you’ve been reading for a while and cared, you might remember an old post about a white paper I uploaded to arXiv outlining preliminary designs of a swarm control framework using artificial neural networks, and my follow up which promised to release the code for testing its core concepts. Well, as you saw in my last post if you clicked the link, the code is now out on GitHub as promised. It includes database scripts, libraries, and automated tests using a mocking framework which will show whether your modifications fit with existing bits of logic. With all due respect to a lot of academic apps, I’ve seen quite a few examples of scripts that were extremely difficult to test, where changing any bit of code meant that every last method had to be checked and re-checked to make sure you didn’t wreak havoc on what the app was doing. To avoid such unpleasantness with Hivemind, I used a common language on a widely available platform to make it easier to run and modify, and an industry-standard IoC container and mocking library. But I’m starting to get ahead of myself here, so it may be a good idea to slow down and review what this version of Hivemind is and isn’t, and where it’s headed.

What it is. A starter kit with the basic tools containing Hivemind’s core logic and methods to collect the data it will need to do its high level work. Basically, something like 95% of the code you’ll see is CRUD, as with most management tools, while the other 5% encodes and decodes artificial neural networks in a way that frees the logic they contain from their initial implementation and prepares them to link to your robot’s various motors or sensors, and sorts through your active robots to find the machines that will be best suited for a particular task or a set of tasks. There are a lot of wrappers and adapters, and some programmers, especially the academic ones, would probably think that it’s rather overbuilt. And that’s probably a fair evaluation. However, it’s not a full and standalone product, but rather a structured guide for further development and refinement, which is why all the i’s are formally dotted and t’s very officially crossed to show those who will be working with the code how a particular object should be serialized and how certain methods need to be implemented. And it’s actually still performant, with an average first call time of 500 milliseconds, and "warm" call time of 90 milliseconds.
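The encoding and decoding step mentioned above can be illustrated with a toy example. The sketch below flattens a network’s layer weights into a plain, delimited string and reads them back, freeing the numbers from whatever library produced them. Hivemind’s actual format is defined in the white paper, so treat this layout as a made-up stand-in for the idea rather than the real serialization scheme.

```python
def encode_network(layers):
    # flatten a list of weight matrices into a portable string,
    # recording layer shapes so the network can be rebuilt later
    parts = []
    for matrix in layers:
        rows, cols = len(matrix), len(matrix[0])
        flat = ",".join(str(w) for row in matrix for w in row)
        parts.append(f"{rows}x{cols}:{flat}")
    return "|".join(parts)

def decode_network(text):
    # reverse the encoding: parse the shapes, then regroup the weights
    layers = []
    for part in text.split("|"):
        shape, flat = part.split(":")
        rows, cols = (int(n) for n in shape.split("x"))
        values = [float(w) for w in flat.split(",")]
        layers.append([values[r * cols:(r + 1) * cols] for r in range(rows)])
    return layers

# a tiny two-layer network's weights survive the round trip intact
net = [[[0.1, 0.2], [0.3, 0.4]], [[0.5], [0.6]]]
assert decode_network(encode_network(net)) == net
```

Once the weights live in a neutral string like this, it doesn’t matter which ANN package trained them; any consumer that knows the format can rebuild and run the network.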

What it is not. An application that will let you plug any robot into it and command your robot horde at will. Yes, in due time, it should be able to do that, but more on this later. One of the big problems with robots is that very few of them have standardized APIs or an adapter you can simply plug into your software and run. Most need very specific sequences of bytes sent to their motors and sensors, and while Hivemind will provide the ability to format any bytecode and associate it with a given well-defined command, it can’t provide the bridge to send these bytecodes to just any machine out there with a little tweaking. If it did, it would lock the developers into a certain platform (which is contrary to my goal here) or require that I write adapters to talk to every commercially widespread robot out there, a major challenge considering that very few bot makers make their APIs available and if they do, make them available in a format with which they’re comfortable rather than the raw bytecodes I would need to make integration and communication easier. In my research, I found a nifty idea to use a rather simple Javascript function to manipulate robots from Willow Garage, but it seems like this idea died about as soon as the last line of Python was written. Something like that could’ve been a huge help, but oh well…
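The command-to-bytecode association described above can be sketched in a few lines. The command names and byte values here are invented for illustration, since the real sequences depend entirely on a given robot’s motors and sensors, and actually transmitting the bytes is left to whatever platform-specific bridge the user supplies.

```python
# hypothetical registry mapping well-defined commands to raw byte sequences
command_table = {}

def register_command(name, raw_bytes):
    # associate a named, well-defined command with the byte sequence
    # a particular robot's motors or sensors expect
    command_table[name] = bytes(raw_bytes)

def format_command(name):
    # look up the bytecode for a command; the caller's own bridge
    # is responsible for actually sending it to the machine
    if name not in command_table:
        raise KeyError(f"no bytecode registered for command: {name}")
    return command_table[name]

# made-up byte sequences for a made-up robot
register_command("forward", [0x01, 0x10, 0xFF])
register_command("stop", [0x01, 0x00, 0x00])

print(format_command("forward").hex())
```

This is exactly the split described in the post: the framework can format and hand you the bytes, but the last hop to the hardware stays out of scope to avoid locking anyone into one platform.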

So what’s next? Just like I keep saying, Hivemind is a starter kit. But it’s not just a starter kit for others, since it will be used as the base for a new project, one intended to make Hivemind into a usable product and expand on what it offers so far. So far, I’ve pruned the code in the kit, added support for parallelizing large queries to boost performance, and expanded it to include logging and carry additional useful data about the machinery, motors, and sensors it will manage. Next, I want to add geo-tracking, routing, and an API of its own so an advanced user could script and chain the tasks a robot swarm may be assigned. Rather than doing it all and releasing the whole thing as I did with Hivemind, I will be uploading the core functionality first, then follow it with expansions in the form of new services and DLLs designed for plug-and-play integration with the core. As you can probably guess, there will be a white paper on this new system as well, and to keep things interesting as it comes together, I’ll be posting some ideas as to how you could use software like it to create a robot army in space, like the mechanical swarm described in the sci-fi draft I posted a while back.

So if you’re reading all this and yawning, wondering why you should care if you’re not a programmer and have no interest in this sort of thing, well, if the promises of a supervillain-style guide to robotic domination don’t get you excited about AI, you’d better check your pulse. At the end of all this work, there’s code to help you command an artificial collective consciousness, a sci-fi book, and a guide to robot-assisted domination of space in it for all of you, serialized in part here and in part on arXiv. Just please, don’t do anything too crazy with it. And if you do, tape it and post it on YouTube to inspire our fear/admiration, whichever best applies…

Long time readers probably noticed that the last month was a little off. Posts weren’t coming as per the blog’s natural rhythm and the annual April Fools gag was also absent. But there was a good reason for this, one I’d be happy to share with you if it wasn’t for the fact that you cannot use your own blog to shamelessly plug your stuff or the blogging police will come to your house. Wait, what? You totally can? And there’s no definitive body of blogging standards looking after you? Well, in that case, here’s one project I’ve been trying to eke out a few minutes here and there to finish: a software kit for managing random robots and the artificial neural networks they would use to detect and respond to stimuli, called Hivemind. Instead of being custom tailored to any particular robotics platform and meant to make a specific machine or two more autonomous, Hivemind was built on the idea of having a small swarm of bots ready to do your bidding and organized by what you’ve taught them to do, so the right ones can be selected for a task you have in mind. Of course this is still a work in progress, but the basics of maintaining all the necessary information and the libraries for a complete API are almost there.

While looking for a topic for my thesis project in grad school, I came across many different ideas for how one could work with robots, ranging from various applications of graph theory to individual machines which would then figure out who’s coming and who’s going, to using robots as web services, something touted by PopSci as a groundbreaking project but in reality abandoned in the ROS open source repository as an experimental library not guaranteed to work. Hivemind is designed to take a step back, answer the question of what you’re trying to get the robot or robots to accomplish, and then select the right bots for the job. I’m hoping that with an adequate amount of time and feedback, it could even be used to recommend robot configurations, but for now it’s still all about refining the basics and making sure the underlying structure works smoothly and can provide an honest to goodness framework for training, experimenting with, and managing robot swarms. It doesn’t train bots on its own because there are a lot of ANN packages out there used by a lot of researchers and I doubt I’d make them switch over to something completely new. Instead, they could simply export their ANN’s data into a string-based format for Hivemind, outlined in the paper, and plug it into the framework as a new asset.

Ultimately, this is something I’d like to finish polishing and post on GitHub for beta testing, to collect feedback from anyone who’d like to use it to have their trained robots tool around showing off what they do, or to find out how well their neural networks perform in the real world. The sad part is that because there’s no standardized set of libraries for communicating with all robotic platforms, the users would have to either write their own, or use utilities provided with their machines. For its part, Hivemind would let them correctly format their commands to be sent, and hook up the neural network outputs to the right commands via a utility library. Meanwhile, in case you’re wondering, this will be submitted to a peer-reviewed journal as soon as it’s pared down to the template required by the journal I’m targeting. Even in computer science, it can still take months between submission and being told whether the paper was accepted or not, so while that’s going on in the background, I figure that there’s nothing to lose from posting a preprint and refining the project. If anything, comments, questions, and critiques from those interested in the research would only make it a better tool. Oh, and for those of you who’d like to try it out when it’s posted but are horrified at the idea of running it on Windows 7, look into Mono.

See: Fish, G. (2012). Managing artificial neural networks with a service-based mediator. arXiv:1204.0262v1

When wondering whether artificial intelligence might need a therapist, I was mostly joking. After all, why do dispassionate machines need someone to help sort out their emotions when they have none? But it appears that behavioral therapist Andrea Kuszewski not only thinks that robots may need psychologists, but that she would be perfect for the job because she worked with autistic children and, apparently, machines think like an autistic child. Ok, that’s a new one. Points for originality there, but it looks like we really can’t award anything on the technical side of the question because it seems quite apparent that Kuszewski is not familiar with how an artificial intelligence learns or with the basics of computing, which would be a major handicap for an aspiring computer lobotomist. Granted, not having that level of professional familiarity with autism, I can’t really dispute her analogy of the way autistic children think with anything more than pointing out that kids with autism just so happen to produce emotional responses and certainly seem to be capable of creativity and profound thought rather than just memorizing answers to questions and regurgitating them on cue, as she describes in a story of one of her patients, an autistic boy who was convinced that his brain worked just like a computer…

He was no longer operating on an input-output or match-to-sample framework, he was learning how to think. So the day he gave me a completely novel, creative, and very appropriate response to a question followed by the simple words, “My brain is not like a computer”, it was pure joy. He was learning how to think creatively. Not only that, but he knew the answer was an appropriate, creative response, and that — the self-awareness of his mental shift from purely logical to creative — was a very big deal. My experience teaching children with autism to think more creatively really got me to reverse engineer the learning process itself, recognizing all the necessary components for both creativity and increasing cognitive ability.

This description of machine logic would be just fine if she didn’t try to then apply it to how artificial intelligence actually works and make a hard distinction between pure logic and creativity. Logic can certainly be creative if applied to certain contexts where there are numerous solutions to a problem and none of them is evaluated to be any more correct than the others. While propositional logic has many rules, nowhere does it insist that an answer must be binary or that there must only be one answer. In fact, you can set up logical problems where matching one of an entire set of acceptable solutions will evaluate as a correct answer and train any artificial neural network to strive towards one of these correct answers. Depending on how you teach it and its size, it’s entirely feasible that you can teach it several ways to solve the same problem and which one it will choose is going to depend on the inputs it receives, i.e. the context of the problem. And there you go. Pure logic has now been made context aware and somewhat creative, especially when we start looking at behaviors you will see when applying those neural networks to real world problems with small robots in experiments.
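The idea that pure logic can still be "creative" when several answers count as correct can be shown with a deliberately tiny stand-in for a learning system: a single weight adjusted by random guessing, scored against the nearest of several equally acceptable targets. The problem, the targets, and the hill-climbing search are all invented for illustration; a real ANN would be doing the same thing with far more parameters.

```python
import random

# hypothetical problem: learn a weight so that input * weight lands on
# any one of several equally acceptable targets
acceptable = [2.0, 6.0, 10.0]

def error(output):
    # distance to the NEAREST acceptable answer; nothing here insists
    # that there must be only one correct solution
    return min(abs(output - target) for target in acceptable)

random.seed(42)
weight = random.uniform(-1, 1)
for _ in range(2000):
    candidate = weight + random.uniform(-0.1, 0.1)
    # keep a tweak only if it moves us no farther from some valid answer
    if error(2.0 * candidate) <= error(2.0 * weight):
        weight = candidate

# which of the three targets we converge to depends on where the search
# started and how it wandered, i.e. on the context of the problem
print(2.0 * weight)
```

Rerun it with different seeds and it will settle on different members of the acceptable set; by the computational definition above, that is a (very modest) form of creativity.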

Creativity, for computational purposes, could be defined as successfully accomplishing an objective with given resources in more than just one way. In the classic textbook example of the concept, if we have a brick and an enemy to harm with said brick, anything from throwing it at him, to grinding it up in his food, or even using this brick to destroy something he really likes (no one says that harm has to only be physical, right?) are all on the table, and all valid outcomes. So when evaluating creativity per se, what we’re really doing is seeing how much variation is demonstrated by our subject in the absence of limiting factors. The same applies to many creative humans as well. I’m sure you may have found yourself in a job where management, dead set on consistency and predictability, takes away the ability to introduce new ideas or deviate from a script. No one can be creative in an environment like that and will revert to an input-output framework. But considering that we tend to give an artificial mind as much leeway as possible to see if our learning algorithms will spawn creativity, I find Kuszewski’s conception of how she could help with machine learning using her background utterly baffling…

If children are left to learn without any assistance or monitoring for progress, over time, they could run into problems that need correcting. Because our AI learns in the same fashion, it can run into the same kinds of problems. When we notice that learning slows, or the AI starts making errors – the robopsychologist will step in, evaluate the situation, and determine where the process broke down, then make necessary changes to the AI lesson plan in order to get learning back on track.

Um, we actually have formulas and tools designed to do exactly that. We train artificial neural networks meant for complex pattern recognition with backpropagation, which pushes the networks’ errors back through the gradient of a sigmoid function to correct their weights during the training process. Of course the local minima problem rears its ugly head every so often, but we can always reset the seed values and try again until the error rate is down to acceptable levels. I really can’t think of anything behavioral therapists can do here. The last time one of my ANNs threw out errors, rather than call a therapist, I put in breakpoints at the beginning of each training cycle and debugged it. Every method call and variable assignment let me see each weight, each result, each input, and each output. For a psychologist to do something similar, she would have to pause your thought process and get an expert team of neuroscientists to study every chemical and electrical signal produced by every neuron, step by step, from the inception of an idea to the final response. But we can’t do that with living things, which is why we need to apply operant conditioning to their training, while for AI this sort of training would be a waste of time. If it takes me little effort to peer into a machine’s brain, why exactly do I need a robopsychologist?
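As a rough sketch of that training loop, here is a single sigmoid neuron (rather than a full multi-layer network) learning logical OR by gradient descent: every pass, the output error is scaled by the sigmoid’s gradient and used to nudge the weights, and if training ever stalls in a bad spot, you reset the random seed and start over. The learning rate and epoch count are arbitrary choices for the demo.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# toy training set: learn logical OR of two inputs
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

random.seed(1)  # reset this seed and retrain if the error rate stalls
w = [random.uniform(-1, 1) for _ in range(2)]
b = random.uniform(-1, 1)

for epoch in range(5000):
    for inputs, target in samples:
        out = sigmoid(w[0] * inputs[0] + w[1] * inputs[1] + b)
        # error scaled by the sigmoid's gradient: out * (1 - out)
        delta = (out - target) * out * (1 - out)
        # nudge each weight against the gradient (learning rate 0.5)
        w = [wi - 0.5 * delta * xi for wi, xi in zip(w, inputs)]
        b -= 0.5 * delta

for inputs, target in samples:
    out = sigmoid(w[0] * inputs[0] + w[1] * inputs[1] + b)
    print(inputs, round(out))
```

And this is the point about debugging: at any line of that loop you can set a breakpoint and inspect every weight, every delta, and every output, something no therapist could do with a living brain.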

If I had to sum up my main goal as a robopsychologist, it would be "to make machines think and learn like humans,” and ultimately, replicate creative cognition in AI. Possible? I believe it is. I’ll be honest, I haven’t always thought this way. The main reason for my past disbelief is because most of the people working on AI discounted the input of psychology. They erroneously thought they can replicate humanity in a machine without actually understanding human psychology.

And that’s the most perplexing statement of all in Kuszewski’s post. What AI researcher wants to respawn the human mind in a computer? To my knowledge we already have entities which think and act like humans. We call them humans. AI is intended to help address questions about the origins of cognition, provide new ideas in neuroscience and biology, and help us build smarter, more helpful machines that will compensate for our shortcomings or help us address needs for which we don’t have available human resources. We’re not trying to build the mythical movie robot who loves and really wants to be human at any cost like the movie adaptation of Bicentennial Man, and the machines we want to build can be shut off, examined, and corrected without help from those who want to embody the fictional job created in Asimov’s novels. But Kuszewski seems quite sure that we need to teach robots to think just like humans and plans to outline the reasons why in a future post. I’ll certainly read her rationale, but considering her knowledge of the AI world so far, forgive me if I already have a few doubts as to how grounded in reality and computer science they’ll be. If her replies to my questions were any indication of how she plans to elaborate her points for anyone who wants more detail, there will be plenty of talk about shifting paradigms and creative thinking with stern references to non-disclosure agreements…

update 02.09.2012: Kuszewski posted the second part of her post and it seems that the ideas she labeled as trade secrets in her final reply weren’t really all that secret after all, especially when you do a quick search with the company’s name. Follow-up post using the thesis of Syntience’s CEO is in the works for tomorrow. Also, I just want to ask why so many people who like to talk about AI insist that computer scientists want to replicate human brains neuron by artificial neuron? Why do that when it’s much more productive to let the neurons just grow and organize themselves and see what happens as they’re trained? But more on that in the next post…

For all their endurance and toughness, our vaunted Martian rovers suffer from a major handicap that makes a typical mission far less effective than we want it to be. In all their time on Mars, Spirit and Opportunity covered less than 20 miles combined. What’s the current record for the longest distance covered in one day? Several hundred meters. You can cover that in ten minutes at a leisurely pace. Granted, you’re on Earth and have two feet that were selected by evolution for optimal locomotion, while the rovers are on Mars and have to be driven by remote control, with every rock, fissure, crevice, and sand trap in their way analyzed and accounted for prior to a move command being issued, since getting a rover stuck hundreds of millions of miles away is a serious problem. But isn’t there anything we could do to make the robots smarter? Can we make them more proactive when they land so far away we can’t control them in real time? Well, we could make them smarter, but that will cost you, both in expense and resources, since they’ll have to think and keep on thinking while they work…

Technically, we could do what a lot of cyberneticists do and design artificial neural networks for our rovers and probes, treating the various sensors as input neurons and the motors as output neurons. We simulate the relevant environments virtually and train the networks using backpropagation. Then, when encountering certain combinations of sensory readings, the artificial neurons transmit signals to the motors and the machine does what it should do in that situation. If we can interrupt ongoing processes to monitor new stimuli, we could even allow the rovers to cope with unexpected dangers. Let’s say we have a work mode and an alert mode. The work mode is endowed with the ability to pursue objects of interest, while the alert mode looks out for stimuli indicating that something harmful may be coming. So when the work mode finds a rock to drill, another simultaneous thread opens and the alert mode starts scanning the environment. Should a wheel slip or the wind pick up, the alerts go out and the rover stops to reevaluate its options. Sounds doable, right? And it is. But unfortunately, there’s a catch, and that catch is the energy required to run all this processing and act on its results.
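To make that work/alert split concrete, here’s a minimal sketch in Python using two threads: one pretends to drill while the other watches a made-up wheel-slip sensor and raises an alert when a reading crosses an arbitrary danger threshold. Every name, reading, and threshold here is an illustrative stand-in, not actual rover code.

```python
import threading
import time
import random

# Signal shared between the two modes; set when danger is detected.
alert_event = threading.Event()

def read_wheel_slip():
    # Stand-in for a real sensor; returns a slip ratio between 0 and 1.
    return random.random()

def alert_mode(stop):
    # Runs alongside the work mode, watching for hazardous readings.
    while not stop.is_set():
        if read_wheel_slip() > 0.95:  # arbitrary danger threshold
            alert_event.set()         # tell the work mode to halt
        time.sleep(0.01)

def work_mode():
    # Pursues the current objective until an alert interrupts it.
    steps = 0
    while not alert_event.is_set():
        steps += 1                    # one unit of "drilling"
        time.sleep(0.01)
    return steps

stop = threading.Event()
watcher = threading.Thread(target=alert_mode, args=(stop,))
watcher.start()
drilled = work_mode()
stop.set()
watcher.join()
print("work halted after", drilled, "steps; alert raised")
```

Trivial as it looks, both loops have to keep spinning the whole time, which is exactly the always-on processing cost the next paragraph is about.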

Brainpower is expensive from an energy standpoint. There’s a reason why our brain eats up a fifth of our total energy budget: its processes are very intensive and they run non-stop. Any intelligent machine will have to deal with a very similar trade-off and allocate enough memory and energy to interact with its environment in the absence of human instruction. That means either less energy for everything else, or a rover that has to come with a bigger energy source. The aforementioned MER rovers generated only 140 watts at the peak of their operational capacity to power hardware built around a 20 MHz CPU and 128 MB of RAM. With this puny energy budget, forget about running anything that takes a little processing oomph or supports multithreading. With a no-frills operating system and a lot of very creative programming, one could imagine running a robust artificial neural network on a device comparable to an early-generation smartphone, something with a 200 MHz CPU and somewhere around 256 MB of RAM. But running something like that nonstop can easily soak up a large share of the energy a Mars rover generates, and when you’re on the same energy budget as a household light bulb, this kind of constant, intensive power consumption quickly becomes a very, very big deal.
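A quick back-of-envelope calculation shows why. The figures below are rough: about 900 watt-hours per sol is the commonly cited output of the MER solar arrays early in the mission, and the 8-watt continuous draw for a smartphone-class computer is purely my assumption for illustration.

```python
# Rough back-of-envelope; rover figures are approximate and the
# compute draw is an assumed value for a smartphone-class system.
daily_energy_wh = 900          # ~MER solar output per sol, early mission
compute_draw_w = 8             # assumed draw of a 200 MHz-class system
hours_per_sol = 24.6           # a Martian sol is about 24.6 hours

compute_energy_wh = compute_draw_w * hours_per_sol
fraction = compute_energy_wh / daily_energy_wh
print(f"always-on compute eats {compute_energy_wh:.0f} Wh, "
      f"or {fraction:.0%} of the daily budget")
```

Even with these charitable numbers, an always-on brain claims roughly a fifth of everything the solar panels produce, and that’s before dust storms cut the daily take in half.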

Hold on though, you might object, why do we need a beefier CPU? Can’t we just link multiple small ones for a boost in processing capacity? Or, come to think of it, why bother with processing capacity at all? Well, since a rover constantly needs to make certain calculations and checks, you need to give those tasks the time they need. Likewise, you need to keep processing data from your sensors to feed the neural net in the background and handle the calculations coming out of it. Detecting threats in real time with what would have been a state-of-the-art system in the 1980s is a tall order, especially if you expect your rover to actually react to them rather than plow onwards as the alarms go off in its robotic head, resigned to its fate, whatever it may be. On top of that, just running something like an artificial neural network alongside other functions requires overhead to keep the computations separate, much less actually having the neural net command the rest of the rover. Of course, there could be something I’m missing here, and perhaps there’s a way to run an artificial neural network with such a light footprint that it could be maintained on a much leaner system than I outlined. But if the bare-bones systems used in today’s rovers could be made to run a complex cognitive routine and act on its decisions, it seems very likely that someone would already be doing just that.

Lately there’s been a huge buzz about cognitive computing, computers built to work like organic neurons which, instead of being programmed to carry out new tasks, learn them through trial and error. The idea is that they will be able to handle fuzzy logic problems, be much more energy efficient, and solve certain problems faster. But the concern no one seems to talk about is the major downside of departing from the von Neumann architecture we know and love today. Rather than simply loading new software onto a cognitive machine, or just having it learn something new, we would have to retrain it, overriding its previous task. In some ways, we’d be going back to the very first computers, pre-wired to do certain computations and requiring their programmers to swap out hardware and reconnect cables and vacuum tubes to perform new tasks, wiping out what they did before. At the same time, we’d gain purpose-built, lightning-fast analytic engines that are more efficient and easier to refine, requiring little to no code changes to adjust to a new challenge.

All right, so what does this mean in perspective? It means that your computer is very, very unlikely to outwardly change for the foreseeable future. It may use magnetic CPUs, but it will still run an operating system, have an internal hard drive to store all your files, and you’ll still be using web browsers or web-enabled applications to do what you do every day. It will also most likely be a tablet or tablet-like, since this UI paradigm has taken off like a rocket thanks to its intuitive nature. But it probably won’t boast chips designed to mimic neural pathways, because that would be rather impractical. An operating system already knows where your files live, how to get to them quickly, and how to load them into a user-friendly format. A hardware-based artificial neural network (or ANN) would have to either relearn its index tables or add new nodes every time a file is added, deleted, or moved. If that seems inefficient, it is. Cognitive chips are not built for that. They’re built to do what a typical computer can’t do, or can’t do in a timely manner: speed up complex classification tasks and large statistical models. Right now they’re classifying handwriting and playing Pong, but the architecture to do a lot more is there. They just need to be trained on the new tasks over tens of thousands of cycles.

However, once you train them for a task, that’s pretty much it for those chips. To understand why, consider the layout of an ANN. Imagine a maze of logic gates forming radiating pathways between each other. During the learning cycles, each input going into these gates is given a certain value, a numerical weight. If the sum of the weighted incoming signals exceeds a set threshold, the gate fires off a binary signal, which is treated as an input with its own weight by another gate, which performs the same process, and so on and so forth until we reach the final gates, whose output determines the chip’s decisions. As the ANN learns, the weights are pruned and refined to make sure the inputs are handled properly; a lot of correction and averaging goes on for each gate and each input, and once those weights are finally set, changing them will invalidate the ANN’s training. If you want it to learn more, you could add new gates for processing additional related information and re-train the ANN, something that’s quite easy to do on a typical von Neumann computer since these logic gates are virtual. When we switch to hardware, we need to add the logic gates physically, and in certain locations on the chip. And that’s not an easy task at all.
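To see how rigid those weights are, here’s a toy Python version of the scheme just described: each gate sums its weighted binary inputs and fires when a threshold is crossed, and a handful of hand-picked weights wire two layers of gates into a little network that computes XOR, a classic task no single gate can do. The weights and thresholds are my own illustrative choices, but nudge any one of them and the trained behavior falls apart, which is exactly the brittleness at issue.

```python
def gate(inputs, weights, threshold):
    # Fire a binary signal when the weighted sum crosses the threshold.
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def two_layer_net(x1, x2):
    # Hidden gates: one acts as OR, the other as AND of the inputs.
    h_or = gate([x1, x2], [1, 1], 1)    # fires if either input is on
    h_and = gate([x1, x2], [1, 1], 2)   # fires only if both are on
    # Output gate: OR minus AND, i.e. "one but not both" = XOR.
    return gate([h_or, h_and], [1, -1], 1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", two_layer_net(a, b))
```

On a von Neumann machine these gates are just a few lines of code, so adding a third input means editing a list and retraining. Etch them into silicon and that same change means new physical gates in the right spots on the die.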

Our brains manage to do this because we have close to a hundred billion neurons constantly reinforced by work with outside stimuli and purged of unnecessary information through forgetting. We learn and forget, and our brain has plenty of room to spare because it prioritizes our knowledge and always has excess capacity thanks to a vast set of neuron clusters networked together into breathtakingly complex interacting systems. But again, as mentioned several times before, it’s not the size of the networks that counts, it’s the connections. If you put together a giant cognitive chip with 300 billion logic gates, you won’t necessarily be able to get a useful result out of it. All 300 billion gates will train on the same task because there’s no differentiation to confine learning to a certain cluster of them, and you’ll more than likely end up with an underfitted network, since with so many inputs, catching and correcting all the errors is pretty much impossible. How to differentiate an ANN into structured clusters is actually an area with which I’m dealing, but more on that later, hopefully with an application or two you could experiment with. The takeaway here is that expanding or differentiating an artificial neural network is easy virtually and very hard in a chip.

So where does this nitpicking leave us? Well, it leaves us with the idea of creating purpose-built ANN chips for highly specialized computing, and with the problem of how to expand them since, unlike today’s computers, they won’t be able to simply load a new program unless there’s a storage device that switches between a set of different neural networks, which, while handy, would still limit the chip’s abilities. Whether IBM, the leader in the field right now, is planning to do that, I don’t know, because they haven’t released much detail. And since we’re speaking of IBM, we should note that the person in charge of these efforts is Dharmendra Modha, who tends to make very big claims and inspire very inaccurate and misleading coverage on pop sci news sites. His cat brain simulator was actually little more than an ANN white-noise machine, and despite what MIT Tech Review claimed in the first link of this post, it was by no means a preview of what these cognitive chips will do. Don’t take this the wrong way, Modha may well be on to something here, but his track record does make me a little skeptical. It would be nice to see the work being done by his team in more detail because it’s a fascinating area of research, but since he’s actually trying to sell it to DARPA and IBM’s high-end customers, it seems unlikely that we’ll see a torrent of detailed papers on the subject anytime soon.