the amazing, intuitive, understanding a.i.?

February 10, 2012

Having gotten nowhere with Andrea Kuszewski on finding out what AI system could benefit from a therapist while writing my previous post, I did what every blogger does when stonewalled by invocations of secrecy and non-disclosure paperwork. Using a little basic Google-fu, I found a good description of what Syntience, the company that hired her, seems to be up to, just in time for Kuszewski to post a follow-up linking to the very ideas of the company’s CEO, Monica Anderson, that she had presented as a trade secret the day before. Anderson’s thesis on AGI models even has its own site and a fairly sparse blog in which she details her conception of how an AGI should act. Now, her ideas aren’t outright wrong in a conventional sense because, let’s face it, there’s a lot of philosophy in the AI world, and in philosophy we find few answers, only opposing viewpoints or similar ideas merging into a general consensus on a particular subject. But there is plenty of room for technical objections. For starters, here’s a summary of Anderson’s theory of what’s wrong with AI today…

There has been a mismatch between the properties of the problem domain and the properties of the attempted solutions. […] I have argued (like many others) that Intelligence is the ability to solve problems in their contexts. In contrast, programming deals with discarding contexts and breaking down complex problems into simple subproblems that can then be programmed as portable and reusable subroutines in the computer. Programming is the most Reductionist profession there is and has therefore mainly attracted the kinds of minds which favor Reductionist approaches. That means that one of the most Holistic phenomena in the world – Intelligence – was being attacked by the most Reductionist researchers in the world. This is the mismatch; the biggest problem with AI research in the 20th century was that it was done by programmers.

Ouch! Certainly there’s a ring of truth to this, and as a programmer, I come from the classic reductionist school of computer science in which everything can be described as a sum of discrete parts. Programming is all about taking a problem and breaking it down piece by piece because computers need to know how to use the data you’re going to feed them through a database connection or manual input. But the notion that we just strip out the context falls flat on its face because context is how we determine control flow. In fact, context is everything, especially in modern, modular software architecture. Let’s say I have a program that stores the basic person and company information needed to keep track of customers for a particular company. Before coding, I’ll have to ask the client how this data will be used and by whom. Why? Because if it’s supposed to feed into programs used by different parts of the business, I’ll probably implement it as a service with its design based on things like what framework the other programs use and how old they are, to ensure interoperability. If this data is supposed to be viewed and edited by customers and used in a very limited scope, I might not have to build a service at all. Are there other processes triggered by a user editing certain information? Can I just call them asynchronously so the user can keep working on other things? It all depends on… the context.
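To make the point concrete, here’s a minimal sketch of context-driven control flow. The function name and context fields are hypothetical, just an illustration of how the very same operation can take a different path depending on who’s calling:

```python
# Hypothetical sketch: the same "save a customer record" operation
# branches on the context it runs in, not just on the data itself.

def save_customer(record, context):
    """Persist a customer record; what happens next depends on the caller."""
    if context["caller"] == "internal_service":
        # Other business systems consume this data, so publish an
        # event they can pick up asynchronously.
        return {"stored": record, "event": "customer.updated"}
    elif context["caller"] == "customer_portal":
        # A customer editing their own record in a limited scope;
        # no service layer or downstream event needed.
        return {"stored": record, "event": None}
    raise ValueError("unknown caller: " + context["caller"])

# Same record, two different control flows:
internal = save_customer({"name": "Acme Co."}, {"caller": "internal_service"})
portal = save_customer({"name": "Acme Co."}, {"caller": "customer_portal"})
```

Nothing about the record changed between the two calls; only the context did, and that’s what decided the program’s behavior.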

Now that we have this covered, let’s return to Anderson’s main point, which is that AI was taken over by logical reductionists who value mathematics over intuition, and that any AGI has to be intuitive. Well, that explains why Kuszewski used the word reductionist as much as she could towards me, since I fit the profile. However, what Anderson labels intuition can easily be captured by a reductionist model despite her objections on the matter. You see, while we can look at AI as a collection of discrete parts, it’s not only the parts we need to focus on, but how these parts interact. Look at many modern AI models and they seem very sparse. An enterprise project prototype for a small company looks positively brobdingnagian by comparison because it’s trying to specify every bit of data, every field, every layer, and every basic action, resulting in a big model to hold all these functional requirements. When it comes to AI, it’s not the size of your model that matters, it’s how you use it, if you’ll pardon the paraphrasing. The interest is in the emergent behaviors of these neural networks, not in specifying the control flow. We want to see how much can be done through the interplay of discrete and rather simple bits and pieces, the same kind of emergence Anderson insists can only happen within intuitive frameworks, and which is actually a constant fixture of the AI reductionism she laments in her writing.
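Here’s a toy example of what I mean by interplay: three identical threshold units, each trivial on its own, together compute XOR, something no single one of them can do. The weights are hand-picked for illustration, not drawn from any particular paper:

```python
# Three copies of the same simple part, wired together, produce a
# behavior none of the parts has alone.

def unit(inputs, weights, bias):
    """A bare threshold neuron: fires if the weighted sum clears the bias."""
    return 1 if sum(i * w for i, w in zip(inputs, weights)) + bias > 0 else 0

def xor_net(x1, x2):
    h_or = unit([x1, x2], [1, 1], -0.5)    # fires for "at least one"
    h_and = unit([x1, x2], [1, 1], -1.5)   # fires for "both"
    return unit([h_or, h_and], [1, -1], -0.5)  # "one but not both"

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# [0, 1, 1, 0]
```

The reductionist description of each part is complete and exact, yet the interesting behavior only shows up in how the parts are connected.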

Granted, there was a time when many AI researchers thought that all you need is to cram a machine with the information it will need to function and deterministically program an intellect. That time ended decades ago, when the first artificial neural networks were running through their experimental paces. Today, the ANN is a standard design pattern in AI tasks, and the focus so far has been on how to create the smallest network with a potential for a wide variety of emergent behaviors. The reasoning is that because intelligence evolved from an otherwise simple collection of specialized, interconnected cells, we should be able to break those cells down to their simplest abstractions and focus on the interplay between them. All of these are things that Anderson’s thesis advocates, but they’re being done in the 20th century reductionist model she insists is wrong. It seems that she simply hasn’t kept up with the literature in the field and arrived at an idea that caught on years ago. And why she thinks that specifying a problem domain is a bad thing while alluding to evolution isn’t clear to me. Evolution places pressure on organisms with an intellect to complete discrete tasks, so the idea has the biological basis she demands, and having a computer just step out into the world to do nothing in particular won’t work, since it needs to have some goal to achieve.

  • Greg. You say “…the ANN … focus so far has been how to create the smallest network with a potential for a wide variety of emergent behaviors.” Can you give some references to support that statement? Without evidence to the contrary, I frankly think you are wrong that this has been “the focus so far” for ANN’s. My impression is ANN’s have historically been used almost exclusively as classifiers.

    For the generation of “a wide variety of emergent behaviors” I can only think of Wolfram’s work using cellular automata, my own using vector cross-products of associations, and perhaps Anderson’s “intuition” work. None of that strictly in the ANN tradition.

    What work were you thinking of when you said “a potential for a wide variety of emergent behaviors” has been the focus so far with ANN’s?

  • Greg Fish

    My impression is ANN’s have historically been used almost exclusively as classifiers.

    True, many ANNs have been used as classifiers, but they’re also being used for robot locomotion and behavior, in approaches such as SONNs (Self-Organizing Neural Networks) and the work of Dario Floreano, Josh Bongard, and the Swarmanoid project. If you take a look at some of Floreano’s neural nets, they’re as bare bones as could be, and Bongard is getting self-balancing, self-discovering machines using an ANN made of only seven neurons. The squashing function is a little complicated but not too horrible.
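    For a rough sense of how little machinery that takes, here’s a hedged sketch of a tiny, fully connected recurrent net with a tanh squashing function. To be clear, this is not Bongard’s actual model, just an illustration of the general shape of such a controller:

```python
import math
import random

N = 7  # seven neurons, as in the example above

def step(state, weights, inputs):
    """One update: each neuron squashes the weighted sum of all activations."""
    new_state = []
    for i in range(N):
        total = inputs[i] + sum(weights[i][j] * state[j] for j in range(N))
        new_state.append(math.tanh(total))  # the "squashing" nonlinearity
    return new_state

random.seed(0)
weights = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
state = [0.0] * N
sensor = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # e.g. a single tilt reading
for _ in range(10):
    state = step(state, weights, sensor)
# a couple of the final activations could be wired to motors
```

    The whole controller is a 7×7 weight matrix and one nonlinearity; everything interesting comes from running it in a closed loop with a body and an environment.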

    There’s also emergence of a number sense in an artificial neural network intended to count random objects and that network is also not horribly complex. When working in pure abstract projects you’re rather limited to using them as classifiers or time series predictors. But working with a body interacting with an environment, you can use them to do the hard work of figuring out balance and motion for you.

  • Greg,

    Floreano’s evolutionary systems and Bongard’s robot motion are not bad. They might be verging on the kind of complexity we are talking about. But is it fair to say this has been “the ANN … focus so far”? Even you concede “When working in pure abstract projects you’re rather limited to using them as classifiers”. Why?

    I admit you also say “or time series predictors”. But who does time series without trying to parameterize them using classes?

    To tip my hand somewhat I’m thinking principally of language. But I would be interested to hear of anyone avoiding the use of classes in time series prediction for other problems too. I can’t think of any.

    Swarms could be good, but what are they being applied to? Luc Steels did some stuff with swarms and language some years ago (at Sony in France.) But as I recall he modeled words only. Classes again.

  • Greg Fish

    The “focus so far” bit can certainly be seen as an oversimplification and true, since my work is focused on robotics, I was slanted towards what I know best, just like you were thinking of language. In terms of abstract projects, you will often have to use facts and figures which don’t directly translate from the natural world without abstraction and the goal is to sort out data according to abstract classes. But now we’re getting into symbol grounding which is its own special can of worms.

    In robotics, you still use classes, but you’re not using the symbols represented by the class as an output but as another hidden layer. I’d love to go into further detail, but I’m afraid that at this point, I’d have to start drawing diagrams to illustrate the points. Short version: I don’t need the robot to classify a set of inputs as an actual symbol, but as just another step in its processing.

    In terms of swarm applications, the big thing is search and rescue, construction, and defense. The GRITS lab at GSU and U of Penn have a number of relevant projects on the matter you may want to check out.

  • That puts things into better perspective. At best saying “a potential for a wide variety of emergent behaviors” has been the focus so far is an oversimplification, even for the by no means fashionable sub-field of ANN’s.

    In fact I think it is fair to say even in the examples you mention “a potential for a wide variety of emergent behaviors” is not the direct focus. At best we have perhaps some disparate researchers stumbling in that direction, without being aware why they need to, or exactly what it is they are doing which is different. (If they do. It is hard to tell. They don’t talk about it explicitly.)

    There is still something missing, since, like Monica, I think there is a radical change here and it will be the solution to the impasse we’ve had in AI: consciousness, free will, and numerous other problems, mostly in cognition, but beyond cognition too.

    In short, you seem to be saying Monica’s work is empty, bluff, while I think she has her finger, if not on, then in the general direction of a major shift in AI.

    What interests me is that your characterization of “a potential for a wide variety of emergent behaviors” does capture this new idea rather well. You could say I think the problem is this has not been the “focus so far”. In general the “focus so far” has been to reduce problems to a small number of parameters. Do you agree this should not be the focus now, and that the examples you are citing are good because they abandon this assumption?

    Your comments about classes in robotics being used as a hidden layer worry me, because that would be in contradiction of this.

    Note: I’m not saying I think the systems can’t be small, just that we should understand even small systems may not have small or even finite parameterizations. Rather we need to understand that sometimes even very simple systems generate “a wide variety of emergent behaviors”, beyond our power to predict using anything more compact than the system itself. And crucially we need to allow these simple systems to generate this new structure, and not expect to be able to find parameterizations, reductions if you will, of it.

  • Greg Fish

    In short, you seem to be saying Monica’s work is empty, bluff, while I think she has her finger, if not on, then in the general direction of a major shift in AI.

    That’s not at all what I’m saying. I’m saying that she’s pointed in the right direction, but a lot of researchers beat her to the punch well before she started documenting her ideas. However, her characterization of logic vs. creativity and reductionism vs. intuition doesn’t really add up to a solid methodology for machine learning, especially because it’s apparent that she’ll end up moving her models in the exact same direction as Bongard and Floreano, both very strict reductionists in their approach.

    the “focus so far” has been to reduce problems to a small number of parameters.

    Yes, and in the parts of the AI field in which I’m active, the shift now is to see how to make those small numbers of parameters interact with each other to create something greater than the sum of their parts. I think you’re focusing very hard on a single out-of-context phrase from the small explosion of papers to which I’ve been directing you, perhaps to the detriment of the big picture. Obviously, classifiers did not become obsolete and no one has abandoned them, but there are more uses for a generic ANN package than classifying abstract input data. You seem to be saying a lot of the same things I am, but something is obviously not being conveyed well on my part because you seem to think that I disagree with your final conclusions.

    Do you agree this should not be the focus now, and that the examples you are citing are good because they abandon this assumption?

    I disagree that decomposition of a cognitive model is a bad thing, but I fully agree that built-in logical determinism is erroneous because it narrows down the problem space and stifles emergence. The works I cite start with the premise that a machine has one well defined set of inputs and outputs and, using them, should be able to perform a certain task. After wiring up the inputs to the outputs, along with ways to see if the problem has been solved, they let the machine loose and see how it will solve a real world problem. What they’re saying isn’t that the problem is reduced to a discrete set of easily identifiable parameters, but that a device equipped with a few simple tools should be able to essentially evolve a solution to a certain problem by trial and error.

    That’s actually another area where I disagree with Anderson. She seems to lack a real understanding of evolution and the fact that creatures don’t so much intuit solutions as come up with them by trial and error in response to selective pressures. Humans didn’t intuit their way into civilization, they found that living in large groups with a divided range of specialties was more efficient and allowed more of them to survive and keep reproducing. Intuiting civilization would mean that humans were just naturally pulled to forming villages, then towns, then cities, then city states. History suggests this wasn’t the case and civilization as we know it rose as a result of conquests and competitions for resources in which humans chose formal allegiances, not as extensions of family groups from hunter-gatherer times.

    Your comments about classes in robotics being used as a hidden layer worry me…

    As I warned, this was an extreme oversimplification on my part. Honestly, I just did not really want to start a discussion about symbol grounding in embodied and abstracted systems because that’s one of those things we could discuss until pigs fly and still lack a mutually satisfactory grounding mechanism. It’s just one of those topics that’s more of a philosophical/linguistic dilemma than a propositional logical construct.

  • Sorry if I’m concentrating too hard on one phrase Greg. I find it useful to base my argument around words used by the other person, otherwise everything degenerates to an argument about words, as you say “more of a philosophical/linguistic dilemma than a propositional logical construct.” Actually I have a problem with propositional logical constructs too! Truth is relative, but that’s getting ahead of ourselves. Equally it is usually good to focus on one or two such statements or the discussion becomes too dissipated.

    Let’s try to establish some terms of reference. What do you think of Stephen Wolfram’s ideas, in particular what he calls computational irreducibility?

  • Greg Fish

    Wolfram’s computational irreducibility is essentially chaos theory for computation and it’s certainly applicable in the right problem space with programs that aren’t too simple to allow for emergent behaviors. For programs with an extremely rigid control flow due to their sheer simplicity, there can be no emergent behavior. But with any slightly more complex flow branches, you can certainly get some interesting behavior. The issue is, is this behavior a bug or just another way to solve a problem?
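    For reference, the kind of system Wolfram builds his case on is the elementary cellular automaton. The sketch below runs his Rule 30 from a single live cell: a three-neighbor update rule you can state in one line, whose long-run pattern resists any shortcut prediction other than running it:

```python
# Rule 30: each new cell is left XOR (center OR right) of the old row.
# A trivially simple rule, yet the pattern it grows looks effectively random.

def rule30_step(cells):
    """One generation; the row wraps around at the edges."""
    n = len(cells)
    return [
        cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n])
        for i in range(n)
    ]

cells = [0] * 31
cells[15] = 1  # a single live cell in the middle
for _ in range(15):
    print("".join("#" if c else "." for c in cells))
    cells = rule30_step(cells)
```

    That gap between the simplicity of the rule and the unpredictability of its behavior is the irreducibility in question.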

  • I largely agree with that. I also think we might see Wolfram’s ideas as “chaos theory for computation”.

    There are some caveats. I don’t know if Wolfram would agree. Chaos theory itself is hardly well defined. I don’t think the implications are understood or worked through. Chaos theory as developed so far may or may not contain all the relevant ideas.

    Anyway, taking that agreement, in the context of trying to get further agreement between us. Why do you think Wolfram characterizes his work as “A New Kind of Science” and not simply as “emergence” or ANN’s?

    You might appreciate I’m trying to get you to agree there is some novelty here. This is not all standard theory for ANN’s. Even in your robotics sub-field I don’t think these ideas, Wolfram’s for reference sake, but some other independent developments along the same lines, are part of the standard way of looking at problems yet.

    Even a Floreano or a Bongard. Are they taking Wolfram’s ideas on board? What is it about the way they structure their problems which reflects this chaotic character? If they understand their systems may be chaotic, may need to be chaotic, shouldn’t they mention that? Emergent does not necessarily mean chaotic. In particular evolution does not imply chaos. Evolution might not even imply emergence in a strict sense. Though chaos might supply a rich well of structure to power evolution, if we are aware of the need, and specifically allow our systems to generate it.

    So to bring this back to the original point of contention, I’m arguing these ideas are new. This is what I see as new, reading between the lines, in Monica’s work. And I like her work just because I don’t see this need for a new way of looking at problems, being discussed elsewhere in the state-of-the-art.

    Her language may be somewhat idiosyncratic and confrontational, but who else talks about chaos in AI? (And I think Monica is talking about chaos in AI, though she may well disagree with me on that!)