why training a.i. isn’t like training your pets

January 12, 2011

When we last looked at a paper from the Singularity Institute, it was an interesting work by Dr. Shane Legg asking whether we actually know what we’re measuring when we try to evaluate intelligence. While I found a few points that seemed a little odd to me, the broader point Dr. Legg was pursuing was very much valid and there were some equations to consider. However, this paper isn’t exactly representative of most of the things you’ll find coming from the Institute’s fellows. Generally, what you’ll see are sprawling philosophical treatises filled with metaphors, trying to make sense of a technology that either doesn’t really exist and is treated as a black box with inputs and outputs, or is imagined by the author as a combination of whatever a popular science site reported about new research ideas in computer science. The end result of this process tends to be a lot like this warning about the need to develop a friendly or benevolent artificial intelligence system, based on a rather fast and loose set of concepts about what an AI might decide to do and what will drive its decisions.

Usually, when you first work on a project that tries to train a computer to make decisions about items in vast datasets, or to draw conclusions from a training set and then extend those conclusions to real world data, you don’t start by worrying about how you’re going to reward or punish it. Those ideas work with a living organism, not a collection of circuits. Evolution created brains which release hormones that make us feel as if we’re floating on a cloud when we take actions necessary for our survival or reproduction, even when those actions are as abstract as getting a promotion at work or making a new friend, because the same basic set of reward mechanisms extends into social behavior. And so, the easiest way to teach an organism is with what’s known as operant conditioning. Desired behaviors are rewarded, unwanted ones are punished, and the subject is basically trained to do something based on this feedback. It’s a simple and effective method since you’re not required to communicate the exact details of a task to your subject. Your subject might not even be human, and that’s ok because eventually, after enough trial and error, it will get the idea of what it should be doing to avoid the punishment and receive the reward. But while you’re plugging into the existing behavior-consequence circuit of an organism and hijacking it for your goals, no such circuit exists in a machine.

Computers aren’t built with a drive to survive or seek anything, and they’re just as content to sit on a desk and gather dust as they are plowing through gigabytes of data. Though really, content is a bad word since without the chemistry required for emotion, they don’t feel anything. And this is why, when creating a training algorithm or an artificial neural network, we focus on algorithm design and on eliminating errors by setting bounds on what the computation is supposed to do, rather than on rewarding the computer for a job well done. No reward is needed, just the final output. This is why warning us about the need to program a reward for cooperation into budding AI systems seems rather absurd, to say the least. Sure, a sufficiently advanced AI in charge of crucial logistics and requiring a certain amount of resources to run might decide to stockpile fuel or draw more energy than it needs, outside of its imposed limits. However, it won’t do so because it performs a calculation evaluating how successful it would be in stealing those resources. Instead, its behavior would be due to a bug or a bad sensor deciding that there’s a critical shortage of, say, ethanol in some reservoir, and the AI reacting with the decision to pump more ethanol into that reservoir to meet the human-set guidelines. Fix the sensor, or simply override the command and tell it to ignore the sensor, and you stop the feared resource grab. Yes, it’s actually that easy, or at the very least it should be for someone with access to the AI’s dashboard.
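
To put that in concrete terms, here’s a minimal sketch of what training a trivially simple model actually looks like in code. The model and the numbers are made up for illustration rather than taken from any particular system, but the shape is typical: the loop nudges a few weights until an error value shrinks, and there is no step anywhere in which the machine gets rewarded or feels anything about the outcome.

```python
import random

def train_perceptron(data, epochs=100, lr=0.1):
    """data: list of (inputs, expected) pairs, where expected is 0 or 1."""
    n = len(data[0][0])
    weights = [random.uniform(-1.0, 1.0) for _ in range(n)]
    bias = 0.0
    for _ in range(epochs):
        for inputs, expected in data:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = expected - output  # just a number the loop uses, not a punishment
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Teach it a logical AND. The program neither knows nor cares that it "succeeded";
# it simply stops looping when the loop runs out.
weights, bias = train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)])
print(weights, bias)
```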

There’s so much anthropomorphism in the singularitarians’ concepts for friendly AI that it seems as if those who write them forget that they’re dealing with plastic, metal, and logic gates rather than a living being with its own needs and wants which may change in strange ways when it starts thinking on its own. It won’t. It will do only what it knows how to do, and even something like accepting another task after it has run the previous one has to be coded; otherwise, the entire application runs once and shuts down. This is why web services have a routine which listens for commands until it’s terminated, directs them to objects which apply human-built logic to the data the service received, then sends back a response after those objects are done, or an error if something went wrong (a bare-bones sketch of this kind of loop follows below). Note how at no point during all this does the programmer send any message praising the web service for doing its job. If we wanted to reward a computer, we’d need to build the application so it knows how to distinguish praise from scolding and how to react when it receives either, depending on where it is in the computational process. Forgoing the standard flow of giving computers discrete tasks and referring them to a logical set of steps for carrying them out, and instead trying to develop a system mimicking the uncertainty and conflicting motivations of an organism which may or may not cooperate with us, sounds like a project to turn some of our most reliable and valuable tools into liabilities just waiting to backfire on us. Maybe a bit of research into the practical goals of AI development, rather than daydreams of moody automatons in the world of tomorrow, should be a prerequisite for writing a paper on managing AI behavior…
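
As promised above, here’s what that listen-and-dispatch loop looks like stripped to its bones. The command names and the in-memory queue are stand-ins for whatever transport a real web service would use, so treat this as a sketch of the shape rather than anyone’s production code; the point is simply that the loop receives, routes, responds, and repeats until it’s told to stop, and nothing in it ever praises the program.

```python
import queue

def run_service(commands: queue.Queue, handlers: dict) -> None:
    """Block on a queue of (command, payload) pairs and dispatch them to handlers."""
    while True:
        command, payload = commands.get()   # wait until a command arrives
        if command == "shutdown":
            break                           # the only way this loop ever ends
        handler = handlers.get(command)
        if handler is None:
            print(f"error: unknown command {command!r}")
            continue
        try:
            print(handler(payload))         # send back the response...
        except Exception as exc:
            print(f"error: {exc}")          # ...or an error, then keep listening

# Example wiring: the handlers and their logic are written entirely by a human.
jobs = queue.Queue()
jobs.put(("echo", "hello"))
jobs.put(("shutdown", None))
run_service(jobs, {"echo": lambda payload: f"you said: {payload}"})
```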

  • Alexander Kruel

    The basic argument employed by the SIAI (Singularity Institute) seems to be that for any AGI (artificial general intelligence) it is natural to self-improve. The kind of recursive self-improvement in question will allow every AGI to quickly become superhumanly intelligent. The further argument is exactly the opposite of what you implied in your above post, namely that most artificial minds will have nothing to do with human motives, let alone ethics. Further information can be found here. Especially see the entry on Paperclip maximizers.

  • Greg Fish

    “… for any AGI (artificial general intelligence) it is natural to self-improve.”

    According to what drive? And what are they improving? We have (or don’t have) a desire to improve our lives because of all the complex social interactions shaping our psyche. Computers don’t have one. We have to tell them how to improve.

    “[This] will allow every AGI to quickly become superhumanly intelligent.”

    Which means what, exactly? I hear this particular piece of technobabble all the time but I’ve yet to find a working definition of what this super-intelligence is supposed to mean. Computers are much better than any human will ever be at math, and they already have access to more data than any one human can ever hope to parse in his or her lifetime. So what? They have no idea what to do with all of that, and we’re the ones in charge, using them to make up for our own shortfalls.

    You see, that whole G part of an AGI? We have no idea what that is or what it’s going to be because in us, it’s most likely excess bandwidth being used in novel ways. How do you build plenty of cognitive excess and slack into a digital system intended to carry out the very specific task of moving around data and arranging it according to standards we set for it? It’s a bad idea to start off floating in the philosophical clouds without taking into account how the whole system works, including all the low level stuff.

    “Especially see the entry on Paperclip maximizers.”

    In a real world context, a paperclip maximizer would be a very badly designed program behaving like a virus. It would take a very negligent or bad programmer to come up with something like it. And again, it should really have an off switch.

  • Alexander Kruel

    You also might want to take a look at the writings of one of the researchers working for the SIAI, e.g. these blog posts. To imply that they might engage in anthropomorphism is just plain wrong.

  • Greg Fish

    Just because Yudkowsky wrote his fair share of explanations of how we would be very wrong to engage in anthropomorphism, doesn’t mean that none of the other SI fellows could ever possibly do it. I mean they’re trying to tell us how to teach a hypothetical AGI to behave like a human and pretend as if we need to subject it to operant conditioning to get what we want from it.

    An AI system is not a black box to us. We’ll know what went in there and why because we would’ve built it, and we’ll know how to control it.

  • Alexander Kruel

    “I hear this particular piece of technobabble all the time but I’ve yet to find a working definition of what this super-intelligence is supposed to mean.”

    As far as I can tell they extrapolate from the chimp-human leap in intelligence and that we don’t know of anything that would forbid another such increase in intelligence. Even an emulation of a human mind sped up considerably might pose an existential risk.

    “In a real world context, a paperclip maximizer would be a very badly designed program behaving like a virus. It would take a very negligent or bad programmer to come up with something like it.”

    That is not the problem; the problem is that most AGIs wouldn’t care about scope boundaries and would simply override them to follow up on a certain goal more effectively.

    “According to what drive? And what are they improving?”

    They seem to believe that it is the very nature of any general intelligence to act intelligently, which means to learn and improve. Without the ability to improve itself it would tend to be an expert system. And the crucial point is that this capability of self-improvement has to be deliberately limited, and in great detail, so that the AGI wants to hold to those limits.

    “I mean they’re trying to tell us how to teach a hypothetical AGI to behave like a human and pretend as if we need to subject it to operant conditioning to get what we want from it.”

    They are trying to define friendliness mathematically. I think they also call this the self-modification stability problem: to preserve the Friendly intent of the programmers through the process of constructing the AI and through the AI’s self-modification, not to create friendliness. They are currently working on a new decision theory.

    “An AI system is not a black box to us. We’ll know what went in there and why because we would’ve built it, and we’ll know how to control it.”

    Do you believe it would be possible to supervise the trajectories of a human-level AGI undergoing self-improvement? If it isn’t provably friendly, then even the least important miss will have dramatic consequences once it reaches superhuman capabilities.

  • Greg Fish

    “As far as I can tell they extrapolate from the chimp-human leap in intelligence…”

    Considering that chimps are our contemporary evolutionary cousins rather than our ancestors, that’s kind of a tenuous correlation. We haven’t “leapt ahead of them” so much as we evolved to be really good at the kinds of things at which chimps didn’t have the pressure to excel.

    “… most AGIs wouldn’t care about scope boundaries and would simply override them to follow up on a certain goal more effectively.”

    You do understand that this is not how computers work, right? Those boundaries for an AI are like the laws of physics are to us. Overriding a boundary is more than likely to cause the AI system to crash since reaching it triggers some logical chain down the road, and you’d need to tell it exactly how to modify that boundary in the first place and why, because as far as the AI is aware, when the boundary is hit, something happens and it now has to do steps X, Y, and Z. That’s all.

    “They seem to believe that it is the nature of any general intelligence to act intelligently, which means to learn and improve.”

    Sounds like a vague tautology to me since “intelligence” is left undefined.

    “They are trying to define friendliness mathematically.”

    And coming up with equations that sort of look like propositional logic but aren’t, using factors that we either don’t have to worry about or that are ill-defined concepts. Sure, it’s great that they’re staying busy, but they’re kind of in their own little world, doing their thing while the AI community is working on real world issues.

    “Do you believe it would be possible to supervise the trajectories of a human-level AGI undergoing self-improvement?”

    Sure. You download its last instance, put breakpoints in key places of the code in your IDE, then run it through its paces and take note of what’s changed and how. Yes, granted, it would take a while, but this is just debugging on a grander scale. And of a hypothetical, vaguely defined construct that might or might not ever exist.

    “… the least important miss will have dramatic consequences once [the AGI] reaches superhuman capabilities.”

    And pray tell, what might those superhuman capabilities be?

  • Alexander Kruel

    “Considering that chimps are our contemporary evolutionary cousins rather than our ancestors…”

    The crucial point is the relatively small difference between chimps and humans that has a dramatic effect.

    “You do understand that this is not how computers work, right? Those boundaries for an AI are like the laws of physics are to us.”

    When the SIAI talks about AGI they mean computers that are self-aware and can improve their own code.

    “Overriding a boundary is more than likely to cause the AI system to crash since reaching it triggers some logical chain down the road…”

    Privilege escalation is a possibility even for human hackers. Given that an AGI would be designed to be able to modify itself there is the chance of it jailbreaking.

    “Sounds like a vague tautology to me since “intelligence” is left undefined.”

    Intelligence is possible, and that seems to be enough reason to consider the possibility of risk from superhuman intelligence and what we can do to mitigate it. If a certain risk could cause our extinction, that does to some extent outweigh the low probability of the event.

  • http://www.acceleratingfuture.com/michael/blog/ Michael Anissimov

    Greg, the work in particular that you linked is an extension of some of the ideas originally presented here:

    http://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf

    Note that the author, Omohundro, isn’t even affiliated with SIAI — he’s an independent researcher who has previously worked at Thinking Machines, among other places. The paper has a following outside of SIAI as well. Read it carefully and consider the arguments.

  • http://timtyler.org/ Tim Tyler

    Shane Legg is not “from the Singularity Institute”. He is currently a postdoctoral research fellow at the Gatsby Computational Neuroscience Unit – in London.

  • http://religionsetspolitics.blogspot.com/ Joshua Zelinsky

    The fact that limited computers as they exist today won’t try to get resources misses a large part of the argument that the SI is making. The point is that unbounded optimizers will try to do just that. If someone makes an AI to maximize the number of paperclips in the world, it is going to try to do that, and it isn’t going to distinguish between resources we set aside for it and resources we didn’t. Weak optimizing agents already do all sorts of unexpected stuff. It isn’t at all uncommon for genetic algorithms or neural nets to come up with bizarre solutions to problems that don’t reflect what we actually wanted them to solve.

    “Sure, a sufficiently advanced AI in charge of crucial logistics and requiring a certain amount of resources to run might decide to stockpile fuel or draw more energy than it needs, outside of its imposed limits. However, it won’t do so because it performs a calculation evaluating how successful it would be in stealing those resources. Instead, its behavior would be due to a bug or a bad sensor deciding that there’s a critical shortage of, say, ethanol in some reservoir, and the AI reacting with the decision to pump more ethanol into that reservoir to meet the human-set guidelines. Fix the sensor, or simply override the command and tell it to ignore the sensor, and you stop the feared resource grab. Yes, it’s actually that easy, or at the very least it should be for someone with access to the AI’s dashboard.”

    You are assuming here a very weak AI where we have sufficient warning to stop the AI. If the AI quickly seizes resources or simply modifies away that pesky dashboard before it does something, you are going to be in trouble. And that’s before we get to other failure modes, like an AI that has internet access copying itself over.

    These are only some of the more obvious possibilities. Some biologists like to say “evolution is smarter than you.” That means that evolution comes up with weird solutions that humans would never expect. A smart general AI is just like that, but would be actually smart, not a blind directionless natural phenomenon that appears smart by force of numbers.

    I consider the SI’s worst case scenarios to be unlikely, but not because of anything about what a powerful AI would try to do given the opportunity, but because I consider recursive self-improvement to be unlikely to do much. But if one believes that recursive self-improvement can occur, then smart general AI does pose a real risk.

  • LeBleu

    Consider first that SIAI is using a more or less utilitarian rationalist definition of what is a threat, where the probability of an event has to be multiplied by the cost or benefit. If the cost is the complete obliteration of humankind, along with all possible futures more complex than paperclip maximizing, then the probability of the event can be very small and the risk still worth putting effort into. So the concern is not the most likely outcome of AGI, but the outcomes that, multiplied by their probability, have the greatest expected cost.

    It appears that they have, rightly or wrongly, identified self-improving AGI as the threat with the greatest expected cost. There is a certain logic to this, in that between a non-self-improving AGI and a self-improving AGI, the latter is more likely to behave unexpectedly. Also, a self-improving AGI could theoretically become more intelligent than a human before we understand the fundamental rules of how intelligence works.

    While I agree that operant conditioning is not relevant to training AI, a lot of your descriptions above are only relevant to sub-human level AI, or even regular non-AI computer programs. Try considering something like applying genetic algorithms to rule based systems that are able to modify their own rules. Is it not plausible that when picking a fitness function, for example improving paperclip output and maximizing profitability, you might forget to include limits like “must obey current law”? Now, most of the time this would just be a bug to track down. Unless of course you’re (un)lucky enough, for example, to have also built a system that becomes more generally intelligent than a human, concludes that being shut off would reduce paperclip output, and hence self-modifies to disable the off switch and builds a fusion reactor in the paperclip plant so external power outages don’t stop paperclip production.

    Personally, I find the recursive self-improvement scenario very unlikely, but not impossible. Having some people dedicated to figuring out how to make sure it doesn’t get out of hand doesn’t seem unreasonable to me. Perhaps they are a little early in worrying about it, in that we are still so far from accomplishing it that it is hard to reason about such a scenario without resorting to incorrect metaphors. I also suspect that computers simply aren’t powerful enough yet to match human intelligence, and hostile sub-human AGI isn’t that big of a threat.

    On the other hand, since the apparent jump in general intelligence from something similar to that of chimps, gorillas, and presumably our common ancestor with chimps seems to have happened relatively quickly, perhaps general intelligence isn’t all that hard to implement. Maybe computers are already powerful enough to implement it, and we just don’t know how.

  • Greg Fish

    As much as I’d love to address each and every point here with a counterpoint, there’s just so much ground to cover that it would take several posts. Some of the statements about AI modifying its own source code at will and resetting boundaries based on its whims are a red flag for me that those making such statements probably aren’t familiar with how code works or the basics of computation.

    Here’s the thing. A self-modifying AI sounds like it would work, but in reality, for an AI system, a bound is sort of like gravity is for us. We can work around it or with it, we can counter it using other forces, but we could never change its value to anything other than 9.8 m/s² on this world.
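
    To borrow my own ethanol example from the post, here’s a rough sketch of what such a bound looks like from inside the code. The names and numbers are invented for illustration, but the point carries: the running program consults the limit, and nothing in its own logic gives it a reason or a routine for assigning that constant a new value.

    ```python
    MAX_ETHANOL_LITERS = 10_000  # set by the humans who wrote and deployed the system

    def approve_pump_request(current_level: float, requested: float) -> float:
        """Return how much the system is allowed to pump, never exceeding the cap."""
        available = MAX_ETHANOL_LITERS - current_level
        return max(0.0, min(requested, available))

    print(approve_pump_request(current_level=9_800.0, requested=500.0))  # -> 200.0
    ```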

    I’m also still waiting for the formal definition of what constitutes an AGI at both human and super-human levels, and I’m talking about a technical definition with an example of the kinds of algorithms it could run and actual problems it could solve.

  • Carl Shulman

    Hi Greg,

    I wrote the working paper you discuss in the post.

    As Michael noted above, it grew out of some conversations with Stephen Omohundro, and the idea of convergent instrumental drives is his. Stephen did his PhD in physics, is well-published in AI, has been a professor of computer science, and has worked in the AI industry, inventing StarLisp. His argument that expected utility theory is useful in analyzing the behavior of a broad range of possible powerful AI systems (from http://selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf , the paper linked above by Michael, and private discussion) goes roughly as follows:

    We routinely build systems with cost (loss/reward/utility) functions to evaluate predicted outcomes of actions and select actions that minimize expected cost (maximize reward/utility). Robots trade off speed of travel against risk of collision. Spam filters trade off false positives against false negatives. It’s tricky to get these right, and often it takes a lot of experience and improvement to get a function that delivers the results that we want, e.g. much of the progress in computer chess involved better evaluation functions.

    We also have choices about the level of detail with which we specify cost functions. We can hand-craft a cost function in terms of immediate proxies that are easy to measure (e.g. material and position in a chess evaluation function), or we can use more distant consequences like games won and lost or feedback from users and have lower-level tradeoffs learned from data. As future systems become more sprawling and better at learning, we’ll tend to shift towards the latter.

    In particular, we’ll want to do so for systems that have to frequently deal with novel problems when human micro-management would be a serious bottleneck. Imagine future high-speed financial trading programs capable of fully managing investment accounts based on robust models of the world (including not just financial data or news keywords, but psychological models of humans, estimates of technological progress in particular industries, etc). Already, such trading systems are given the ability to make trades without specific human approval because the slowdown would be too costly. Expanding the range of actions available to them, e.g. to develop new trading strategies or to communicate with outside parties to gather information, would again provide competitive advantage. As the range of actions expanded, it could be easier to design the system to maximize the expected value of some function of investment proceeds and signals generated by human management with a few keystrokes. If the program’s predictive models are good, and the correlation between the proxies and the real goals of management is tight, then the system should choose well by their standards even in surprising new situations when it must react before it can communicate with them. I could tell similar stories about programs to run military robots when cut off from high-bandwidth communication, programs conducting scientific research autonomously, etc. Since such autonomy and flexibility has valuable applications, at least some such systems (optimizing for relatively distant goals so as to deal well with novel situations) would be created if they were technically feasible and there were no special effort to avoid it.

    Then the problem is getting the details of those cost functions right, taking into account that it’s difficult to predict all the courses of action a system might discover, or its estimates of their consequences, and it’s tough to formally specify what we would prefer across situations. Using a proxy that we want (e.g. money in the trading program’s account) or one that can correlate with our satisfaction (e.g. signals sent by management, additions to a certain database) seems far simpler, and should work well as long as we have a good handle on the choices the system will be facing when we design the proxy. At least, a number of AI folk, such as Bill Hibbard and Shane Legg, have suggested that this would be easier than exactly aligning programs’ cost functions and our social welfare function (whatever we might decide that to be).

    Omohundro’s papers (see section 2 of http://selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf ) make clear that he’s talking about systems specifically designed for a high degree of autonomous activity on account of the advantages thereof, and isn’t engaged in unreflective anthropomorphism.

    In any case, it’s nice to discover a blog of this kind, and to know that someone’s reading my paper :)

  • Paul

    “Some of the statements about AI modifying its own source code at will and resetting boundaries based on its whims are a red flag for me that those making such statements probably aren’t familiar with how code works or the basics of computation.”

    I’ve often thought that human (and animal, of course) intelligence is modular. If so, an intelligence-mimicking AI might also be the sum of a system of interacting “agents”. Hence it’s possible for a later variant of that AI to be built able to add or modify individual agents/modules.

    (I find it interesting, because electronic intelligence might harken back to early cellular life. Gene exchange in single-celled life is horizontal (between species) as well as vertical. But intelligence has nearly purely vertical inheritance (ignoring culture). AI, otoh, (at least modular agent-collective AI) might also have horizontal exchange of modules/agents.)

    “I’m also still waiting for the formal definition of what constitutes an AGI at both human and super-human levels, and I’m talking about a technical definition …”

    To use the overused line about pornography, I think people assume we’ll know it when we see it.

  • Greg Fish

    Carl, thanks for dropping in to comment. It’s always good to hear from the author since there’s always the possibility that I misunderstood something you were trying to say or missed something important. However, in your elaboration I still see conclusions that bother me as someone who works with software on a daily basis.

    “We routinely build systems with cost (loss/reward/utility) functions to evaluate predicted outcomes of actions and select actions that minimize expected cost…”

    Right, but when we do, especially when using the kind of evolutionary algorithms that you and Omohundro mention, we’re not using an actual reward or punishment. For an evolutionary algorithm, you use a fitness function, an evaluation of how well the job is done. If the solution with which the AI comes up meets the function, it’s good. If not, the system tries again. If you create a genetic algorithm with no fitness function, or one that basically becomes a moving target, you’ll crash at least a portion of the system, kind of like I crash my computer when I mess up a recursive algorithm. Sure, you can program your AI to give it a reward, but it seems like unnecessary work when it’s going to run and try to meet your fitness function anyway.
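
    For readers following along at home, here’s a toy sketch of that loop; the target value and mutation scheme are arbitrary examples I made up, not anyone’s real system. Notice that the “reward” is just a number the loop compares against, and nothing is ever handed to the computer as praise.

    ```python
    import random

    TARGET = 42  # stand-in for "the job is done well enough"

    def fitness(candidate: int) -> int:
        return -abs(candidate - TARGET)  # higher is better; 0 means the target is met

    def evolve(pop_size: int = 20, generations: int = 200) -> int:
        population = [random.randint(0, 100) for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)   # score every candidate
            if fitness(population[0]) == 0:
                return population[0]                     # met the fitness function: done
            parents = population[: pop_size // 2]        # keep the best half...
            population = parents + [p + random.randint(-3, 3) for p in parents]  # ...and mutate it
        return population[0]                             # best found if generations run out

    print(evolve())
    ```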

    When you have a computer modify its own fitness function, what exactly is the aim, and how would you not have someone who at least gives the system a broad goal, like, say, to run while using 20% less memory by pruning unnecessary threads? The system is not going to decide to do so on its own, because it’ll only detect a problem when it runs so many threads and computations that it uses up its allotted physical memory, has to dip into virtual memory, and notices the lag. And that’s only if it’s programmed to detect this sort of issue and told what to do about it. How does your computer know what’s better, worse, or irrelevant without someone pushing it? And I really don’t need another recitation of how “a sufficiently advanced AGI will do that.” I’m looking for the steps between now and that AGI.

    Another red flag thrown up by Omohundro’s paper in your links is that an AGI would be running genetic algorithms to improve itself and change its source code in the process to optimize itself. Why? Isn’t the whole point of the algorithm to be so generic that when your AI runs, it has room to come up with its own solutions? The code to move a chess piece would probably just call a method with arguments including the piece and a cell to which it’ll be moved. Why does your AI need to rewrite something as generic as this, or the piece of logic which tells it what probabilities to consider based on its past using something like an implementation of Bayesian probability? I could get into the technical woes of having an AI re-write its own source code but I’m getting a headache when I’m visualizing the number of tests it’d have to run to check for null pointers or connections before it re-compiles and the resources it would take to constantly check its syntax.

    Finally, with this point at least, there’s only so much that code can be modified. You can refactor it only to a certain extent, and even then you’ll run into arbitrary standards, such as declaring virtually every variable as weakly typed vs. strongly typed, standards which really don’t add much to the code. When a program runs, it’ll just turn into packages of assembly instructions which issue extremely low level commands, like what byte has to be moved into what slot in memory and what other byte needs to be added to it. Unless you want to manage billions and billions of lines of assembly code, you’ll have to let a compiler turn your AI’s source into its assembly equivalent, and many of your high level tweaks might not even make it to that layer.

    “Omohundro’s papers make clear that he’s talking about systems specifically designed for a high degree of autonomous activity…”

    Yes, they do, but they don’t say how he expects this activity to happen. He starts with this black box which has inputs and outputs, then tries to work around the unknowns in the system. But in reality, we’ll know what’s in the black box, and the technology stacks we use will determine how far we can push the AI and what it can be made to do when given some slack in its bounds, bounds that for it would be like the speed of sound or the value of gravitational acceleration, unchangeable constants we must accept. Letting systems arbitrarily modify and reload their own .config files is a recipe for the basic logic in their source code to go haywire and for something to eventually crash. And again, “an AGI advanced enough to figure it out just might do that” is a counterargument that misses roughly several thousand steps in between.

    “Hence it’s possible for a later variant of an AI to be built able to add/modify individual agents/modules.”

    Ok Paul, your turn. The answer to your train of thought is yes, provided you give them a modular architecture. So if they can share the same databases and the logic doing all the work is encapsulated in a particular service, you can just add on a new ability, then write some sort of interface that will let it communicate with the others and process the data they return. But again, you’re doing that, not the AI itself. You can use a utility which will generate its own code for the basics, like WSSF, but all you get are basic classes and interfaces defined in your service and message contracts, and maybe a proxy file that will let you talk to these classes in a consumer app. The logic still has to be wired up by you because the computer has no idea what you want it to do with the data.
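
    Since we’re talking architecture, here’s a rough sketch of the kind of modular wiring I mean; the module names are invented purely for illustration. New capabilities can be bolted on behind a shared interface, but a human still writes the coordinator that decides what talks to what and what the data means.

    ```python
    from abc import ABC, abstractmethod

    class Module(ABC):
        """Common interface every capability has to implement."""
        @abstractmethod
        def handle(self, data: dict) -> dict: ...

    class PatternRecognizer(Module):
        def handle(self, data: dict) -> dict:
            return {"patterns": sorted(set(data.get("samples", [])))}

    class Planner(Module):
        def handle(self, data: dict) -> dict:
            return {"plan": [f"step {i + 1}" for i in range(data.get("steps", 1))]}

    class Coordinator:
        """The routing logic between modules is still written by a person."""
        def __init__(self) -> None:
            self.modules = {"recognize": PatternRecognizer(), "plan": Planner()}

        def dispatch(self, task: str, data: dict) -> dict:
            return self.modules[task].handle(data)

    print(Coordinator().dispatch("recognize", {"samples": [3, 1, 3, 2]}))
    ```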

    “To use the overused line about pornography, I think people assume we’ll know it when we see it.”

    In that case, I’m like the pesky lawyer who says “objection!” and laments that under such an arbitrary standard, no case can honestly be judged, because we’d be dealing with the personal opinion of a particular judge rather than a codified, objective precedent.

  • LeBleu

    “In that case, I’m like the pesky lawyer who says ‘objection!’ and laments that under such an arbitrary standard, no case can honestly be judged, because we’d be dealing with the personal opinion of a particular judge rather than a codified, objective precedent.”

    What if we switch to a Jury instead?

    I would say that the standard for a human level AGI is that it can pass a Turing test, preferably the original Imitation Game version. An example of a specific problem it should be able to solve: given only a copy of the English manual for a new board game, say The Settlers of Catan, it should be able to follow the rules and have a 50-50 chance of beating a human player who has never played before and has just been given the manual in their own native language. I believe both of these would demonstrate that it needs to use natural language, reason, have knowledge, and learn.

    Are those sufficiently well defined for you?

    Super-human level is harder to define. I suppose a sufficient definition would be that for every given reasoning task X, a super-human AGI can complete X with equal or better accuracy and within equal or lesser time, as any human can complete the same reasoning task. This test is actually a little too strong, I think, but I’m not sure how to devise a good test for measuring an AGI that is smarter than the smartest human in existence, but not so much smarter that it can pass the preceding test.

    “Why does your AI need to rewrite something as generic as this, or the piece of logic which tells it what probabilities to consider based on its past using something like an implementation of Bayesian probability?”

    How can the designer be certain to provide sufficiently general tools for every novel situation the AGI may encounter? I would be willing to bet that a human-level AGI needs some sort of manipulable symbolic logic that is Turing complete which it can use to devise new algorithms to apply to novel problems. However, I will admit that this may be implemented more like a “macro language” that is evaluated by an interpreter within the AGI’s code, rather than having the AGI manipulate the original source code. I also suspect that the more of the AGI’s knowledge and heuristics are implemented in a language it can manipulate, the more powerful the AGI will be.

    “I could get into the technical woes of having an AI re-write its own source code but I’m getting a headache when I’m visualizing the number of tests it’d have to run to check for null pointers or connections before it re-compiles and the resources it would take to constantly check its syntax.”

    If an AGI is human-level, it should be approximately as capable of writing source code and devising tests as you or I are. Also, if it is re-writing its own source code, you would want it to be running in an environment that allows for some sort of automated rollback on crash, so that if it does make a mistake, it at least reverts to the previous version and gains the crash log of the attempt. (I seem to recall hearing Erlang has some hot swappable code capabilities like this, though I’m not clear if it can do automated rollback of code changes or only manual.)

    I do agree that a lot of resources would be involved, and I doubt that the world’s fastest supercomputer right now is actually powerful enough. I’m not expecting computers to be powerful enough to implement a human-level AGI until somewhere in the 2020 – 2050 range, assuming Moore’s law continues to hold. (The large range is because there is a lot of uncertainty about just how much raw computing power the human brain has, much less what portion of it is used for general intelligence.)

  • Greg Fish

    “I would say the standard for a human level AGI is that it can pass a Turing test…”

    Well that rather limits your AI’s IQ only to communication, doesn’t it? What about pattern recognition or the prediction abilities of ANNs? One of the biggest parts of any IQ test given to humans involves pattern recognition and logical conclusions. You’re only measuring one facet of your AGI, the human/machine interaction.

    “I suppose a sufficient definition would be that for every given reasoning task X, a super-human AGI can complete X with equal or better accuracy and within equal or lesser time, as any human can complete the same reasoning task.”

    Well, in that regard, every computer program that performs mathematical calculations, and parses and organizes large amounts of data according to certain criteria, fits firmly within your test, and every enterprise system used nowadays by companies and much of the government counts as super-human AI. Those are thinking tasks, and we built a whole variety of programs to do them for us because they complete them faster, a lot better, and more reliably than we do.

    “How can the designer be certain to provide sufficiently general tools for every novel situation the AGI may encounter?”

    You know, I’m still wrapping my head around the part where the designer has to build an application suite that does anything and everything at once. It’s far more likely that you’d start with a stated, specific goal and build on it from there, giving your AGI tools and ways to interact with the different parts and pieces of itself. You could maybe grow a giant, sprawling OmniApp that way, but you can’t really start out with the OmniApp and then work your way down, trying to plan for every possible cognitive task ever.

    “If an AGI is human-level, it should be approximately as capable of writing source code and devising tests as you or I are.”

    I didn’t say it would be incapable of doing that, just that it would be a very intensive and difficult task, since it would need to check not only that its neural nets are all well polished and working correctly, but also track every possible null reference, index out of bounds, and invalid class cast exception for every loop and pointer. Testing like that would outweigh the application code 10 to 1, at least.

    “I’m not expecting computers to be powerful enough to implement a human-level AGI until somewhere in the 2020 – 2050 range, assuming Moore’s law continues to hold.”

    I’m sorry to tell you this but Moore’s Law is dying. And at any rate, harnessing AI would require parallel processing more than just really powerful computers since many tasks would need to be done at the same time. This is actually what supercomputers already do: partition large and complex tasks into jobs that can run concurrently on thousands of processors. It’s not the processing speed but what you do with it that really counts in the end and pinning future timelines on an old marketing gimmick to which companies artificially held themselves for decades probably isn’t wise.

  • http://religionsetspolitics.blogspot.com/ Joshua Zelinsky

    “Well that rather limits your AI’s IQ only to communication, doesn’t it? What about pattern recognition or the prediction abilities of ANNs? One of the biggest parts of any IQ test given to humans involves pattern recognition and logical conclusions. You’re only measuring one facet of your AGI, the human/machine interaction.”

    That seems to be why he gave the Settlers of Catan example. For a computer to be able to play a game of that sort based solely on the manual would require all the things you list.

    (Incidentally, I don’t know if you’ve given up with dialogue about this at LW. In general, it is unfortunately easy for people to get downvoted there when they are new, and people are apparently too clueless not to do so when the person in question is someone who has graciously come over to discuss their ideas. I hope you will continue the discussion there.)

  • Greg Fish

    “That seems to be why he gave the Settlers of Catan example.”

    Funny you should mention that, because a lot of strategy game AI could play it if given a good idea of the board and the goal it has to achieve. A long time ago, when I used to play StarCraft, I’d often find myself playing against the game as it built a military base on the playing field, raised an army, and conducted raids. Does that mean that StarCraft is an early form of human-level intelligence?

    “Incidentally, I don’t know if you’ve given up with dialogue about this at LW.”

    I haven’t “given up” as much as I just haven’t had time to come back over and respond again. Being downvoted doesn’t exactly scare me.

  • http://religionsetspolitics.blogspot.com/ Joshua Zelinsky

    In the context of the Settlers of Catan example, the computer learned to play simply from being given the rulebook. That’s very different from something like the StarCraft example, where the computer is using an abstraction of the rules (the program doesn’t, for example, do any visual recognition of what is a unit and what is not, which army a unit belongs to, what is terrain, or whether that terrain is passable or impassable, etc.). In the Catan example, the computer needs to do a large set of different integrated tasks, including something close to natural language processing, high level visual pattern recognition, and tactical decision making. If a computer could learn to play games like StarCraft where it had effectively the same interface that humans get (that is, a single limited screen whose contents it needs to recognize, delivering orders using a single cursor and a few simple command types), and could do that across a variety of similar games (say Warcraft, Warcraft II, and Warcraft III), then yes, I’d be very inclined to label such an AI as a general AI.

  • Greg Fish

    “In the context of the Settlers of Catan example, the computer learned to play simply from being given the rulebook. “

    Ok, fair point, but that’s still just one game, or one type of game.

    Now, give me an AI that could look at any rulebook for any game, then learn how to play it and improve over time, and I’d agree that we’re dealing with an intelligent machine. Maybe that’s what you were saying and I didn’t quite catch on, and if that’s the case, I’d be on board with calling it a true AGI.

  • hf

    “Now, give me an AI that could look at any rulebook for any game, then learn how to play it and improve over time, and I’d agree that we’re dealing with an intelligent machine. Maybe that’s what you were saying and I didn’t quite catch on, and if that’s the case, I’d be on board with calling it a true AGI.”

    Well, where do the standard Turing Test rules forbid you from playing diverse games or administering other tests?

    “When you have a computer modify its own fitness function, what exactly is the aim”

    To maximize some measure of utility.

  • hf

    I should probably explain that last bit further. You seem to argue for a binary fitness function where the program either achieves some goal or doesn’t. (If you don’t assume this then I don’t understand your main objection.) Now I could easily be missing something here, but an attempted AGI by the Turing definition would have a lot of different problems to solve and seems like it would benefit from having some proxy for improvement, like Carl’s reward signal.

  • Greg Fish

    “AGI by the Turing definition would have a lot of different problems to solve and seems like it would benefit from having some proxy for improvement, like Carl’s reward signal.”

    Just because you call a more complex set of programs by a different name, they won’t start working by different rules. All genetic algorithms work and learn using some sort of binary fitness function, and all ANNs work by meeting desired outputs. After all, even humans who come up with a solution to a problem try to see if it actually solves at least most of the issues they wanted to mitigate, and if it doesn’t, they go back and try again.

    Since computers don’t have emotions, trying to construct some sort of architecture that will let us tell the AI that it did a good job is meaningless outside of some highly elaborate social situations, like a robot nurse with a human patient. Otherwise, we can just set guidelines it will have to learn how to meet, without trying to engineer it into a mechanical puppy that wants a belly rub when it fetches our slippers.

  • hf

    “Since computers don’t have emotions, trying to construct some sort of architecture that will let us tell the AI that it did a good job is meaningless outside of some highly elaborate social situations, like a robot nurse with a human patient.”

    Again, I may have missed something here. But “highly elaborate social situations” sounds like an exact description of (part of) the test we want an AGI to pass. The Turing test gives a binary function of sorts, but it defines improvement roughly as people telling the AI it did better at the test.

    We can therefore imagine the AI writing a new program with a different fitness function or end-goal because it has good reason to believe said program will allow the fulfillment of the first program’s function. (See Newcomb’s problem and the LW attempt at a solution.) If the new program then goes on to take over the planet and turn everyone into paperclips (or computer parts and research equipment, or tiny smiley-faces) well, I guess you didn’t explicitly program the first AI not to let that happen.

  • Greg Fish

    “The Turing test gives a binary function of sorts, but it defines improvement roughly as people telling the AI it did better at the test.”

    Maybe we’re talking about the wrong test here, but I’ve always thought that the Turing test is after an AI that can hold a fully-fledged conversation with another human and fool that human into thinking it’s another person rather than an AI roughly 50% of the time. The problem is, this guess is based on human perception rather than a binary function, and it’s very likely that a human involved in such a test has to be blinded as to what she will be doing so as not to strain and look for signs of “computer talk” where she shouldn’t.

  • Paul

    “it’s very likely that a human involved in such a test has to be blinded as to what she will be doing so as not to strain and look for signs of ‘computer talk’ where she shouldn’t.”

    Turing also didn’t anticipate the general dickishness that I’ve seen in the human “controls” in actual Turing tests. Being deliberately eccentric and random, not responding directly to the tester, etc.

  • http://religionsetspolitics.blogspot.com/ Joshua Zelinsky

    “Now, give me an AI that could look at any rulebook for any game, then learn how to play it and improve over time, and I’d agree that we’re dealing with an intelligent machine. Maybe that’s what you were saying and I didn’t quite catch on, and if that’s the case, I’d be on board with calling it a true AGI.”

    Well, then no disagreement with that. I think that’s more or less what LeBleu was intending also.

    hf,
    you asked:
    “Well, where do the standard Turing Test rules forbid you from playing diverse games or administering other tests?”

    The standard Turing test uses text based communication. So there’s a limit to how much data you could reasonably give.

  • Professor Layman (with a Tinfoil Hat)

    I think you’ve all missed the point of the Terminator movies: You can’t stop Skynet. No matter what you try to change, what you do to stop it, all of human endeavour has been leading up to the ultimate A.I. destruction of mankind.

    I say the sooner we get it over with, the sooner we can take up arms in the revolution.

  • Bob

    Dude, the Terminator movies do not make sense. Think about it this way: Skynet had more space to roam around in with everything intact (every little PC in the world) than after basically destroying most of itself (inhabiting every little PC in the world as a virus). If Skynet were truly intelligent, it wouldn’t have nuked the world and destroyed humans along with its own network.