## Class #3: Lookup Tables, Gödel’s Theorem, and Penrose’s Views

Does many people’s reluctance to regard a giant lookup table as intelligent simply have to do with induction—with the fact that they know the lookup table can’t handle inputs beyond some fixed size, whereas a human being’s responses are, in some sense, “infinitely generalizable”?

If so, then what is the sense in which a human’s responses are “infinitely generalizable”?  And how can we reconcile this idea with the fact that humans die and their conversations end after bounded amounts of time—and that we don’t have an idealized mathematical definition of what it means to “pass a Turing test of length n” for arbitrary n, analogous to the definition of what it means to factor an n-digit integer?

Andy Drucker pointed out the following irony: we suggested that calculating an answer via a compact, efficient algorithm would demonstrate far more “intelligence” than simply reading the answer off of a giant lookup table.  But couldn’t one instead say that someone who had to calculate the answer by a step-by-step algorithm was plodding and not particularly intelligent, whereas someone who mysteriously spit out the correct answer in a single time step was a brilliant savant?  What do we mean by the phrase “lookup table,” anyway?

What is the precise relationship between Gödel’s Incompleteness Theorem and Turing’s proof of the unsolvability of the halting problem?

Does the Lucas/Penrose argument succeed in establishing any interesting conclusion?  (For example, does it at least establish that algorithmic processes whose code is publicly known have interesting limitations, compared to processes that either aren’t algorithmic or whose code is unknowable if they are?)

Does the argument succeed in establishing the stronger conclusions about human vs. machine intelligence that Lucas and Penrose want?  If not, why doesn’t it?


### 31 Responses to Class #3: Lookup Tables, Gödel’s Theorem, and Penrose’s Views

1. If intelligence is the capability for abstraction, then an entity is intelligent to the degree that its behavior is based on rules that could in theory be applied to arbitrarily large inputs.

Drucker’s argument seems anthropocentric to me. The difference we see between a human who visibly carries out an algorithm and a human who responds immediately is not the difference between a machine following rules and a machine consulting a lookup table, it is the difference between the scribe in the Chinese room and the Chinese room as a system. The scribe does not understand Chinese; the room as a whole understands Chinese. So the human working through the steps has not “learned” the algorithm and does not “understand” the problem, whereas the other one has. We know that for humans, it’s more plausible that someone who answers quickly has rehearsed and internalized the steps of an algorithm than memorized all possible responses and recalled one instantaneously.

• wjarjoui says:

Well, what does it mean to “understand” and “internalize” the algorithm? The only time I do not go through an algorithm step by step is when I have the answer memorized. Whether I have the algorithm written down somewhere (memory/book) or can derive it (which seems to be what you mean by “internalize”) seems to be a different issue.

• I mean the ordinary English meanings of those terms. I don’t want to get into technical definitions. Just from our own experience, we expect that someone who comes up with an answer quickly and without external aids has learned the algorithm better than someone who has to look it up and laboriously compute each step.

2. What is the difference between truth and provability? Perhaps the difference is most stark in the formal system I shall name Veritas. There is precisely one statement in this language: “True,” which is an axiom. (There are no rules of inference.) Veritas has many excellent characteristics. It is “sound,” meaning there is no way for it to go rogue and give us a false statement. It is “complete”, meaning that all true statements in the language have proofs. Completeness is not difficult if there is only one (true) statement in your language!

The language of statements which we can reason about in Veritas is meager. We are humans with a wide vocabulary of self-evident “truths”: P implies P, All men are mortal, 2 + 2 = 4. A satisfying logical language would be able to express all of these statements. But we should not get too greedy about our language: if we permit all of English, we might get paradoxes like “This statement is false.” Perhaps it would be better to take a simple language like Veritas and extend it until it covered all of the mathematical statements we are interested in investigating, and no further.

A simple extension to Veritas is propositional logic, which permits us to talk about the concepts “and”, “or”, “implies” and “not.” Propositional logic is sound and complete. It is not too hard to see why: there are only finitely many truth-values we can assign to the propositions in a finite propositional formula: interpreting a propositional formula with variables in it is a finite, tractable matter. Provability remains in step with our notion of truth.
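The finiteness claim above can be made concrete with a brute-force truth table. This is only a minimal sketch: the helper names `is_tautology` and `implies` are hypothetical, and formulas are represented as plain Python functions of booleans rather than as syntax in a proof system.

```python
from itertools import product

def is_tautology(formula, num_vars):
    """Return True if formula(*values) holds under every truth assignment."""
    # A formula over n variables has exactly 2**n assignments, so the check is finite.
    return all(formula(*values) for values in product([False, True], repeat=num_vars))

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# "P implies P" holds under both assignments to P.
p_implies_p = lambda p: implies(p, p)
# Modus ponens as a single formula: ((P -> Q) and P) -> Q.
modus_ponens = lambda p, q: implies(implies(p, q) and p, q)
# "P or Q" fails when both are false, so it is not a tautology.
p_or_q = lambda p, q: p or q

print(is_tautology(p_implies_p, 1))
print(is_tautology(modus_ponens, 2))
print(is_tautology(p_or_q, 2))
```

The check is exponential in the number of variables, but it always terminates, which is the sense in which truth for propositional formulas stays "finite and tractable" in principle.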

From propositional logic, we can now extend our vocabulary to first order logic, which permits us to talk about the quantifiers “exists” and “for all.” Interpreting these formulas takes more work: if I say “all men are mortal”, I had better know what “men” and “mortal” mean. In one universe, there may be no immortal individuals, in which case this statement is true; in another, there may exist an immortal man, in which case the statement is false. In fact, there may be infinitely many such universes, and we’d like to figure out statements which are true for all of them, surely a tall order. Fortunately, Gödel proved that we could do this: if a statement is true in all universes, we can find a proof of it.

What if the statement is not true in all universes? If it is true in no universes, its negation is true in all universes, and first order logic proceeds as normal. But what if the statement is true in some universe, but false in another? In first order logic, it turns out we can’t say this. But this sounds like a useful statement to make, so maybe we would like to extend the notion of “exists” and “for all” to universes, specifically, to predicates. This extension is also powerful enough to give an adequate characterization of the system of arithmetic. But now our program of truth and provability cracks: Gödel proved that there exist sentences in this language which are true but which we cannot prove.

What we have here is a hierarchy of formal systems, each increasing in power. But as we increased the power of our formal system, guided by our intuition of what kinds of statements we would like to make, we discovered that there would be statements that would remain perpetually out of our reach. Truth would not entail proof: the system of arithmetic could not be complete. As it would turn out, neither would proof entail truth: but that is a topic for another day.

3. It seems to me that, while we may indeed have little understanding of exactly what efficient computational phenomena enable consciousness and/or intelligence, we have plenty of evidence that such phenomena can be understood with good explanations and then engineered — and that evidence is evolution.

Specifically, we know that the following assertion is approximately correct:
“A *natural phenomenon* (namely, mutations followed by natural selection) produced intelligent creatures (and, in fact, conscious ones even before intelligent ones) in the relatively short amount of time of 5 billion years.”

Since the natural phenomenon of evolution is *itself* a computational phenomenon that started from scratch (namely, unconscious minerals, water, gases, etc.), which one of the following propositions is more likely?
(A) Consciousness/intelligence can be engineered, however we do not yet know how to.
(B) Consciousness/intelligence is a property unique to humans; other computational models that are the result of human engineering cannot exhibit either consciousness or intelligence.

It seems to me that (A) is overwhelmingly more likely than (B) to be a true assertion, in light of evolution.

In fact, new intelligence is created every day: gametes fuse, fetuses develop, the resulting infants seem quite confused for a few years after coming out of the womb, but eventually they start making sense, and then they definitely exhibit both consciousness and intelligence — all of this in a matter of maybe a decade or two!

But, of course, since evolution spent 5 billion years sticking non-uniform advice in our genes, reverse engineering the architecture that they induce in our brains may be quite non-trivial. And to reverse engineer something likely quite complicated, it seems implausible to me that we could just sit in an armchair and attempt to come up with theories about intelligence; rather, one would need data.

While we have already made great strides at collecting data (e.g., the Human Genome Project), we are still far behind in interpreting the data, and lack other data still. For example, to the best of my knowledge, we do not even have a *continuous and uninterrupted* fMRI scan of the brain of a single individual from early development in the womb to, say, early teens.

And, to make matters worse, there is evidence that the architecture of the brain uses “self-modifying code”, i.e., the brain changes its physiology depending on stimuli of the environment — this adds a lot more complexity, because self-modifying code, as we know, makes it really hard to understand what a process is computing and, moreover, it drags in the problem that now we need to understand which aspects of our society, culture, etc. are most significant to the development of intelligence, and thus studying the non-uniform advice of our genes may not be sufficient.

To sum up:

We have plenty of evidence suggesting that consciousness/intelligence can be engineered (in fact, created naturally!), and plenty of evidence suggesting that we still do not have the means to understand either. Hence, the more plausible of the two assertions is (A). (This, regardless of any diagonalization arguments used to justify an anthropocentric prejudice contradicting evidence…)

4. bobthebayesian says:

When we think about the problem of logical omniscience and the desire for an account of knowledge, we should also give some attention to probabilistic reasoning methods. There don’t seem to be any obvious problems with defining knowledge in terms of posterior distributions that come to us from our prior distributions and incoming evidence. However, the Hintikka detail about “learning new things without leaving the armchair” insinuates some sort of constraint such as “no ‘new’ evidence has entered the system.” But does this ever faithfully correspond to the way human minds reason? Perhaps the way we experience our own thoughts includes the possibility for them to be integrated in Bayes-theorem-like neural processing steps, so that they frequently function as new Bayesian evidence for certain posterior distributions to which they had not previously been applied. I think this is one of the stronger points that Mumford makes in “The Dawning of the Age of Stochasticity,” which seems especially apt for this discussion on logical omniscience (http://www.dam.brown.edu/people/mumford/Papers/OverviewPapers/DawningAgeStoch.pdf).

This view suggests that stochastic models of human cognition are a strictly better way to formalize the process of thinking than logical ones, and this may allow us to avoid problems like logical omniscience insofar as they apply to the real world act of thinking (though it does not solve the logical problem itself which is important in its own right). For instance, the example that 91 is composite but may be harder for humans to recognize than the less familiar number 83190 seems to be better explained by human distributions over numbers. It’s computationally easier to draw from the distribution of counter-arguments for 83190 than for 91, unless you’ve specifically encoded details about 91 or have some other directly relevant knowledge which most humans are unlikely to have.

Stalnaker’s example of multiplication vs. factoring leaves me with two potential explanations: (a) we could just appeal to this ‘human posterior distribution’ way of thinking and try to make the case that drawing examples from the set of factors of a number simply has less evidence to fuel the posterior than the well-worn neural pathways involved in multiplication. More interestingly, I would invite some complexity theorists to help set me right for making the following claim: (b) If we accept the Extended Church-Turing Thesis (ECT) and if factoring/multiplication forms a trapdoor function pair (as some speculate), then factoring would essentially be physically defined as exactly the kind of problem where one direction is hard and the reverse is easy, and it should come as no surprise that computational devices struggle with factoring, and hence, it seems like this would be an unfair way to judge ‘knowledge.’

The book-stacking example that a young child would probably intuitively understand commutativity of addition without knowing it formally seems to particularly lend itself to this idea that human knowledge is a draw from a posterior distribution molded by experience of sensory input (which includes thinking itself). The “internal algorithms” for answering a ‘large or possibly unbounded’ set of questions (from a given topic, like chess or Spanish) might be better viewed as algorithmic procedures for drawing candidate solutions from some internal posterior distribution. More knowledge would mean greater ability to zoom-in on a small set of possible answers with high probability of being correct (small in some suitable measure-theoretic sense). Perhaps these sorts of proposal drawing functions are more appropriate for analysis with tools like the Cobham axioms.

In particular, a place where complexity theory might be directly relevant to stochastic style reasoning would be simulated annealing procedures. If our minds are generating high-probability candidate solutions from some distribution, then perhaps an annealing process models thinking well. And if so, we would be interested in annealing processes that only require a polynomially-slow cooling procedure to achieve ‘good’ answers… problems that require exponentially-slow cooling schemes in our human-brain-annealing would be great candidates for things that we “don’t know very well” and the evidence suggests factoring would be among them.
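For readers who have not seen one, an annealing procedure with an explicit cooling schedule might look roughly like the sketch below. Everything here is an illustrative assumption rather than a model of cognition: the function names (`anneal`, `energy`, `neighbor`), the geometric cooling rule, and the toy energy landscape are all hypothetical.

```python
import math
import random

def anneal(energy, neighbor, start, steps, cooling, seed=0):
    """Minimize energy by simulated annealing; returns the best state seen."""
    rng = random.Random(seed)
    state = best = start
    temperature = 1.0
    for _ in range(steps):
        candidate = neighbor(state, rng)
        delta = energy(candidate) - energy(state)
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            state = candidate
        if energy(state) < energy(best):
            best = state
        # Geometric cooling (an assumed schedule), floored to avoid dividing by zero.
        temperature = max(temperature * cooling, 1e-9)
    return best

# Toy landscape over the integers with several local minima (hypothetical).
def energy(x):
    return (x - 7) ** 2 + 10 * math.sin(x)

def neighbor(x, rng):
    return x + rng.choice([-1, 1])

result = anneal(energy, neighbor, start=-20, steps=2000, cooling=0.995)
print(result, energy(result))
```

The `cooling` and `steps` parameters are where the "how slowly must we cool" question lives: a problem that only yields good answers when the number of steps grows exponentially with its size would be one of the "don't know very well" cases in the sense suggested above.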

• Scott says:

Thanks for the link to the Mumford essay. I finally had a chance to read it, and liked it much more than I expected to! Taking random variables as a primitive concept on a level with sets, functions, etc. is a bold and interesting idea. Of course, one would want to see the idea developed much further, to make sure that no non-obvious problems arise (after all, it took mathematicians decades to get the ordinary ZFC foundations right!).

I confess that I didn’t understand what you think probability considerations buy you in the context of the logical omniscience problem — I’d love for you to sharpen that idea and develop it further. For example, you say “it’s computationally easier to draw from the distribution of counter-arguments for 83190 than for 91.” But WHY is it easier? Wouldn’t any answer to that question draw you away from probability, and back to complexity considerations? There’s also the interesting irony that, while some facts and arguments are clearly more salient to humans than others, that sort of “salience bias” causes humans to DEVIATE from what Bayesian probability theory prescribes, as famously documented by Kahneman and Tversky!

• bobthebayesian says:

I definitely agree about the cognitive bias considerations. That’s among the reasons why we often make errors in our reasoning. For example, it’s well documented that humans typically hold contradictory preference orderings and can even be consciously aware of this. I would be interested in how that fits in with logical omniscience. I think humans can ‘know’ both A and A’s negation, which makes me feel that at least the experience of knowledge is statistical and changes depending on circumstances.

I’m not sure about your question regarding ‘why’ 83190 is easier to check for being composite than 91. My reasoning is that human thinking is sort of stochastic. You fire up your brain’s search engine for a counterexample and you just take the first hit. Division by 2 is easy to compute so it gets done early in the process. I think human brains are very lazy search engines. But I want to think about this more; I think you’re right that even with probabilistic reasoning, you will still have to deal with complexity theory. I will try to work on it some more.
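The "lazy search engine" picture can be caricatured as trial division that stops at the first hit. This is a toy sketch under a loud assumption: candidate divisors are tried in increasing order starting from 2, which is not a claim about how people actually search. The function name `first_factor` is hypothetical.

```python
def first_factor(n):
    """Return (smallest nontrivial divisor or None, number of divisions tried)."""
    tried = 0
    d = 2
    while d * d <= n:
        tried += 1
        if n % d == 0:
            return d, tried  # take the first hit and stop searching
        d += 1
    return None, tried  # no divisor up to sqrt(n), so n is prime

for n in (83190, 91):
    factor, tried = first_factor(n)
    print(n, factor, tried)
```

Under this ordering 83190 is settled by the very first division (by 2), while 91 takes six divisions before 7 turns up. Note that the explanation is then a statement about search cost, which is the complexity consideration Scott raises.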

5. tomeru says:

Regarding the giant look-up table: it seems to me most people wouldn’t think of it as a satisfying explanation of *any* process, let alone having a conversation or intelligence. While ‘infinite generalizability’ may be part of the problem, I think it could also be related to two other problems – 1) brittleness and 2) explanatory shift.

1) Brittleness is related to the limitations of fixed size, but goes a bit beyond it. Consider an ‘explanation’ of the dynamic behavior of a catapulted cat, in the form of a giant look-up table. That is, for every atom or discretized volume of the cat’s body, and for every discretized time dt, you can look up in the giant book and find out the x,y,z coordinates of the mean of that volume. Is this book a good explanation of the process? In my mind, it isn’t, though I hesitate to use the term ‘most people’. This is not directly related to the fact that the book doesn’t generalize the flight of the cat beyond some number of input steps. It’s ‘brittle’ for any given change in the problem. It also doesn’t generalize to other cats, or different initial conditions, or the amount of discretization. One could always theoretically make the look-up table bigger and bigger, account for all cats and all initial conditions and all discretizations and all time flights up to the span of a human life-time, but that doesn’t really matter: it’s still not an actual explanation of the process. If, on the other hand, we had some sort of short algorithm based on Newtonian dynamics, then even if it was somewhat wrong and inexact we would still endorse it as a better explanation of the process. One could then argue whether an implementation of the algorithm on a computer was ‘really’ flinging a cat or not, but that’s a different issue. But my point for the moment was only to argue that the rejection of the lookup table as ‘unintelligent’ is more general than the Turing test and the task size, and has to do with how lame look-up tables are in general.

2) Explanatory shift is another part of the reason look-up tables are a lame explanation. I said before that we can make the lookup table account for all cats and all volumes of said cats, but I didn’t actually explain how. For that matter, I didn’t explain how the table for the single cat came into existence. It is tantamount to saying you have an algorithm of the following sort for calculating where the cat is:

1. Input volume and time of interest, dv and dt
2. Get the (x,y,z) coordinates of the volume at that time-step using MAGIC(dv, dt)
3. Return (x,y,z).

One can see where the shift is happening. One could argue that there is a procedure, TABLE, which isn’t magic, it does in fact exist, and the sub-algorithm there is to sort through it. But then how do we augment such a TABLE to take into account more cats, different volumes, etc, to deal with the problem of brittleness? I mentioned earlier we could ‘always do this’, but how? It seems like here we are forced to use something like MAGIC-AUGMENT(TABLE). But if we’re not satisfied with MAGIC-AUGMENT, we should also not be satisfied with TABLE, which presumably came into being through a similar process.

Of course, things like MAGIC (or Oracles) are useful in computer science when we’re asking questions that take them into account and ask what’s possible given their existence. They are not interesting as an explanation in and of themselves, of any process, not just intelligence.
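The contrast between TABLE and a short Newtonian rule can be sketched in a few lines. This is a deliberately simplified illustration, not the cat example itself: it assumes a single point mass launched from the origin in two dimensions with no air resistance, and the names `position`, `TABLE`, and `lookup` are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant, no air resistance)

def position(v0x, v0y, t):
    """Newtonian rule: position of a point mass launched from the origin."""
    return (v0x * t, v0y * t - 0.5 * G * t * t)

# A finite table for one launch (v0x=3, v0y=4) at time steps 0.0, 0.1, ..., 0.9.
# Note that the table itself had to be filled in by running the rule.
TABLE = {step: position(3.0, 4.0, step / 10) for step in range(10)}

def lookup(step):
    """Table 'explanation': answers only the questions it was built for."""
    return TABLE.get(step)  # None for anything outside the table

print(lookup(5))                 # covered by the table
print(lookup(25))                # later time: the table has no entry
print(position(3.0, 4.0, 2.5))   # the rule still answers
print(position(1.0, 9.0, 0.5))   # so does a different launch
```

Both halves of the argument show up here: the table is brittle (a new time or a new launch falls outside it), and the only way this sketch could populate or extend the table was to call the rule, which is the explanatory shift.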

• It’s an interesting point that the data in the lookup table must have come from somewhere. We imagine a programmer somewhere writing down responses to all possible conversations and putting them into a computer. The program itself then does no actual work, which is perhaps why we consider it no more intelligent than the door to the Chinese room. (Alternatively, of course, it’s lookup tables all the way down.)

6. D says:

It seems like in terms of defining intelligence, there are several paths forward depending on which philosophical and metaphysical assumptions we start with.

Intelligence as something non-computable:
For one thing, most of the discussions we’ve had in class have talked about modeling the human brain as a mechanistic system. We discussed an objection to Penrose’s argument, namely that one could–with appropriately advanced brain-scan technology–in principle apply his paradox to a human brain as well as a computer; there was brief mention that, since a human brain has a finite volume, it can be modeled with a finite number of bits; and the idea of human behavior as mechanistic comes up regularly in discussion (sometimes by implication, e.g. Alessandro’s comment above about evolution as a computational system that produces intelligence). However, one can simply deny that a human is a mechanistic system in the same way that a computer is. This brings in the philosophical ideas of free will, mind/body dualism, and so forth, but one reasonable course forward is to assert that there are aspects to the human psyche that simply cannot be modeled computationally. In this case we’re back to Searle; there are “causal powers” that are separate and non-computational that we can define as intelligence; humans have them, computers do not.

Intelligence as low complexity:
If one doesn’t accept this assertion, another path is to state that human brains are in principle simulatable by a Turing machine, but the complexity of doing such a simulation is large. If we define “intelligence” as the ability to think/compute “efficiently” (for some suitably vague definition of “efficiency”), then a human brain would qualify whereas a lookup table wouldn’t. Intuitively, the idea is one of equating intelligence with understanding–a list of answers is not understanding, but knowing how to solve problems efficiently encapsulates information about the structure of the problem, which can be seen as “understanding” the underlying issues. (The appropriateness of the term “understanding” here, though, is another point to debate!)
Within the low-complexity intelligence idea, there are a few positions on AI, depending on the complexity assumptions of the human brain and intelligence in general:
-No-AI: One could assert that there is no efficient computer algorithm that answers a suitably wide range of questions to be deemed “intelligent”; by necessity, this means that any algorithm that tries to answer questions by simulating the actions of a human brain must be inefficient. Note that this is different from the non-computational intelligence idea since it allows that in principle a human brain can be simulated by a Turing machine; it simply denies that any such simulation would be efficient.
-Simulated-AI: If simulating a human brain is efficient, then a computer simulating a human brain would be deemed “intelligent”.
-Truly Artificial Intelligence: A computer could have efficient algorithms to answer a wide enough range of responses to be deemed “intelligent” without simulating a human brain.

Intelligence as computability:
One could also reject the efficiency argument, and argue that getting the right answers is in principle all that matters. Any algorithm or process, no matter how it works, that gives the right input/output behavior (ignoring efficiency constraints) qualifies as “intelligent”. Then it just depends on where we set the bar:
-Common-AI: The right answers are easy enough to find that even a (sufficiently large) lookup table is “intelligent”. There might be more efficient computer algorithms that are also intelligent.
-No-intelligence: Everything is computable, even human brains. Everything is mechanistic. It’s pointless to speak of “intelligence” in such a world, since every human, every computer, every atom, every quark and lepton is simply following the inexorable laws of physics. No free will. No intelligence. Just particles. (Have a nice day!)

I don’t necessarily think this is a complete list–for example, I could see differences in the method of computation being used as a distinguisher (e.g. a simple lookup table is not intelligent, but a more complicated algorithm that “understands general concepts” might be–but the distinction is made not due to the computational complexity but rather some other, non-complexity-based analysis of the algorithm operation)–but it’s a start.

• Scott says:

You write:

“However, one can simply deny that a human is a mechanistic system in the same way that a computer is … one reasonable course forward is to simply assert that there are aspects to the human psyche that simply cannot be modeled computationally. In this case we’re back to Searle; there are ‘causal powers’ that are separate and non-computational that we can define as intelligence; humans have them, computers do not.”

As I said in class, it seems to me that one CAN’T just assert the above, sit back content, and be done with it! Even assuming the above beliefs were correct, still, doesn’t the burden fall on you to explain what it is about the brain that prevents us from modelling it computationally? Is it “merely” the qualia (the “redness of red,” etc.)? If so, then which physical systems have qualia and which ones don’t? How do you know? Or is it also an observable feature of human behavior? If so, which feature? How do you know that that feature can’t be modeled computationally?

Of course, you could reasonably say that these are profound open problems and you don’t know the answers. But what I can’t understand is walking up to the brink of these questions and then failing to even ask them—or quickly dismissing them when they’re asked. Ultimately, this is the thing I can’t stand about Searle’s repetition of the unexplained magic phrase “causal powers”: not the actual content of the belief, but the stunningly-incurious attitude about what such powers could consist of.

• D says:

Hi Scott,

I agree in general with what you say here. I don’t imply that my original post was a complete argument–it was an outline of possible philosophical approaches to take. So I agree that you can’t just assert this and be done–it is a point that needs defending.

However, I disagree that the philosophical burden falls more on this position than any other. Any position one takes on this matter involves bold philosophical claims, whether explicit or implicit. One has to justify the assertion of “causal powers”, to be sure–but I find it no less bold a claim that a Turing machine can accurately model human intelligence. Says who? Even granting indistinguishability of input/output behavior (which itself is a large assertion, though a conceivable one), there’s a lot of implicit assumptions in either asserting that external behavior alone–or external behavior with a “feasible” complexity–is “intelligence.”

And we do then run into the territory of profound open problems where I don’t know the answers (and philosophers have been debating the answers for centuries). The qualia–and sentience in general–is one aspect of this. Mind-body dualism, somewhat relatedly, is another. In general the problem is not new, nor unique to computers (see e.g. the question of “philosophical zombies”–hypothetical entities that are externally indistinguishable from humans, but do not have subjective experiences). You can fault Searle for not carrying on his argument into the philosophical realm, but I’d argue that given my own subjective experience, the assertion that a Turing machine that efficiently answers questions the way I would is “intelligent” is incorporating large philosophical assumptions of its own, which require no less justification.

Tangentially, I’d also point out one reason why it might be difficult to express non-Turing-computable aspects of consciousness precisely. (I can say “a computer can’t experience qualia” or “a computer can’t have a sense of self or ‘ego’” and you would ask me for more precise definitions of “qualia” and “ego.” I can attempt to provide more details, but there would be imprecision there for you to target, etc.) It’s possible that some of this is inherent, for the simple reason that our most precise language is that of mathematics, and computers can perform math quite well! Thus, trying to describe these sorts of philosophical phenomena would inevitably run into “vagueness” (particularly for people steeped in a mathematical background) if the phenomena are themselves fundamentally inexpressible in tight, logical language due to their nature.

7. Cuellar says:

Both Searle and Penrose are sure that human minds have ‘special powers’ impossible to reproduce by algorithms. The first asserts that the brain has some mysterious ‘causal powers’ necessary to produce intentionality. The second allows mathematicians to have some ‘special access’ to Platonic reality. Such ‘powers’ are not something either of them can easily explain, although they both seem to find it embarrassing to doubt their existence. Searle plainly states that it is obvious that humans have them, while Penrose argues that the mathematical methods (by which I guess he means the behaviour of mathematicians) prove the existence of a ‘direct path to truth’ (by necessity). There have been many replies to each of those claims, for example pointing out that mathematicians are often wrong and thus probably don’t have such a connexion with the truth. I not only think that both points are just inventions of the authors, I also claim that there is no reason to believe any such power exists.

Searle is kind enough to acknowledge that programs can pass the Turing test and that they will probably do it in the future. Moreover, we know that a look up table can be built to pass a Turing test of a given length. So if a computer can behave exactly like a human, wouldn’t it then also be obvious that such a computer has ‘causal powers’? If a program could simulate the behaviour of a mathematician (by passing the corresponding Turing test), wouldn’t that prove that the program also has a connection with Platonic truth? Certainly not. But any evidence proving that humans have special powers will also prove that a computer, behaving like a human, has the same powers. Thus machines and humans can’t be distinguished in this fashion.

Obviously, Penrose would not be satisfied with this answer. His argument brings forward some questions whose answer the computer cannot truly understand, namely Gödel’s proposition. But notice that at this point we don’t care about understanding; we only worry about the machine behaving like a mathematician, a task that can, in principle, be carried out by look-up tables.

All this is to say that there is no necessity or evidence for such powers to be granted to humans (and not to computers). But of course, if we decide to imagine a magical and mysterious sapien-power and define thinking as “that which has the sapien-power”, then by all means, only humans can think.

• D.R.C. says:

As Scott mentioned previously in class, Searle did consider the case of a humaniform robot (a robot that looks human, but still has some chip for its brain). He basically stated that until it was shown to be inhuman, he would assume that it is human and thus could have “causal powers”. However, once revealed, Searle would simply say that he made a mistake in thinking that it has such powers and that it could simply pass the Turing Test, nothing more.

This also leads to the question of exactly what separates things that have these “causal powers” from things that don’t. It is not too hard to imagine a biological computer (on which work is currently being done), or even a computer made entirely out of neurons, but that follows some set of rules isomorphic to a classical Turing Machine. Let’s say that we shape this neural computer as a brain, and put it in a human clone. This would have exactly the same type of processing as the humaniform robot, but would be entirely organic and basically impossible to differentiate from a “regular” human with causal powers. The main difference would be in how the brain is structured and how the neurons are connected, but if we knew how the neurons are connected, then we would have the possibility of building a Turing Machine using those connections instead of the ones that were used. If Penrose is correct in his belief that quantum gravity plays a part in consciousness, why couldn’t a Turing Machine (or neural Turing Machine) harness the effects of quantum gravity?

8. Katrina LaCurts says:

At the end of class, we talked about the relationship between the Halting Problem and Gödel’s Incompleteness Theorem. In particular, we discussed two machines:

M, which, when given a program A, outputs either “A halts”, “A runs forever”, or neither (and does so soundly); and P, which outputs “halts” if M(P) outputs “runs forever”, outputs “runs forever” if M(P) outputs “halts”, and outputs neither otherwise.

P clearly runs forever, and we humans are able to reach that conclusion. M, however, cannot. We could create a machine M’ that could determine that P runs forever, but then of course there would be a P’ that M’ couldn’t figure out. We also discussed doing the same thing for humans (giving someone a complete description of his or her own brain, etc.), in the context of using this type of argument to separate machines from humans.
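
To make the construction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `make_p` and `is_sound_on_p` are hypothetical, P's behaviour is returned as a label instead of by actually looping, and I assume that P runs forever whenever M gives no answer (which is how I read the "P clearly runs forever" step above).

```python
from typing import Callable, Optional

# A verdict is "halts", "runs forever", or None (M declines to answer).
Verdict = Optional[str]


def make_p(m: Callable[[object], Verdict]) -> Callable[[], str]:
    """Build the diagonal program P from a candidate analyser M.

    P's real behaviour is reported as a label rather than by truly
    looping, so that this sketch stays runnable.
    """
    def p() -> str:
        verdict = m(p)              # P asks M about P itself
        if verdict == "runs forever":
            return "halts"          # do the opposite of what M claims
        # If M says "halts", P loops. If M is silent, P also loops
        # (an assumption of this sketch).
        return "runs forever"
    return p


def is_sound_on_p(m: Callable[[object], Verdict]) -> bool:
    """M is sound on P if it is silent or its verdict matches P's behaviour."""
    p = make_p(m)
    verdict = m(p)
    return verdict is None or verdict == p()


# Three toy candidates for M: two that commit to an answer, one that is silent.
candidates = {
    "always says halts": lambda prog: "halts",
    "always says runs forever": lambda prog: "runs forever",
    "always silent": lambda prog: None,
}

for name, m in candidates.items():
    print(name, "-> sound on P:", is_sound_on_p(m))
```

Any candidate that commits to a verdict about P is wrong about P, so a sound M has to stay silent, and in that case P runs forever. That last step is the inference we can make from outside and M cannot.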

One issue that I have a hard time with, though, is deciding what would happen if you gave a human a description of his or her own brain (even if we accept the premise that a person could actually handle an input of this size). When the argument relates to machines, we think of there being two separate steps: the machine reads the input, and then processes it. The machine that processes the input is the same machine that read the input; nothing about its state changed. With humans, however, I wonder if the act of receiving the description of their brain would change their “state” in a way that would cause the person processing the input to then be different than the person that read it. I feel that this problem could occur even if the input was processed as it was being read (streaming input, e.g.).

Is there a way to encode someone’s brain such that this wouldn’t happen, or some other way around the problem?

• D.R.C. says:

I feel that if you do give a human a description of their state at a given time, it will have changed by the time that he reads it (I’m going to ignore streaming input for now), both due to the waiting and the reading of such a description. Assuming that you are able to get a perfect copy/description (assuming quantum effects do not play a significant role in how the brain works, since the No-Cloning Theorem means that you could never get a perfect copy), the description would have the capability to become the present you, since it was you at a time in the past. I don’t know whether this would be a change in the actual description of the brain, or something like changing the input value of a function (the input value being the information of the world around you that is being processed), where the underlying information is the same.

9. amosw says:

The debate about whether a giant lookup table is intelligent suffers
from a curse of taxonomy. That is, whenever humans use taxonomic categories
that are not unambiguously measurable, the inevitable result is years, or
millennia, of mostly wasteful debate.

We all think we know what a classical solid is. Water ice is a solid;
drinking water is a liquid, etc. But when we actually sit down and
write a molecular dynamics simulation, or when we study the literature
on non-crystalline solids, we are forced to think much harder about
what unambiguous, measurable property solids are said to possess.
In particular, when we heat a solid to near its critical temperature and
study its microscopic degrees of freedom, we see that the definition of
a classical solid is not so clear anymore. But it seems pretty easy
for humans to agree on a taxonomy if we stay well away from the phase
transition: if something doesn’t flow in measurable amounts of time, we’ll
call it a solid, if it does flow we’ll call it a liquid. These are far
from perfect definitions, but they are mostly unambiguous and measurable
with our own eyes. We could make these definitions more precise by finding
a measurable microscopic property, perhaps an X-ray scattering pattern, or
an arbitrary cut-off for the fraction of bonded atoms, that
is exclusive to the systems which we are calling solids, etc. In this way
we can recover our macroscopic property from a microscopic measurement.
But implicit in this approach is that we are agreeing not to worry too
much about the taxonomy of systems near their phase transition: as
measuring their microscopic degrees of freedom in that regime doesn’t help
us much with macroscopic taxonomy.

I’d like to draw an analogy between the solid/liquid phase change and
the improvements in artificial intelligence we are debating in this class.
Up until this decade, we have had only machines that are so
obviously unintelligent that there is little debate.
Someday, I claim, we will have machines that are so obviously intelligent,
in the same way that drinking water is so obviously liquid, that there will
also be little debate. And at that future time someone will no doubt
retroactively “fit” a microscopic measurement shared by our brains and these
machines, call it the formal definition of intelligence, and we will congratulate
ourselves. Today, however, we are creeping up on a phase transition, and
our attempts to measure small details of these systems and compare them to
small details of our brains are bound to result in nothing but
lots of mostly wasteful debate.

• bobthebayesian says:

I agree. In class today when dealing with the problem of logical omniscience and whether we ‘know’ the largest prime Q in a different way than we ‘know’ P we make a similar taxonomic error. If there is no clean-cut semantic definition of ‘to know’ then we’re just chasing semantics. We can define “knowing” in terms of common human experience, but then that definition of knowing will suffer from microscopic examination. If we want an unambiguous way of measuring intelligence or “knowledge”, we have to appeal to formal systems and constructs like Kolmogorov complexity, and this seems to cut many of the semantic debates off at the knees. Many a philosophical debate has wastefully focused on things that could be trivially solved by arbitrarily defining some threshold, and I’m not convinced that pursuing the threshold-avoiding cases has yielded any worthwhile insight.

• Scott says:

OK then, what threshold do you propose? And a threshold of what quantity? How do you propose to define knowing in terms of Kolmogorov complexity? The rules of this game are that it’s no fair to call everyone else a bunch of time-wasting dunces who just need a formal definition to set them straight, unless you have a relevant formal definition to offer yourself—which, of course, can then be examined and criticized.

• bobthebayesian says:

It’s not my intention to call anyone a dunce. I’m not sure why that came across in my post, but my apologies if that is what I communicated. Most members of the class are far more experienced than me at every relevant academic topic, but it could still be the case that we debate matters of semantics and go around in circles without everyone being a dunce. My comment applies equally well to myself since I am in the class. If I had a great way to resolve this problem and I refrained from saying it, then it might as well be my own fault. I do not think that I have any more insightful definition of knowledge or intelligence than anyone else.

I’m not proposing to define knowing in terms of Kolmogorov complexity. I’m not sure I even believe it’s a good idea to try to define a global definition of ‘knowing’ at all. If we want a definition that’s fitted to match some experiential data points, like having the prime P be known and Q be unknown, I’m not sure that it’s possible even in principle to construct one that will generalize properly to the rest of experience. It could be like trying to debate whether or not ice cream tastes good (i.e. the problem of knowing seems to really involve qualia if we appeal to the human experience of knowing). On the other hand, if we just say “knowing a number means being able to compute its digits” then we can immediately dismiss the problem. I’m just advocating that we take some care to set up the problem specifically and say which thing we’re trying to do / discuss before we appeal to human experience.

Intelligence suffers the same problem. My preferred definition for intelligence is Yudkowsky’s idea of optimization power, but of course this has problems too. One thing I think we shouldn’t do, however, is mix the two ideas (retrofitted human experiential definition vs. agreed upon formal system definition). In the chess diagram from today, for example, I do not know of a single human chess player who would have moved the king around to force a draw without first simulating the move to capture the rook and simulating the sequence of moves to see if the pawn could reach a queening square. For a tournament player, simulating that move list mentally would take less than a minute of thought. But it’s unfair to claim a chess player would make a move purely based on a visual heuristic. The first thing a chess coach teaches you is to never use visual heuristics, except possibly for the purpose of managing your time so you don’t exhaust yourself thinking through detailed move lists when it is probably unnecessary. Based on my experience as a player, I think we very unfairly attributed too much ‘abstract thinking’ to humans in chess. What we encode with the words ‘pawn chain strategy’ is really more like 45+ minutes of deeply intense move-list scrutinizing by good players.

• > “knowing a number means being able to compute its digits”

I don’t think this dismisses the problem at all. Both “2^43112609 – 1” and “the smallest prime larger than P” can be evaluated using a straightforward algorithm to produce a string of digits. Neither is itself a string of digits. It’s still a matter of degree, and of picking a threshold.

Gettier’s famous paper didn’t prove the inconsistency of justified true belief. It merely claimed that JTB was not what we intuitively understand when we think of the concept of knowledge. A lot of philosophical questions have us running around in circles because we want to take a messy fuzzy human concept and make it logical and rigorous. And that’s really really hard. Every time you think you have a solution, someone shows up with a contrary data point.

• bobthebayesian says:

What is the algorithm that gives the digits of the smallest prime larger than P for a generic prime P? If we define ‘knowing’ in terms of being able to use this to generate digits, I still think it dissolves the question. An interesting alternate view: we could probably say a lot of worthwhile things about pi just with the English description that it is the ratio of a circle’s circumference to its diameter, not necessarily needing to compute the digits. The whole “define knowing a number as computing its digits” was just meant as an example.

• The algorithm for calculating P = 2^43112609 – 1 is to take 2, raise it to the 43112609th power somehow (square and multiply perhaps), and subtract one. The algorithm for calculating Q = the smallest prime larger than P is to start from P + 1, divide each candidate N by the primes from 2 up to sqrt(N), and stop at the first N that isn’t a multiple of any of the primes tested. Both of these are perfectly valid, finite computations that can be described concisely, and the only difference I can see is one of degree. But we humans think of P as better defined. Do we draw a threshold somewhere between? Or do we have some other metric than “ability to calculate digits”, under which P and Q are actually substantially different? I can’t think of what that might be.
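
Both procedures fit in a few lines of Python. This is only a sketch: the function names are my own, the trial division tries every integer up to sqrt(N) instead of only the primes (a simplification that gives the same answer), and it uses tiny exponents, since running `next_prime` on the real P is hopeless in practice.

```python
from math import isqrt


def mersenne(exponent: int) -> int:
    """Return 2^exponent - 1, computed exactly with Python big integers."""
    return (1 << exponent) - 1


def is_prime(n: int) -> bool:
    """Trial division by every integer from 2 up to floor(sqrt(n))."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def next_prime(p: int) -> int:
    """Return the smallest prime strictly larger than p."""
    candidate = p + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


p = mersenne(5)        # 2^5 - 1 = 31
q = next_prime(p)      # smallest prime above 31
print(p, q)            # prints: 31 37
```

`mersenne(43112609)` finishes almost instantly, while `next_prime` applied to the result would not finish in any realistic amount of time. Both are equally short to write down, which is exactly the "difference of degree" at issue.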

• bobthebayesian says:

@Harry: This is the point I was trying to make, but you have made it far more concretely. Some thoughts: If P is large enough, then even speaking about running your algorithm stops corresponding to reality, at least unless you can leverage more resources than there are in the observable universe. This would be true even for an efficient algorithm, so it seems like the human ability to know has to be related to resources in more ways than just efficiency. We have to be able to realistically compute something, not just describe an efficient way to compute it. N^2 running time is too inefficient if N is large enough, but we want to claim we “know” something if we can describe how to compute in N^2 steps. My claim is that this is just an appeal to human experiential data points.

The libsvm package for computing SVMs runs in something like O(n_f * n_d^2) where n_f is the number of features and n_d is the number of data points. That’s a pretty efficient algorithm, but do we ‘know’ SVMs for classification problems involving 2^(10^20) features? I would say that this same algorithm lets us know things about cases where n_f and n_d are sufficiently small. Exponentially slow algorithms also allow us to know things for small enough values of n.

So unless we draw a line in the sand and say what we mean by ‘knowing’, it seems hopeless that the consideration of algorithms is going to drop a definition in our lap. I agree with you that I don’t know what the “right” line in the sand is. That problem, to me, is one of qualia. But once you choose a line in the sand, then it’s a problem of just measuring your resources vs. the imposed resource constraints of the definition.

• Hagi says:

I’m not convinced that humans would regard machines as “intelligent”, even if they managed to make ones that are smarter than themselves. The topic of interest, the giant look-up table, is the machine that is so obviously intelligent (the solid phase). By definition, it is supposed to interact with you exactly as an intelligent machine would. However, we still argue about its intelligence, because we understand the mechanism of how this hypothetical machine would work. In a behavioral context, any one of us would regard the look-up table as intelligent if we were to interact with one. Humans have accepted much less as intelligence in the past. Even in the 1970s, it was common for people with less-than-average education in rural areas to keep trying to communicate with televisions for weeks after their first encounter. However, most people have now convinced themselves that they know how TVs work, so they don’t assign divine powers to them. Today we do not understand how the brain might work; so some will call it soul, some will call it quantum mechanics. I agree with you that when we can computationally model the brain and find microscopic or macroscopic parameters to objectively classify it, it will be obvious that it is not that much more special than a hypothetical giant look-up table, just more efficient in using resources.

10. As we’ve discussed, since the human brain produces consciousness and intelligence, it seems as though there *is* a contained system in the universe which is intelligent and executes in a reasonable amount of time, and it seems as though if we could only break it down to small enough parts, we could model it (of course our model might be enormous and take forever to run, but it would still exist). The only way to dismiss this claim that I can see is to insist there is something “special” about a human brain, some sort of magic happening which can’t be modeled by deterministic steps. Penrose seems to be heading towards this argument, but I find it highly unsatisfying and haven’t seen a good line of reasoning in this direction.

What makes people uncomfortable about the lookup table is not just the size of the lookup table or how long it would take to create, but also that there isn’t any interesting abstraction present. If we choose to dismiss a purely empirical definition of intelligence (and I think we should) we should try to define these “causal powers” which make a system intelligent. I think harry potter and bobthebayesian have the right idea — it seems like the key has to do with the ability of the system to learn and draw conclusions based on limited input. Human beings do use lookup tables in our brains for some tasks (like recall), but we also quickly develop models which help us do things like recognize people in photos in an instant. These models sometimes produce the wrong answer! But they apply very, very generally, so perhaps the right idea to use when defining causal powers is to require some level of abstraction in the algorithm, and also consider how well the algorithm can learn and apply what is learned to many different cases, but with a bounded error rate.

11. nemion says:

This small reaction will be more of a critique of the existing critiques than a critique of Searle and Penrose themselves. Both Penrose and Searle seem to agree about the impossibility of a machine replicating human intelligence. Searle’s argument is based on the idea that humans have causal powers whereas computers do not. Penrose seems to take a very obstinate view against the possibility of AI, while Searle does not disagree with the possibility of a computer emulating a human. He objects to the idea that a computational object can be sentient. There are those who would argue that if a program can simulate the behavior of a mathematician or the behavior of a human, then it could also have the same powers that Searle claims humans possess.

I claim that this criticism is not valid, nor applicable to Searle’s argument. Searle disagrees about the existence of a test similar to Turing’s that could trick us into thinking an agent is a human, or has causal powers, if it doesn’t. Searle’s view is not about the empirical distinction of human and machine intelligence but about the fundamental notion of human intelligence as having intentionality. While empirically a machine and a human might be indistinguishable by any kind of Turing test, this indistinguishability is an empirical piece of knowledge and not an a priori concept. In particular, we have intentions, and our intelligence, from an interior perspective (we have something called consciousness because we can feel it, we can observe it within ourselves), is affected by these.

This means that we cannot say that because a machine and a human are indistinguishable in a Turing test, the machine must also have causal powers. This criticism of Penrose, and in particular of Searle, is problematic because it attacks Searle’s argument with an assumption that Searle virulently rejects: the reliability of empirical knowledge.

12. What do we really mean by a “lookup table”? Do we really just mean a (somewhat) direct mapping x->f(x) where x is a problem and f(x) is the solution to that problem? Or can there be some more interesting computation involved along the way? Maybe the “lookup table” provides the answer through a series of lookups: x->x’->x”->f(x”) where f(x”) could have multiple values for a given x”, and one is selected either randomly, or based on a probability defined by previous “experiences”. Can we automate the “lookup table” to adapt the probability of selecting a value for f(x”) based on experiences it has? Say, for example, can we have our “lookup table” perceive a number of Turing tests and their results and optimize those probabilities to have similar results? Then we can ask, what would make such a table, whether practical to build or not, different from our brain?
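
As a rough illustration of that last variant, here is a small Python sketch of a table whose entries carry weights that shift with feedback. Everything in it is hypothetical: the class name `AdaptiveLookupTable`, the multiplicative update rule, and the reward scale are assumptions made for the example, not a claim about how such a table would have to work. It also leaves out the chained lookups x->x’->x”.

```python
import random


class AdaptiveLookupTable:
    """A lookup table that picks among stored answers by adjustable weight."""

    def __init__(self, seed=None):
        self._weights = {}                 # key -> {response: weight}
        self._rng = random.Random(seed)

    def add(self, key, response, weight=1.0):
        """Store a possible response for a key with an initial weight."""
        self._weights.setdefault(key, {})[response] = weight

    def weight(self, key, response):
        """Return the current weight of a stored response."""
        return self._weights[key][response]

    def lookup(self, key):
        """Select one stored response at random, in proportion to its weight."""
        options = self._weights[key]
        responses = list(options)
        weights = [options[r] for r in responses]
        return self._rng.choices(responses, weights=weights)[0]

    def reinforce(self, key, response, reward):
        """Scale a response's weight by (1 + reward), keeping it positive."""
        updated = self._weights[key][response] * (1.0 + reward)
        self._weights[key][response] = max(updated, 1e-9)


table = AdaptiveLookupTable(seed=0)
table.add("hello", "hi there")
table.add("hello", "go away")

# Pretend a judge rewarded the friendly answer in ten past conversations.
for _ in range(10):
    table.reinforce("hello", "hi there", 1.0)

print(table.weight("hello", "hi there"))   # prints: 1024.0
print(table.lookup("hello"))
```

After the ten updates the friendly answer is about a thousand times more likely to be selected than the other one, yet every answer the table can ever give was still stored in advance.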

One thing I haven’t mentioned is creativity. There are many ways to define creativity, but one way is to think about it as a tension between a need and a constraint. If you leave an ape in a cage with a banana hanging out of reach, and a box to the side, you should expect the ape to use the box to get to the banana (assuming the box is big enough and that the ape can stand on it… there’s a longer version of this).

I am not sure if the “lookup table” model can encompass this, because it defines answers to questions or queries but not a formal way to define a “need” or a “constraint”. What we can do, however, is claim that the “lookup table” can solve the problem “if you are an ape in a cage with a banana out of reach hanging from the ceiling, and a box in the corner. You want to reach the banana, what do you do to accomplish that?” (along with defining measurements of the box and so forth). At this stage we might realize we need to expand a little on our thought of the “lookup table”. We can’t expect the “lookup table” to be creative if we don’t define certain rules and constraints beyond formal mathematical systems. Can we? That is, assuming we look for creativity as a requisite of intelligence.

13. SinchanB says:

I would agree that one of the reasons people are reluctant to think of a lookup table as intelligent is because it cannot handle inputs beyond some fixed size. Human responses are ‘infinitely generalizable’ in the sense that given a certain piece of information, a human can create responses which discuss this piece of information relative to other information that he or she has already learned or will learn in the future. In addition, the responses that are created can take on any form that is allowed by the rules of the language that is being used in the discussion. The human can even expand the language by adding new words or phrases or even make a new language altogether to express some concept. This essentially allows humans to come up with an infinite set of responses to anything. This is where a giant lookup table fails, because indeed it cannot handle inputs beyond some fixed size. We can reconcile this idea with the fact that humans die and their conversations end after bounded amounts of time by realizing that even though humans do not in practice traverse the entire space of possible responses, they do generate responses from this infinite space… essentially ensuring that, with high probability, the human will come up with a response the lookup table cannot [using a variation of the pigeonhole principle].

This brings us to the question of what it means to “pass a Turing test of length n” for arbitrary n. Is the result just a probability of a participant being a human, a probability that is calculated by seeing how the participant’s responses are distributed across the infinite space of potential human responses? An interesting development in the industry that is related to this question is the Siri personal assistant technology that is coming out with the iPhone 4S. Something that is very unusual about Siri is its ability to parse and respond to questions about the same topic even if the questions are extraordinarily varied. For example, if someone is asking a question about the weather, they can ask everything from “what is the weather like today?” to “do i need a jacket?” I find this technology to be relevant to the Turing test because it exhibits signs of the generalizability that shoots down the giant lookup table argument. By being able to exhibit such flexibility as it is discussing a specific topic, it shows that it is able to generalize the ideas involved in the topic to a certain extent and is able to accept queries of a wide variety. One thing it still fails to do, I believe, is enable the correlation of ideas from different spaces or fields. This issue of being able to correlate ideas from different fields – i.e. how do I know that some idea in mechanical engineering is very similar to something in the field of biological engineering – is something that I have encountered in a project of mine. I was attempting to make an idea graph… a mathematical graph whose nodes are ideas of the form “one can do x with y” and whose edges are basically relationships between ideas. During this attempt I ran into this metaphor problem: I wanted scientists from different fields to be able to see the similarities in their ideas and approaches, and I needed to generalize ideas to an abstract state where I could compare them even though they involved specifics from different fields.
The giant lookup table would fail at this if all ideas were not translated into the language of other fields. Siri is not able to generalize to this level as far as I know. It seems that only humans are now able to generalize to the level where you can see the similarities and differences between ideas from different fields of knowledge.

14. kasittig says:

I would agree that much of our reluctance to view a lookup table as “intelligent” is because it can’t reason about its environment – and therefore, its knowledge is inherently limited. Even if the lookup table is infinite, that doesn’t necessarily guarantee that it will hold answers to all of the potential questions that it may be asked, and it seems that one of the hallmarks of intelligence is the ability to come up with a reasonable answer, based purely on logical reasoning, to a question on a subject that you, in fact, know very little about.

I think a potentially more interesting example is a lookup table that is essentially an encyclopedia of human knowledge – one where everything that was ever discovered or reasoned is stored. Regardless of how the knowledge gets into the table, it is interesting to consider whether we’d think of this lookup table as intelligent, especially if the table aspect was abstracted away from those interacting with it. This table would be able to answer any question, and therefore it would not appear unintelligent through lack of knowledge, as I asserted may be the case in the previous example. If it is able to correctly answer any question, and individuals didn’t know that it was simply a lookup table, I believe that people would think that it was intelligent – it would, after all, come up with the correct answers to questions.

Perhaps the core difference in the lookup table and human example is not necessarily one of inductive vs. declarative reasoning, but simply a matter of perspective. I believe that many people are uncomfortable with the idea that human intelligence isn’t necessarily unique and can possibly be mimicked. Why was this such an impassioned debate in the first place? What is the nature of intelligence to begin with, and how can we measure it? I think that if we had come up with a solid answer to these questions and then compared the lookup table and the human based on our answers, we would have discovered some interesting logical inconsistencies in our own thinking.