## Class #1: Introduction and the Church-Turing Thesis

Why should scientists and mathematicians spend time thinking about philosophy?

Why should theoretical computer science be of any particular philosophical interest?

What is the Church-Turing Thesis: a mathematical definition?  a thesis about human psychology or concepts?  an empirical claim about the laws of physics?

If (as current physics seems to suggest) there’s a finite upper bound on the amount of computation that can ever be done within the observable universe, then how should we even make sense of the Church-Turing Thesis—which talks about the computability or non-computability of infinite sets (and treats every finite set as computable)?


### 61 Responses to Class #1: Introduction and the Church-Turing Thesis

1. Just a test comment

2. ezyang says:

Constructivism is a broad philosophical program in mathematics, encompassing intuitionistic logic and finitism. These distinctions are useful for classifying various attitudes towards the Church-Turing thesis: they range from the classical: “I don’t know what this Turing Machine looks like, but it exists!” to the finitist: “Turing Machines do not truly exist, because they require an infinite tape.” I’d like to briefly elucidate some of the boundaries between these attitudes.

In Sipser’s textbook, the following program is presented as part of an exercise:

> f(x) = 1 if God exists, 0 otherwise

The function is in fact computable (despite its dependence on the seemingly unknowable question of God’s existence), via a simple logical argument:

> Is the proposition “f(x) is computable” true? By the law of the excluded middle, either God exists or God does not exist. If God exists, then f(x) = 1, which is trivially computable. If God does not exist, then f(x) = 0, which is also computable. Therefore, by disjunction elimination, f(x) is computable.

The key step in this argument is the appeal to the “law of excluded middle”: a user of intuitionistic logic would demand that we know for certain whether or not God exists. Without this knowledge, we fail to actually construct a machine that fulfills this specification, making this proposition unprovable under the program of constructivism. Nevertheless, computation theorists gleefully posit the existence of such machines, and in many cases, it doesn’t seem too far a stretch to imagine that they in fact do exist. Take f(x) = 1 if the Goldbach conjecture is true and 0 otherwise: if we believe in the ability of humans to decide this question one way or another, at some later point it would be quite easy to build this machine.
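The nonconstructive flavor of the argument is easy to make vivid in code (a toy illustration of my own, not from Sipser): classically we know that one of the two programs below computes f, but the proof gives us no way to pick which one to compile.

```python
# Exactly one of these two constant-output programs computes Sipser's
# f(x); the excluded-middle argument certifies that f is computable
# without ever identifying which of the two is the right machine.
def f_if_god_exists(x):
    return 1

def f_if_god_does_not_exist(x):
    return 0

# Each candidate is a perfectly good (total, constant-time) program.
# The intuitionist's complaint is that we cannot point to the one
# that actually meets the specification.
candidates = [f_if_god_exists, f_if_god_does_not_exist]
```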

Mathematical finitism is an orthogonal concept, which states that a mathematical object cannot be constructed if it cannot be constructed in a finite number of steps. In one sense, Turing Machines fail to be finitary, as they posit the existence of an infinite tape upon which the Turing Machine works. But in another sense, for any finite amount of time, a Turing Machine can only consume a finite amount of tape, making it a finitary construction: if we run off the tape, we just find some more tape to feed the machine. Only if we permit a Turing Machine to run for an infinite amount of time do we truly need infinite space.
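The “just find some more tape” picture can be made concrete (a minimal sketch under my own naming, not any standard implementation): represent the tape as a dictionary that materializes blank cells only when first touched, so at every finite time only finitely many cells exist.

```python
from collections import defaultdict

class Tape:
    """A conceptually infinite tape that only ever allocates the
    finitely many cells a finite-time computation has touched."""
    def __init__(self, blank="_"):
        self.cells = defaultdict(lambda: blank)  # more tape on demand

    def read(self, position):
        return self.cells[position]

    def write(self, position, symbol):
        self.cells[position] = symbol

tape = Tape()
tape.write(0, "1")
tape.read(10**6)           # wander far off the used region: more tape appears
allocated = len(tape.cells)  # only the cells actually visited exist
```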

3. Scott says:

Thanks, ezyang! Since we’re just getting started, I’d like to say that the above is a perfect example of a reaction essay.

For whatever it’s worth, denying the law of the excluded middle always seemed to me like a cumbersome way of talking — one sharply at odds not only with computation theorists’ usual linguistic conventions (as ezyang correctly points out), but also with ordinary people’s! Why not simply say that either P holds or P doesn’t hold, and we don’t know which one? Doesn’t that convey the same information in a less tendentious way?

• bobthebayesian says:

I don’t consider this to be my reaction essay, but felt it was useful nonetheless (at least for me). I know the basics of quantum mechanics and the math isn’t too hard, but I greatly lack the physical intuition. My intuition is to claim that quantum superposition appears to be at odds with the law of the excluded middle. What if God occupies a state that is a superposition of existing and not existing? It could be the case that P occupies a state somewhere between holding and not holding (at least if we’re making statements about the physical world, which seems to be a relevant subset of all things computable). It might be more informative in some cases to look at things as if the law of the excluded middle did not hold.

• My experience with QM comes from skimming Wikipedia articles, so “Lies! All lies!” is an acceptable response here, but my impression of quantum superposition is this: every particle exists as a separate probability distribution. When two particles interact, each appears to the other to have a definite state in that instant, sampled from its distribution. Since most particles are constantly interacting with each other, on the macroscopic scale there is the semblance of a consensus reality. The answer to Sipser’s exercise is a mathematical trick, but presumably if one were to build such a machine, one would need at some point to measure God’s existence, and that observation must result in a definite answer, true or false. If you try to put in a probability distribution for God’s existence at one end, you’ll only get out the same distribution for the machine’s output, because all you’re doing is putting a separation between the machine/God system and you the observer, where before it was between the machine/you system and God. When you observe the machine’s output, the system collapses into one of two states: God exists or doesn’t exist, and the machine returns 1 or 0 correspondingly.

I am leaving aside the tricky theological question of how to go about observing God’s existence. It may be that replacing that with some more traditional example leaves the argument intact.

• Scott says:

Hi “bobthebayesian,”

There is indeed a field called quantum logic, in which one can talk sensibly about a quantum state |ψ> not having a property P (in the sense that it doesn’t belong to the subspace corresponding to P), but also not having property not(P) (in the sense that it doesn’t belong to the orthogonal complement of that subspace). In more conventional language, |ψ> would be in a superposition of the P and not(P) states.

On the other hand, Bohr famously pointed out that, even to believe in the first place that quantum mechanics is correct or to use it for making predictions, we have to have some prior understanding of classical concepts. (And FWIW, I completely agree with Bohr on this, even though I disagree with him on many other things!) In other words, the only way we ever test QM is by making measurements, and a measurement in the {P,not(P)} basis will indeed reveal either one outcome or the other. For this reason, I think that even in a quantum world, it still makes perfect sense to talk in terms of the Law of the Excluded Middle (recognizing, of course, that the logical negation of “|ψ> satisfies P with probability 1” need not be “|ψ> satisfies P with probability 0”!). In other words, if we want to claim that “rules of logic can be overturned by discoveries in physics,” then it seems to me that we face an infinite regress, in explaining how we can make physical discoveries in the first place, if not by using logic! 🙂
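Scott's point about measurement can be illustrated numerically (a sketch with my own variable names, using the standard Born rule): the state below lies in neither the P subspace nor its orthogonal complement, yet a measurement in the {P, not(P)} basis always yields one of the two definite outcomes, with probabilities summing to 1.

```python
import numpy as np

P_axis     = np.array([1.0, 0.0])               # basis state where P holds
not_P_axis = np.array([0.0, 1.0])               # basis state where P fails
psi        = (P_axis + not_P_axis) / np.sqrt(2)  # superposition of both

# Born rule: probability of each definite measurement outcome
p_P     = abs(np.dot(P_axis, psi)) ** 2
p_not_P = abs(np.dot(not_P_axis, psi)) ** 2

# |psi> belongs to neither subspace, yet the two outcomes are exhaustive:
in_P_subspace     = np.allclose(psi, np.dot(P_axis, psi) * P_axis)
in_not_P_subspace = np.allclose(psi, np.dot(not_P_axis, psi) * not_P_axis)
```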

4. bobthebayesian says:

In reading about Searle’s Chinese room argument, I believe there is an additional relevant manner in which Complexity Theory plays a role. On the Wikipedia page for the Chinese room, there is a decent summary of some of the critiques of Searle’s argument, and one of them is the ‘system’ critique. This essentially says that when you put the man and the system of papers, pencils, algorithmic instructions, file cabinets, etc., into the room together, then the whole system has the property of understanding Chinese, even if the man himself does not. As it is noted there, Searle’s retort to this criticism is as follows: suppose the man just memorized the algorithm and performed all the mental manipulations in his head rather than with any external aids. Searle argues the man still does not “understand” Chinese.

If the man has enough cognitive resources such that he can output responses that are both sensible and (crucially) efficient, then what physical experiment could be conducted on the other side of the door to distinguish this man from some other person who does “understand” Chinese? If no physical experiment exists to decide the difference, then practically speaking, the man does understand Chinese as far as anyone else is concerned.

More interestingly, though, suppose that there were a difference between the efficiency with which the man can compute Chinese responses in the first case (where he has pencil, paper, etc.) and the second case (where it is all mental math). To the people on the other side of the door, these two scenarios would appear very different. If they submitted n Chinese characters, then perhaps the mental-math approach requires O(2^n) running time before a sensible response comes back under the door; maybe with pencil and paper it is only O(n^2). To me, this is exactly what the other papers (Levesque, Shieber, and Parberry) are getting at when they demand (or allow for the possibility) that the algorithm which passes the Turing test be more elegant than just a look-up table. If the process the man uses has an inefficient running time, it may not actually pass the Turing test. Real-life Turing tests, as WPSCACC mentions, are often confined beneath, say, 10^20 bits, or less than 5 minutes of real time, etc. Searle’s premise that you could even construct a room that could hold the appropriate number of file cabinets and paper, or that a human being could manually execute the necessary algorithmic steps, relies on tacit assumptions about complexity.

My knowledge of neuroscience is confined to specific issues about computer vision processing in the mammalian brain. But from what I know about that, it seems reasonable to me that the process for selecting an efficient and sensible response to some given input text data is not much more than a glorified hash table anyway. Suppose person A can speak Chinese and that we would all agree that A understands Chinese. Then, when we give Chinese characters to A by providing visual text data to A’s visual system, how is that different than sliding Chinese characters under the door? And when A’s brain processes the Chinese characters and efficiently selects an appropriate response, how is that different than manually following an algorithm? Occam’s razor suggests to me that the onus for providing neuroscience evidence that the above example fails to be algorithmic should fall on opponents of strong A.I., like Searle. As far as I am concerned though, even if I learned tomorrow that all of my own cognitive processes were “merely” very complicated queries in a tiered look-up table or graphical model, this would not cause me to believe that I did not “understand” things. It almost seems like Searle is reaching to define “understand” to specifically exclude strong A.I., and hence also almost all human cognition (assuming neuroscience continues to support this view).

• bobthebayesian says:

I would also welcome any further explanation of Shieber’s discussion of abductive reasoning. Is there abductive reasoning which does not resort to something like a zero-knowledge proof or statistical claims? And, in this situation, is it appropriate to think of a zero-knowledge proof as equivalent to a typical classification problem from machine learning (e.g., you’re seeking the ‘best’ explanation for a series of passages of the Turing test… what can ‘best’ mean other than maximum likelihood, MAP, or something similarly statistical?)

• Searle’s argument, like (unfortunately) many others in philosophy, is essentially that because our vague notions of an ill-defined concept, in this case understanding, do not recognize a logical extension of that concept, the extension must not actually be analogous. It is trying to define understanding simultaneously through instinct and rigor, to appeal to the idea of understanding as this thing that humans do. But if consistency is at all a virtue, then the room understands Chinese as surely as a native speaker does, and to deny this fact out of a failure to accept the manner in which it does so is mere parochialism. It is rooted in the age-old desire to see humanity as privileged and superior, elevated above the rest of the universe on a golden pedestal that science has systematically demolished. A human is no more or less than a system or a machine, which is to say not that its emergent processes have no worth, but that they are still emergent processes, and no more special than any other.

• bobthebayesian says:

Personally, I completely agree with the point of view you describe and this criticism of Searle’s argument, and this is why Shieber decides only to deal with Block’s criticism which is more systematically founded. But it still seems like we need to invoke complexity to address metaphysical claims that essentially say that consciousness or understanding are intrinsic, special, inherently undefinable aspects of human minds. Complexity forces us to say that if the Chinese room doesn’t “understand”, then neither do actual humans, unless there is a dramatic paradigm shift in neuroscience to suggest that algorithmic complexity is not suitable for modeling brain processes. It’s similar to Sam Harris’s approach to religious arguments. Once religious arguments make claims about testable and observable aspects of reality, then science bears directly on them and to continue to embrace the religious claim == to deny the scientific claim. Complexity plays a similar role with consciousness. To invoke special privileges about what a brain does and to claim it to be outside the realm of science is now directly at odds with complexity analysis of neuroscience-inspired algorithms. It’s no longer a tenable position to claim special privilege about human thought and also attribute credibility to neuroscience. I mean, in the limiting case, a “Turing test” could be sensible responses to a stream of verbal input *plus* sensible fMRI results while giving the sensible responses, etc. Block and Searle both insist that there is some intrinsically internal portion of thought, but my view is that experimentally, there is no such thing as “internal.” Everything about a Turing test transaction can be made external as part of the conditions of the Turing test if one wishes.

5. Currently program authors can write themselves into a hole, so to speak — the most common languages we use enable us to easily write programs that will never halt. A simple construction is:

```python
while True:
    print("I run forever")
```

But let’s consider a restricted language which, by definition, can only express programs for which we can prove whether or not they halt. The above program could be written in this language; it’s easy to prove that it will never halt.

What are the implications of such a restricted language? Are there interesting programs that we will never be able to write, because the language isn’t rich enough? It’s important to note that it’s not the case that all programs written in this language will eventually halt; a program may not halt, but we should be able to analyze it and determine as much. Let’s call the program we’d use to do that M — could M be written in this language?

It’s possible that this language collapses down to something well known or obvious.

• Scott says:

Hi Neha,

You raise interesting questions! However, one crucial issue you left out is the input to the program: can the proofs of halting or non-halting vary with the input, or must they be “uniform” (in some sense) over all inputs?

To avoid this issue, in what follows let me consider programs that don’t take any input; maybe someone else will be interested to generalize the discussion to programs that take input.

First observation: given any fixed program P that halts, there’s a simple proof that P halts—namely, run P for however many steps it takes to halt! So, we could imagine a programming language L that only allowed
(1) programs that halt, and
(2) programs that run forever, for which there’s a proof (say, in ZF set theory) that they run forever.
But then an immediate problem is that it’s undecidable whether a given program belongs to L or not! So compiler construction will be a challenge. 🙂

So, suppose we also demand that there exist an algorithm to decide whether or not a given program belongs to L. Also, let g(n) be the largest number of steps taken by any n-bit-long L-program that eventually halts. Then I claim that there must exist functions f(n) that are “computable” (in the usual Turing sense), but that exceed g(n).

PROOF: Consider a program that, on input n, enumerates all strings of length n; decides which of those strings are valid L-programs; among the ones that ARE valid L-programs, decides which ones halt (by enumerating all proofs of halting and non-halting); by running all the programs that halt, computes g(n); and finally, outputs f(n) := g(n)^2 (or f(n) := 2^g(n), etc.).

We conclude that the language L doesn’t “capture” all the computable functions.

(Though as we mentioned in class, L could certainly capture the primitive recursive functions: a proper subclass of the computable functions that encompasses almost everything that anyone cares about in practice.)
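To see the shape of the argument, here is a deliberately trivial toy model (my own illustration, much weaker than the real construction): let an n-bit L-program be a binary string whose value k means “run for k steps, then halt.” Membership in L is decidable, every program provably halts, so g(n) is computable, and the diagonal function f(n) = g(n)^2 is computable yet exceeds g(n).

```python
def g(n):
    """Longest run of any halting n-bit L-program.  In this toy
    language an n-bit string encodes a number k meaning 'run k steps
    then halt', so the maximum is simply 2**n - 1 steps."""
    return max(int(bits, 2)
               for bits in (format(k, "0{}b".format(n)) for k in range(2 ** n)))

def f(n):
    # Computable in the ordinary Turing sense, but not expressible in
    # the toy L, since it out-runs every halting n-bit L-program.
    return g(n) ** 2
```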

• Vijay Ganesh says:

Hi Neha and Scott,

In formal methods and program analysis, there are researchers who are already working on such languages (called domain-specific languages), wherein, a certain property is “baked into” the language (e.g., all programs written in this language L will halt, where the property is “Program will halt”).

There are three objections to such languages. First, such languages are typically not Turing-complete, and hence there are many computations that cannot be written down as L-programs (Scott already pointed that out). Second, domain-specific languages can be a straitjacket in practice, i.e., they can be difficult to use and can limit creativity. Third, except for a few choice examples where a domain-specific language is a big win (e.g., type-safe languages), the concept may not extend in a natural way to arbitrary properties you care about.

Domain-specific languages are slightly different from Neha’s proposal, in that programs that negate a property are not allowed by such languages. Neha wants a language in which programs can have either property P or not(P), and we should be able to prove it either way. We can already do that using semi-automatic theorem proving for Turing-complete languages. The only problem is that semi-automatic theorem proving is really difficult, and to date largely impractical except in certain special cases like hardware sub-systems.

• bobthebayesian says:

Depending on what one means by “interesting programs” in this context, the incompleteness theorems may be relevant. If the language L used to write the programs has the guarantee that programs written in L must have proofs about whether or not they halt, then L cannot have much “interesting” ability to describe mathematics. I’m not sure how this exactly relates to whether the programs are interesting, but seems worth consideration. By incompleteness (and making a liberal working definition of “interesting”), if a language can express interesting things then it must also express undecidable things.

6. Qiaochu Yuan says:

The comparison of computers to toasters that was made in class interests me, as did the response to that comparison. A general criticism that can be made of attempts to draw philosophical meaning from fashionable topics is that those topics vary, naturally, by the fashion of the era: now we talk about computers and philosophy, but perhaps 400 years ago we’d talk about clockwork and philosophy, and perhaps 1300 years ago we’d talk about gunpowder and philosophy. (I first encountered this criticism in the context of metaphors for the brain: now we say that the brain is a computer, but we used to say that the brain is a telephone switchboard, or….) So how do we know we’re not just talking about computers and philosophy because computers are fashionable?

One response that was eventually produced is that computers exhibit a qualitatively new feature that past technology didn’t: universality. A computer is, in some appropriate sense, supposed to be able to eventually simulate any computation (modulo space restrictions, etc.). This isn’t true of toasters or clockwork or gunpowder, and if we believe that any reasonable physical process is a computation (which I guess is a falsifiable, empirical form of the Church-Turing thesis), then computers and their study are genuinely philosophically unique. Studying the limitations of computers should teach us something about the limitations of physical processes, and studying the capacities of computers should teach us something about the possibilities for physical processes (for example, as in Valiant’s theory of evolvability, what capacity physical processes have to produce self-replicating organisms with various features). I would hope that at least some philosophers find such questions interesting!

• Remote Follower #0 says:

> “So how do we know we’re not just talking about computers and philosophy because computers are fashionable?”

I would claim that we are. Also, computation theory really is the pinnacle of a wide array of past topics.

>”One response that was eventually produced is that computers exhibit a qualitatively new feature that past technology didn’t: universality. ”

Or rather, say, “intentional universality”, where we get to use computational universality explicitly for our own ends.

Note that even the most trivial of systems are indeed “universal”, as has been proven by Cook and Wolfram via systems such as Cellular Automaton #110 and Turing Machine #596440 (in Wolfram’s numbering schemes), the latter of which has only 2 states and 3 colors. This has led Wolfram to proclaim a principle of “computational equivalence”, whereby almost ANY complex-looking system is most probably universal, and hence universality is commonplace in nature and the universe.

But as Aaronson points out in his critique of Wolfram’s results, this has been the conventional wisdom of computational complexity theorists and others for some time now, and does not ‘really’ add any new insights. Moreover, in the case of CA#110, there is an exponential slowdown in simulations of other universal systems, and in the case of TM#596440, there is also the problem of the complexity in formulating proper initial conditions to meet a programmer’s desires.

So even as these mere examples would show, it would ultimately seem that computational complexity is at the heart of many philosophical questions about theoretical computer science, including the idea of universality.

(If only there was somebody actively trying to convince philosophers why they should care about computational complexity. 😉 )

>”I would hope that at least some philosophers find such questions interesting!”

Well, it would seem they already do, including yourself! =)

Disclosure: I am not enrolled in this course at MIT.

• HGM says:

The response that computers exhibit universality (and it’s true this may depend on the specific sort) seems to me to beg the question at hand. The question is why we should think that a computer is a better metaphor than were those of clocks, steam engines, etc. The whole point of the metaphor is that it explains something (in this case, the operations of the cosmos) in a useful way. Certainly it seems that explaining natural processes in terms of computation is more fruitful than doing so in terms of clockwork or fluids. But the universality of these explanations is exactly what’s at issue. So we can’t appeal to their alleged universality to justify the claim that they work so well, or even better than past candidates. The task is to break free of the circle in which we appeal everywhere to the computational metaphor because it seems to work, and then explain why it works in terms of its broad application. That’s not to say this metaphor isn’t qualitatively and quantitatively better than its precursors, just that it also needs to be made clear why it’s better.

Also, I’m fairly certain from reading Scott’s blog that he doesn’t feel it actually is a metaphor at all, but a genuine explanation. I’d be very interested to hear what the philosophical argument is for that view.

Another Remote Follower

7. Katrina LaCurts says:

We discussed in class that if The Halting Problem were decidable, it would lead to great mathematical insights. For instance, if we had a Turing machine T that could decide if an arbitrary program halted, we could run T on the following program P:

```
for n = 4 to ∞, n even:
    if n is not the sum of two primes:
        stop
```

Knowing whether this program halts gives us the solution to Goldbach’s conjecture, and other conjectures could be “solved” similarly.
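For concreteness, here is a runnable version of that program, truncated at a finite bound so that it actually terminates (the helper names are mine); the unbounded original halts if and only if Goldbach’s conjecture is false.

```python
def is_prime(m):
    # trial division; fine for the small numbers used here
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def is_sum_of_two_primes(n):
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

def goldbach_search(bound):
    """Scan even n = 4, 6, ... below `bound`; return the first
    counterexample to Goldbach's conjecture, or None if every even
    number checked is a sum of two primes (the unbounded version of
    this loop would then run forever)."""
    for n in range(4, bound, 2):
        if not is_sum_of_two_primes(n):
            return n
    return None
```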

However, T isn’t guaranteed to actually give us any insight into the problems. For one, T may take arbitrarily long to decide whether P halts, perhaps longer than it would take us to prove Goldbach’s conjecture by standard means. Second, even if T output yes or no in a reasonable amount of time, it wouldn’t necessarily provide us with a proof of Goldbach’s conjecture (even observing T’s operation might not give us the necessary insight).

How would the study of mathematics change if such a machine existed? Would we really be content with “T says Goldbach’s conjecture is true, so let’s move on”, or would we still strive for an elegant and concise proof? Or, on the other hand, would knowing the solutions to such conjectures open up whole new branches of mathematics that we wouldn’t have been able to explore without these solutions, so that in some sense being able to “move on” (even without a concise proof) would be beneficial?

• Scott says:

Katrina: These are fantastic questions, ones we’ll definitely come back to later in the course!

A famous “real-world” case study for these questions is the Four-Color Theorem, which was proved in 1976 with the help of massive computer search. Even though we now know the answer, many eminent mathematicians have expressed their dissatisfaction, since we still don’t have a proof that provides “understanding” or “insight” about why four colors suffice.

On the other hand, many, many people (myself included) whose ultimate goal is to prove theorems, find themselves constantly resorting to computer experiments (in some cases, more often than they admit! 🙂 ), to guide their intuition and get a sense of what intermediate statements they could try to prove. So I think that the super-machine T, if it existed, really would change mathematics enormously.

One technical correction: assuming T was provably correct, a trace of T’s execution when it was fed Goldbach’s Conjecture would be a proof of Goldbach’s Conjecture! But you’re right that it probably wouldn’t be a useful or understandable proof, and that a lot of further work—probably involving followup questions to T—might be needed to extract such a proof.

(Incidentally, notice how quickly I’ve been tempted away from my policy of not commenting on the reaction essays… 😉 )

• Katrina LaCurts says:

I hadn’t even thought of the Four Color Theorem when I was writing that comment! Initially I was very much on the side of “T isn’t sufficient; we still need elegant proofs”, but I’m a proponent of the proof of The Four Color Theorem. Now I have no idea where my allegiance lies 🙂

• Qiaochu Yuan says:

Practically speaking, it seems to me that a more pressing question is this: how could we ever verify that we had such a machine, given that we can’t “check its work”? This question recently came up on math.stackexchange and I have to admit that I don’t currently know what to make of it. Maybe Scott has some ideas…

Regarding understanding and insight in mathematics, two excellent essays on this subject I highly recommend are Thurston’s *On proof and progress in mathematics* and Tao’s *What is good mathematics?*

• Scott says:

> how could we ever verify that we had such a machine, given that we can’t “check its work”?

Well, if the machine outputs proofs of its assertions (say, in ZF set theory), then even if we don’t understand the proofs, we can certainly check their validity. Welcome to the wonderful world of P vs. NP! 🙂 We’ll have a lot more to say about this sort of issue later in the course.

• bobthebayesian says:

B.K.P. Horn at MIT has also raised questions along these lines, but with respect to empirical learning such as machine learning. Here are some snippets from an abstract to one of Horn’s seminars from a few years ago:

> “If a system does learn how to do something, what have *we* learned? Often retrospective analysis of “learning” systems yield insights, but not the expected ones.
>
> Hoping a system will “learn” how to do something can be a way to avoid doing the hard science necessary to solve the problem properly. And in the end often not much may be learnt by the researcher, since typically the learned state of the system is inscrutable.
>
> Garbage in => garbage out applies to “learning” as much as it does to computing in general. Extreme efforts to advance the algorithms and mathematics underlying “learning” may not pay off as well as simple efforts to provide better inputs by understanding vision better.”

(There are also some slides available, although they are very terse, vision-oriented, and only have a few good nuggets here and there.)

The Shieber and Levesque readings argue that the Turing test is inherently statistical and interactive, ‘getting the behavior right.’ And Shieber provides a nice way of formalizing Block’s request that passage of a Turing test provide evidence of “the general capacity” to pass such tests. I chose to bring up Horn’s comments above because they apply to machine learning which many people feel is our current best approach to handling the large-scale data processing required for a machine to be able to reason and exhibit understanding, e.g. strong A.I.

But just as you ask the question, “Would we really be content with ‘T says Goldbach’s conjecture is true, so let’s move on’, or would we still strive for an elegant and concise proof?”, one could form the analogous question w.r.t. the Turing test. If an advanced machine learning system starts to exhibit human-quality image and scene understanding, but the underlying methods don’t reveal to us any insight about how the “seeing” is getting done, are we (should we be) content to declare the system as intelligent? One purpose for declaring it to be intelligent could be to decide how such an agent should be ethically treated. Another purpose could be as a means for standardizing its capacity to perform tasks for an employer. It’s hard to imagine that we can decouple the point of a Turing test (to declare something ‘intelligent’ or not) from what we plan to do as a society with that declaration. And these questions are inherently philosophical.

It echoes Turing’s original reason for changing his question from “can machines think?” to “can machines pass the Turing test?” — knowing what intelligence is in a manner that’s generalizable and applicable across problem domains is vastly different from replicating intelligent behavior. I would even argue that replicating intelligent behavior is to understanding the processes of intelligence what multiplying two numbers is to factoring any generic large number.

• Scott says:

> If an advanced machine learning system starts to exhibit human-quality image and scene understanding, but the underlying methods don’t reveal to us any insight about how the “seeing” is getting done, are we (should we be) content to declare the system as intelligent?

That seems to me like an unfair standard. For we don’t demand insight into how other people’s mental processes work, before declaring them to be intelligent!

Interestingly, in practice the opposite of your requirement often seems to be applied: it’s precisely when people understand how something works that they regard it as unintelligent or a “mere automaton”!

Anyway, the debate between the people who insist on understanding how an AI works, and those willing to give up on that requirement, has probably been the single greatest divide within the AI research community since its beginnings—with Minsky, Chomsky, and McCarthy on one side, and almost all modern machine-learning researchers on the other. To me, part of what makes this debate so exciting is that it’s only partly philosophical: we can also look at which side is having more empirical success at solving AI problems! And on that score, the “don’t need to completely understand how it works” side seems to have been winning hands-down for the past two decades.

• bobthebayesian says:

Scott, the prospect of there existing non-human agents which are intelligent to a human degree, potentially deserving ethical treatment, etc. is a totally new concept (in the sense that it’s starting to look like a matter of engineering to achieve it) and my argument is that machine learning is making the prospect visceral to more people at a faster rate. There could easily be a time in the not too distant future when we actually *do* demand insight into (what appears to be) another human’s thought processes before we make judgments about it that have ethical consequences. I’m not saying this is at all likely; I think strong A.I. is probably far off… but the whole idea of a non-human being declared intelligent also makes us re-assess the circumstances under which we would classify something human-looking as intelligent too.

• I would point out that because it’s easy to explore the consequences of an answer, most of the interest in unsolved conjectures is in the novel techniques that would be used in a proof. I suspect the Clay Institute would be reluctant to give the full million dollars for an entirely opaque computer proof of the Riemann Hypothesis.

• bobthebayesian says:

I strongly agree. I think Horn’s remarks are meant to say, “if your goal is to simply replicate ‘intelligent’ behavior in order to capitalize on some consumer market or to monetize some subset of science, then machine learning is clearly the least risky way forward (especially if the thing to monetize is very narrow in scope), but if your goal is to ‘understand intelligence’ then investing in fundamental research is a better bet.”

For what it’s worth, my own view is that yes, machine learning and “brute force” methods are winning hands down at replicating certain behaviors that are monetizable and generally narrow enough in scope that massive data processing trumps ‘elegance.’ But this is a poor, “local maximum” incentive scheme for scientific researchers. In my own field of computer vision, I am baffled by how few researchers care for fundamental science; almost every research project I have been a part of has started out not by asking “how can we understand the process by which images are perceived in a mind and then invert that process?” but rather has always begun with “what is a glamorous application that people will pay for and how do we efficiently replicate the intelligence needed?”

For example, the famous investors Peter Thiel and Max Levchin, along with chess grandmaster and political activist Garry Kasparov, are set to release a book in early 2012 and the central theme of the book is that despite recent explosions in computing capabilities, human innovation has drastically slowed over the past 20 years. (The title is “The Blueprint: Reviving Innovation, Rediscovering Risk, and Rescuing the Free Market” and there is an Amazon page for it.)

I think this might be a concise way to put it: intelligence in many endeavors is exposed by sensing certain phenomena. If those phenomena remain totally mysterious to us, then labeling the process by which they occur as “intelligent” doesn’t carry any weight. This applies to human phenomena as well as machine. Calling it intelligent merely means “it is as effective as if a human undertook the phenomenon” but still leaves it as a mysterious answer to a mysterious question. The litmus test of science has always been taking a mysterious phenomenon and making it less mysterious, which is why I consider machine learning a part of numerical analysis rather than science.

Throughout most of human history, no one ever thought to question whether or not the common human experience was a useful or appropriate gauge of intelligence. This is just how the first version of the Turing test was defined, culturally; it’s not intrinsically meaningful to compare machines to common human behavior. And further, whatever we mean by human intelligence surely changes over time as well. I think the really interesting aspect of the Turing test is: what practical meaning does the result of the Turing test have? If declaring something ‘intelligent’ has absolutely no material consequences, then it’s just a pedantic matter of definitions and we can say Cleverbot is “intelligent” and move on. But if there are actual stakes that hang in the balance, depending on whether Cleverbot, say, really is “intelligent” or not (they could be ethical stakes, civil rights, employment, etc.) then something important rides on the outcome of a Turing test, and this is where philosophy really starts to matter a lot; this is where the underlying, mystery-reducing explanations might actually matter.

• Well, no. Machine learning is engineering. Maybe human intelligence works on similar principles, or maybe we just can’t build “better” algorithms that match up, but machine learning is what’s giving results right now. Engineering has always been happy to innovate first and let others sort out the details and explanations.

There is no such thing as intelligence; there is only aptitude for particular tasks. Humans are decent at some and terrible at others. Intelligence usually refers to some subset of the former, which varies by time and context. The question, then, is what abilities we choose to classify entities by — which ones are people, with the rights that that implies, which are only machines or tools, and what happens to the ones in between.

• bobthebayesian says:

Things like PAC, kernel learning, or VC dimension are numerical analysis; but you’re right that many engineers implement algorithms that approximate these and other concepts to achieve short term goals in applications and that subset of machine learning is engineering.

Like you, I was using the term intelligence to refer to aptitude to perform tasks. I just think that your statement, “The question, then, is what abilities we choose to classify entities by — which ones are people, with the rights that that implies, which are only machines or tools, and what happens to the ones in between.” is the entire interesting part of the Turing test. These are definitions that have practical implications. If passing the Turing test doesn’t address “which [Turing test takers] are people, with the rights that that implies, which are only machines or tools, and what happens to the ones in between” then it is just a mathematical curiosity and a matter of pedantic definitions whether we call something intelligent or agree that it has aptitude.

• Scott says:

bobthebayesian: I confess that any argument of the form “well, you might be getting the right answers, but that’s just cuz you’re a crass engineer who’s trying to make money rather than a deep thinker” leaves me completely cold. People might have any number of noble reasons for wanting to get right answers: for example, maybe they’re biologists who want to use machine learning algorithms to predict effective cancer treatments, or conservationists who want to use them to predict the locations of endangered species. Or they might have crass or evil motives. I think we ought to support the good applications of science and fight the bad applications (although I don’t think making money is inherently bad—it becomes bad when coupled with callous indifference to side-effects). On the other hand, if people are consistently getting right answers to problems that resisted previous approaches, then whatever their motives, I’d say they must be doing something that encodes genuine scientific understanding—even if they themselves don’t possess that understanding, or care to possess it. As one instantiation of that view, I’d say the spectacular success of (e.g.) hidden Markov models at machine translation really is an argument that HMMs encode important insights about human language, whether or not we’ve managed to extract those insights. I’d be wary of any position that let me dismiss empirical successes on a priori grounds—I don’t want to have that freedom!

• bobthebayesian says:

I think I am not writing effectively and I will make more of an effort to keep my input to smaller comments in the future. I’m not trying to say that engineering is crass and I’m glad people are working on overfitting HMMs to solve specific medical problems today. Making money is not inherently evil, but one of the negative side effects you allude to is that whole fields of science can be distracted and kept busy on inefficient problems just because that’s where the money is. Fundamental research is a riskier investment because there’s no guarantee it will help the investor more than it will help her competitors. But this is a bad reason for someone to look at an engineering problem as an opportunity to hack together an unreasonably huge number of hidden states in an HMM, or to model a classification problem by concatenating together dozens of different feature vectors that may or may not have any justification for being included.

I’m not so quick to dismiss the point of view of someone like Horn just because current research can make effective results for contrived, narrow problems. My goal is to be able to give an informative description of the whole process by which input data is converted to an output decision. If any part of that process is still mysterious, we can zoom in on it and study it until we understand. Of course I want the output to be state of the art and helpful to mankind and lucrative, but I think the best way to achieve all of those things in the long run is to actually have a comprehensible way of understanding the whole chain of processing, letting no part of it be a black box.

I also prefer Yudkowsky’s idea of intelligence from LessWrong, optimization power. How well can an agent zoom in on some probabilistically tiny piece of the problem domain? The thing is, if intelligence as a label has any consequences, then people have to have a philosophical discussion about what is meant by “probabilistically tiny” and what is meant by “how well” an agent can achieve this.
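One toy way to make “optimization power” quantitative is to measure, in bits, how improbable it is that a uniformly random choice does at least as well as the agent did. (The function name and this rendering of the LessWrong idea are my own illustrative choices, not a standard formula.)

```python
import math

def optimization_power(achieved, outcomes):
    # Bits of optimization power: -log2 of the fraction of the outcome
    # space that is at least as good as what the agent achieved.
    at_least_as_good = sum(1 for o in outcomes if o >= achieved)
    return -math.log2(at_least_as_good / len(outcomes))

outcomes = list(range(1024))               # a toy search space
print(optimization_power(1023, outcomes))  # 10.0 bits: hit the unique best
print(optimization_power(512, outcomes))   # 1.0 bit: merely landed in the top half
```

The philosophical work then shifts, as the comment says, to choosing the outcome space and the baseline measure.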

This is why I preferred hjpev’s comment above: “The question, then, is what abilities we choose to classify entities by — which ones are people, with the rights that that implies, which are only machines or tools, and what happens to the ones in between.”

What I’m trying to argue is that Levesque and Shieber make a good theoretical case for why the Turing test does not suffer Block’s or Searle’s criticisms. But this doesn’t necessarily help. People are still going to try to change the definition of intelligence every time a machine seems to do well on a current Turing test. Eventually, though, they’ll have to start changing the definition of intelligence in ways that affect whether or not a human is intelligent, and this is not something that’s even really been thought about. We just take it for granted that the “human standard” is the one we care about, and this is why Turing made his initial definition the way that he did. But maybe if machines become successful enough at getting the behavior right, and we *still* want to engineer this or that ethical conclusion about how we can treat machines (i.e. can I unplug it and throw it in the trash if it passed a sophisticated enough Turing test?), we’ll have no choice but to decree that you don’t pass a Turing test unless you get the behavior right and the algorithmic process has sufficient optimization power, say. We won’t necessarily be satisfied declaring something to have the true aptitude to do a task unless we know the inner workings of how it does the task, not just the fact that it can do the task. And this may cause us to drastically change the way we verify intelligence in human to human interactions. There’s no reason to suspect it won’t; we’ve never had technology sophisticated enough to make it a relevant issue.

8. Tomeru says:

“What _is_ the Church Turing Thesis?” – I don’t have a clear-cut answer, but I want to explore the question a bit with comparisons to other definitions in physics. I think the Church-Turing Thesis (CTT) has the flavor of a definition in a physical theory, but is also phrased in a slightly odd way compared to other physical laws.

First, the similarities: ‘Computation’ as a formal concept might not be different from other scientific ideas which were informed by psychological intuitions but formalized by physical definition. Take the example of ‘Heat’, or ‘Force’, or ‘Energy’ or ‘Momentum’. All of these concepts have more or less rigorous definitions within modern physical theories, but their development drew on the psychological phenomena that we experienced. One might argue that unlike ‘computation’, ‘force’ in Newtonian mechanics can be measured. However, it’s not clear that that’s the case. Acceleration can be measured, and so can mass. However, force appears to be a hypothetical entity posited by Newtonian Mechanics to tie the two together. ‘Computation’ in this analogy is similar to ‘Force’, with the rigorous measurable stuff (acceleration, Turing Machines) going on the other side of the equation.

So, if the CTT is equivalent to a statement in physics, does that mean it’s an empirical statement? Not necessarily. Consider that in the form “F=m*a”, the second law is not an empirical statement about ‘forces’. We have no objective way of measuring a Newtonian force outside the framework of Newtonian mechanics and making sure that it is indeed m*a. The measurable stuff is on the other side. In a similar way, we have no agreed upon objective way of measuring ‘computation’ outside of the CTT. Thus the CTT is a definition, but an informative definition which sets up a whole theory, like F=m*a sets up a usable, working theory of physics.

What of the differences? First, the laws of Newton are not defined as a statement of equivalence or belief, along the lines of ‘All systems of mechanics will turn out to be equivalent to Newtonian Mechanics’. It certainly could have gone that way. We could imagine a hypothetical world in which the work of Hamilton, Lagrange and Newton all coincided, and finding out that they all make equivalent statements about mechanics (I’m brushing over a lot of generalizations about Hamiltonians/Lagrangians and what they can and can’t handle), leading to the Hamilton-Lagrange-Newton Thesis: “All systems of mechanics will turn out to be equivalent to Newtonian Mechanics” (HLNT).

This leads to the second difference between the HLNT and the CTT. The HLNT in this formulation doesn’t deal with forces, or fields, it deals with ‘mechanics’. That is, measurable stuff in reality like location and velocity. In that sense, the HLNT can and was proven false. The CTT is not falsifiable in this sense because it still deals with the invented concept of ‘Computation’. It’s as if the HLNT said “All forces will turn out to be equivalent to Newtonian forces”.

Where does all this leave us? As I said, I don’t have a clear-cut answer. It seems the CTT is both similar and not similar to statements of laws and concepts in physics. Perhaps this simply isn’t the right analogy. I’d like to offer one last such consideration, though, this time between the CTT and the physics statement ‘Energy is always conserved’. This last statement also seems empirical and a statement of belief at the same time, and it leaves open the question of what exactly energy _is_ (Feynman has a particularly interesting way of dealing with this – he says it doesn’t matter). In fact, this law/statement was formulated long before we discovered many forms of ‘energy’ which turned out to be equivalent, and conserved. In the 20th century, it was discovered that the conservation of energy stems from a more basic idea – symmetry of time. Perhaps there is something more satisfying, in a similar vein, which explains _why_ all physical computation is equivalent to a Turing machine. Perhaps there is such an explanation already, I’m afraid I’m not familiar enough with the computation literature to know. All the more reason to take the class.

• I wonder if the universality of the Turing machine has something to do with the effectiveness of mathematics in describing physics.

9. bobthebayesian says:

Relevant: .

• bobthebayesian says:

Sorry, tried to post a link: www geekosystem com / cleverbot-passes-turing-test /

10. One of the criticisms to the practice of philosophy that was mentioned in class is:

“There is no measurable notion of ‘progress’ in philosophy.”

I found the counter-argument particularly compelling:

“As soon as there *is* some measurable notion of progress towards answering a given question, then it must be because the question has been sufficiently formalized to engender a new field in its own right or because it has been ‘eaten up’ by existing fields.”

Artificial intelligence finds itself in a similarly unpleasant spot: the reason that AI could seem to never make real progress may be that every time progress is made (and understanding for how to achieve the underlying task is obtained) the new achievement is retrospectively called “obvious” and, moreover, not indicative of any progress in understanding *true* intelligence.

So, one can view philosophy as providing a source of good questions to ask, of likely fundamental interest, along with preliminary investigations/explanations. At the very least, we know of some fields that did experience such a transition:
– logic;
– the ‘natural sciences’ (physics, chemistry, and biology); and
– the study of language.
I wonder: are there more recent examples of such transitions?

On a separate note, have there been instances of recognized ‘progress’ that today are still regarded as achievements of philosophy (and have not been stolen by the sciences yet)?

Maybe one could argue that science of morality (cf. Sam Harris, Steven Pinker) does constitute some philosophical progress. Namely, as our understanding of science, especially in the fields of psychology and the brain, improves, so might our ability to reason about morality, because we can view it in better and more accurate contexts. Moreover, since it is hard to imagine how one would prove theorems in science of morality, this particular direction is unlikely to be stolen from other ‘harder’ sciences and is likely to remain within the scope of philosophical debate.

• Scott says:

Thanks, Ale! I guess one thing I forgot to say in class is that my own interest in philosophy is extremely selfish: I’m constantly scouting for questions that I can “eat up” and mathematically formalize, thereby removing them from the domain of philosophy! 🙂

• aljacen says:

Moreover, since it is hard to imagine how one would prove theorems in science of morality, this particular direction is unlikely to be stolen from other ‘harder’ sciences and is likely to remain within the scope of philosophical debate.

Perhaps so, but on the other hand moral philosophers make statements about reason all the time that I think are susceptible to attack (or rather more politely, clarification!) from TCS. One could ask, for instance, to what extent is utilitarian calculus computationally intractable? Or does Kant’s algorithm for moral reasoning even halt?

• bobthebayesian says:

Additionally, it seems that humans, when they undertake the action of “computation”, say when multiplying two numbers, will first study the relevant inputs, then think for a bit, then maybe write something down, then think some more, then finally declare a certain answer to be correct. We pretty much all agree that doing this process for multiplying two numbers is computation (perhaps that can be debated, but it seems unlikely).

Well, when people make moral choices, what do they do? They inspect the facts and inputs they think are relevant, then they go think for a while, then they might write some stuff down, then think a bit more, and then declare some course of action as being the correct (moral) course of action (at least when they are choosing which action they should adopt).

It seems like we do the exact same thing whether we’re doing arithmetic or thinking of a moral action. It may take longer to decide the moral action; it may also involve inputs or facts that are not traditionally part of “computation” (such as how someone else feels). But the fundamental process is computation nonetheless.

It seems to me that morality is just computation. In fact, it would seem that people make practical moral choices all the time in such a way that I would be hard-pressed to believe that moral computations weren’t efficient… and the ECT thesis definitely plays a role. Morals are being computed efficiently in the physical world… does that mean that moral problems are in P? If we learned that moral problems were NP-hard, what would this mean? The example of utilitarian calculus is very salient, especially in light of bounded rationality.
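One way to see how quickly “utilitarian calculus” can become intractable: even a stripped-down version, choosing which of n costly actions to take under a budget so as to maximize total utility, is exactly the 0/1 knapsack problem, which is NP-hard in general. A brute-force sketch (all names and numbers are hypothetical illustrations):

```python
from itertools import combinations

def best_policy(actions, budget):
    # 0/1 knapsack by exhaustive search: try all 2^n subsets of
    # (cost, utility) pairs and keep the best affordable one.
    best_util, best_subset = 0, ()
    for r in range(len(actions) + 1):
        for subset in combinations(actions, r):
            cost = sum(c for c, _ in subset)
            util = sum(u for _, u in subset)
            if cost <= budget and util > best_util:
                best_util, best_subset = util, subset
    return best_util, best_subset

actions = [(4, 10), (3, 7), (2, 4), (5, 11)]  # (cost, utility) per action
print(best_policy(actions, budget=7))         # (17, ((4, 10), (3, 7)))
```

So if humans routinely make decent moral choices quickly, bounded rationality suggests they are using heuristics or approximations rather than exact utilitarian optimization.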

• In fact, many areas of science have phenomena that can be seen as computation, for example the interactions of biological molecules or quantum states. This is a specific case of that applied to psychology, which is special in that the human brain is the first machinery we learned to compute with, before the mechanical, electronic, biological, or quantum came along. But it doesn’t seem that morality is why we should apply philosophy to TCS, except insofar as it’s evidence to the general applicability of TCS to various far-flung and seemingly unrelated fields.

11. kasittig says:

Although we extensively discussed various reasons for simultaneously studying philosophy and theoretical computer science, I believe we missed a key point – that thinking about theoretical computer science in a philosophical manner forces us to contextualize many of its key concepts, which adds to our understanding of the field as a whole.

It seems to me that theoretical computer science does not often ask the question of “why” – rather, computer scientists are more concerned with what can and can’t be proven. In debating philosophical ideals, however, we’re often forced to justify our assertions and our decisions. It would feel quite strange to just give a series of lemmas in a philosophical argument – and it would feel even stranger to have this argument with an individual who never asks for the motivation behind the argument as well. By debating from both a theoretical and a philosophical standpoint, we can break free of our temptation to deal only in that which we can mathematically prove – and in doing so, we’re often forced to justify why we started dealing with the problem in the first place.

Consider the simple example of the halting problem. We saw the mathematical proof of its incomputability in class – but, of course, this proof does nothing to justify why we care about this concept of computability in the first place. When we start to ask more questions – like “why is this important” or “what if the opposite were true”, we not only start to understand the problem itself better, but we are better able to explain its importance to others. And, of course, interesting concepts may arise from this discourse as well – such as other problems that may be unsolvable because of the halting problem’s incomputability.
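For reference, the diagonalization behind that proof can be sketched as code: suppose a total `halts` decider existed; the program below then defeats it (purely illustrative, since no such decider can actually be implemented):

```python
def halts(program, arg):
    # Hypothetical oracle: would return True iff program(arg) halts.
    # The construction below shows no such total function can exist.
    raise NotImplementedError("no halting decider exists")

def diagonal(program):
    # Do the opposite of whatever `halts` predicts program does on itself.
    if halts(program, program):
        while True:  # predicted to halt, so loop forever
            pass
    else:
        return       # predicted to loop, so halt immediately

# If halts were real, consider diagonal(diagonal):
#   halts(diagonal, diagonal) == True  => diagonal(diagonal) loops forever.
#   halts(diagonal, diagonal) == False => diagonal(diagonal) halts.
# Either way halts answers wrongly, so halts cannot exist.
```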

Had we not attempted to view theoretical computer science from a different angle, we may never have asked these questions, and we may never have had these discussions. So, it is the fact that adding philosophy to the mix forces us to look at the bigger picture, and to answer the larger questions, that really adds value to our study of theoretical computer science, and therefore it is valuable for scientists and mathematicians to spend time thinking about philosophy as well.

12. amosw says:

The word “universal” seems to be used quite freely, and rather imprecisely, in computer science. We hear people say that a computer system is universal “modulo space restrictions”. To anyone listening who is thinking of a computer as a physical object of finite rest mass, this statement is a singular construction: rather like saying that one’s car could travel infinitely far “modulo gasoline restrictions”.

In class, a description of the Scheme programming language as universal was defended as “approximately true”. This sounds absurd on its face, but it hints that perhaps its speaker believes that Scheme could simulate any finite system if we could somehow supply it with a sufficiently large finite number of cons cells. So perhaps we should agree to call a system universal if it can simulate any finite system. But as Professor Aaronson pointed out in class, we have good reasons for believing that a non-zero cosmological constant disallows even this from being physically realizable, as the Universe is expanding too fast for a computer of sufficiently large size to communicate at the speed of light from one of its edges to another.

We might try to salvage the word universal by agreeing to say that a computer is universal if it can simulate every finite system that we might possibly care about. However, as was pointed out by a student in class: “we still don’t have a mole of bits”. Integrating the molecular dynamics equations of motion for a 1cm cube of Argon atoms is a problem of great interest to physicists, but it seems unlikely that we will ever succeed in building a machine capable of it. Certainly none of the machines we have today are capable of it, and yet we persist in calling them universal.
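For a rough sense of scale behind “we still don’t have a mole of bits” (a back-of-envelope estimate assuming an ideal gas at standard conditions; the numbers are approximate, and solid argon would be denser still):

```python
# How many argon atoms sit in 1 cm^3 of gas at standard conditions, and how
# many bits would it take just to record one 64-bit coordinate triple per atom?
k_B = 1.380649e-23        # Boltzmann constant, J/K
P, T = 101325.0, 273.15   # standard pressure (Pa) and temperature (K)
V = 1e-6                  # 1 cm^3 expressed in m^3

atoms = P * V / (k_B * T)  # ideal gas law: N = PV / (k_B * T)
bits = atoms * 3 * 64      # three coordinates, 64 bits each
print(f"{atoms:.2e} atoms, {bits:.2e} bits")  # roughly 2.7e19 atoms
```

Even this state snapshot alone dwarfs the memory of any existing machine, before a single integration step is taken.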

Most working mathematicians spend no time worrying about the fact that an uncountably infinite set is believed by physicists to be physically unrealizable. Mathematicians are for the most part unconcerned with physical reality at all: they are working in a realm of intuition and logic, bound only by logic and, perhaps, taste. So maybe working systems computer scientists should take the word universal in the same way that an experimental physicist takes the definition of a contour integral given in analysis: a mental construct that gives good approximations to experimentally observed facts, whose philosophical difficulties can be ignored in the interests of getting on with the business of science and engineering.

13. wjarjoui says:

Why should theoretical computer scientists care about philosophy? I would rephrase that as: what can philosophy offer to theoretical computer scientists? In some sense it offers them a way to connect with a deeper concept in life, and in that regard, helps them answer some of the eternal questions – what is life, what is the universe. In this sense, the connection between philosophy and computer science follows a familiar pattern: as we have discussed in class, previous sciences were also highly connected with philosophy in times past. And as one would argue that TCS is unique because of the universality that a Turing Machine offers, one can argue that all of those other sciences previously connected with philosophy are also distinctly unique, in different ways other than universality.

In a lot of ways philosophy is a force driving the evolution of research and sciences in general. The tools it provides allow us to make sense out of scientific discoveries and use them to understand deeper aspects of life – what life is, for instance. Hence it seems that TCS without philosophy would not serve the evolution of human thought, but only the evolution of human technology.

Having said that, why should studying philosophy and TCS solve any new worldly problems when studying philosophy and previous sciences did not? Is TCS the epitome of all those sciences? I wouldn’t say so, so yes it might be a waste trying to seek insight that either: a) you know you’ll never find because many before you couldn’t and there is no indicator that you have more progress over them or b) might not do you any good anyway.

In any case, about a different issue that caught my attention in class, and that is the harsh critique strong AI always receives. Whether it’s machine learning or other techniques, usually the AI machine is being compared to a human who has had many more years of training. If we were to train a machine with the same amount of input a human receives over the course of his/her life, it is hard (for me) to imagine that this machine would not be able to obtain similar if not exceeding intelligence patterns. Now we can argue that a machine will need a lot more space (memory) to be able to conceive what a human mind does, but that’s a problem of space and not algorithms/computation. (We could then argue that maybe intelligence does require having such a space/memory, and if we build a machine with a big enough memory we can argue that the memory is not complicated enough to portray intelligence, but then we might as well argue that intelligence requires a human brain…).

• bobthebayesian says:

I agree with you that resource limitations are not a good critique of strong A.I. However, in computer vision at least we already can and do train object recognition algorithms using more images than what a human being sees during its entire life span, and we’re nowhere close to achieving human (or even lower primate) level recognition, not even for a restricted domain like face recognition. This is why I am skeptical of the large-scale machine learning methods. Many researchers focus on how to process more data, but not necessarily what kind of processing to do to the data. The software we’ve evolved in human minds to do face recognition is amazingly efficient and has quite a lot of optimization power. Engineering approaches currently don’t have any idea how to replicate the human-level performance, but perhaps with powerful enough computers and enough data, they will be able to brute force a solution to face recognition in the next 10 years. They will still not be any closer to knowing how to do accurate face recognition with the computational resources of a chimpanzee’s brain, though, and I think that’s more fundamental to ‘intelligence’ than just the output.

• Remote Follower #0 says:

>”Having said that, why should studying philosophy and TCS solve any new worldly problems when studying philosophy and previous sciences did not? Is TCS the epitome of all those sciences? I wouldn’t say so, so yes it might be a waste trying to seek insight that either: a) you know you’ll never find because many before you couldn’t and there is no indicator that you have more progress over them or b) might not do you any good anyway.”

Hmmmm….interesting. Well, though you would say no, I would claim that TCS is in fact the epitome of all prior sciences. And not only that, I would go as far as to claim that it can (and should, even currently) solve any and all new worldly problems.

OK, while I guess that is a bold statement and would take some time to fully explain, let me spurt just a few, random selection of thoughts that are currently bubbling in my mind as I write this:

1) Intuition?

2) Assuming the ‘other’ sciences are ultimately just us sensing things change states over time and recording them (making data), which in turn has us changing states over time and recording them (making theories), ad infinitum, then are we not just simply ‘computing’ things, so that TCS is in fact special insofar as that it studies this computing?

3) To the extent that Economics is the social science that analyzes the production, distribution, and consumption of goods and services, and is good for the people, then what does that say about TCS, as defined as: the exact science that analyzes the “production, distribution, and consumption of goods and services”?

4) Optimization problems fall under the domain of TCS. So where something like Utilitarianism is an ethical theory holding that the proper course of action is the one that *maximizes* the overall “good” of the greatest number of individuals, can TCS not have value here, letting us know not only how to do such things, but also whether doing them is currently feasible?

5) Is there still not hope that, contrary to the evidence, P==NP, which, as Gödel puts it, would have consequences of the greatest magnitude? e.g. optimal schedules/routes, efficient designs, etc.

Disclosure: I am not enrolled in this course at MIT.

• wjarjoui says:

You make good points.

But I was strictly referring to natural sciences, sorry for not being clear on that.

14. Artur Dmowski says:

According to the Church-Turing thesis, we should be able to build a Turing machine that will compute any given computable function. This has a tremendous impact on philosophy, because many important questions are indeed computable. For example,

f(x) = 1 if “God exists”, and 0 otherwise

is a computable function, as shown in class. Therefore if we were able to build a universal Turing machine with infinite resource, we would be able to solve most of the philosophical questions that bother humanity.

However, physical constraints may limit the size of a Turing machine. If the universe expands outward, and beyond some distance bodies recede fast enough, then there is only a finite number of atoms that a Turing machine can ever access (since we cannot move faster than the speed of light). This theory is widely conjectured to be true based on a collection of experimental evidence.

Therefore, it seems that the Church-Turing thesis is a claim about the physical world, analogous in this respect to the idea of a cosmological constant.

• Scott says:

Therefore if we were able to build a universal Turing machine with infinite resource, we would be able to solve most of the philosophical questions that bother humanity.

Sorry, but the argument you gave is incorrect. In order to compute a function, it doesn’t suffice for the function to be computable; we also need to know a Turing machine that computes it! Indeed, the whole point of Sipser’s example was to illustrate this distinction.
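The distinction can be made concrete with a toy sketch (the function names below are illustrative, not from the post): the proof that f is computable is nonconstructive, because it hands us two candidate programs without telling us which one is correct.

```python
# f(x) = 1 if God exists, and 0 otherwise.
# Exactly one of these two trivial programs computes f.

def f_if_proposition_true(x):
    """The machine that computes f in the case the proposition holds."""
    return 1

def f_if_proposition_false(x):
    """The machine that computes f in the case the proposition fails."""
    return 0

# By the law of excluded middle, one of these two programs computes f,
# so f is computable. But knowing THAT a machine exists is not the same
# as knowing WHICH machine it is: to pick the right candidate, we would
# first have to settle the proposition itself.
candidates = [f_if_proposition_true, f_if_proposition_false]
```

The same pattern applies to f(x) = 1 iff the Goldbach conjecture is true: the function is computable today, even though building the correct machine awaits a resolution of the conjecture.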

15. D says:

One point that was touched upon a couple of times in lecture, but not fully elaborated upon, was the concept of the interchangeability of programs and data. This underlies a number of the (historical and current) foundational concepts of theoretical computer science, but given the parallel between computation and thought, it also seems to serve as a metaphor for philosophy itself.

The ideas of universality, computability, and simulation all rely on the idea that a computation can be expressed as data. In order to have a universal machine, one must specify what computation it should perform; this specification is fed as data input to the universal machine. While what this data looks like could be highly implementation-dependent, it’s often the case that it is expressed as a “program” in the modern sense: a sequence of small basic operations that can be performed in turn to simulate the step-by-step operation of a special-purpose machine. In fact, the notion of computability itself relies on the expression of programs as data, and proofs of incomputability often involve deriving a contradiction by using a program as input to itself (or a related program).
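The self-application move mentioned above (a program taking a program, possibly itself, as input) can be sketched in a few lines. This is a hedged illustration, not the original proof: assume for contradiction a total decider `halts(prog, inp)` that reports whether `prog(inp)` halts.

```python
# Diagonalization sketch: deriving a contradiction by feeding a
# program to itself as data. `halts` is hypothetical; no such total,
# correct function can exist.

def make_diagonal(halts):
    def diagonal(prog):
        # The "program as input to itself" step: ask about prog run on
        # its own source, then do the opposite.
        if halts(prog, prog):
            while True:   # loop forever exactly when halts predicts halting
                pass
        return "halted"
    return diagonal

# If halts were a correct decider, then diagonal(diagonal) would halt
# if and only if halts(diagonal, diagonal) says it doesn't -- a
# contradiction. Programs-as-data is what makes this self-application
# a well-formed question in the first place.
```

Note that the contradiction only arises because a program is an ordinary piece of data that can be passed around, which is exactly the interchangeability the comment describes.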

Given that we mentioned the idea of computation as thought, there is a direct parallel with philosophy as a subject. TCS describes the equivalence of programs and data–that is, computation can be performed where the subject is essentially another computation. Philosophy itself is the study of knowledge–the act of engaging in thought about thought itself. There’s a direct parallel in that both are engaged in a meta-level study about the actions of processing information, either by a human or by a machine.

• The Church-Turing thesis, then, would suggest that all computation in the real universe could also be reduced to data to be fed to a Turing machine. And that raises the possibility of simulating a universe inside a computer, or, in the other direction, that our universe could itself be a simulation.

• bobthebayesian says:

See Nick Bostrom’s interesting work, e.g. www.simulation-argument.com/simulation.pdf.


16. D.R.C. says:

One important definition that we are using, intelligence, is a very abstract concept to begin with, and it can be used in many different ways. I feel that for the purpose of Strong AI, we are specifically talking about human-like intelligence (still a very abstract concept, but something to differentiate it from a more specific type of intelligence). I call it human-like in that it is a very general intelligence that, given some basic information and rules, can begin to infer other information and “understand” it.

While humans don’t necessarily have a defined shared set of information, they do seem to have some basic knowledge of how to gather information (which I believe to be the primary aspect of human-like intelligence). This knowledge is likely passed down at the genetic level and consists of certain very basic rules. One such rule that Minsky proposes concerns how to react when someone tells you not to do something: if a stranger tells you this, then you likely conclude that the situation was wrong for what you were doing; however, if an Imprimer (i.e., someone to whom you have become attached) were to say the same thing (without qualifying it), then you would believe that the action itself was unworthy, and not just the situation.

Unfortunately, as it stands now, we don’t actually know if this is how our own minds work. Even if they do, we most likely don’t have a list of even most of these guidelines. One question is how useful this particular concept of intelligence is if we don’t even understand the basics of how we are intelligent. When we see behavior that seems intelligent and don’t understand the reasoning behind it (as with other humans), we say that it is intelligent; but if we know the (algorithmic) reasoning behind it, we say that it is simply a simulacrum of intelligence and doesn’t actually “understand” what it is doing.

17. bobthebayesian says:

Some interesting numbers to contextualize IBM’s Watson mentioned in class:

— 90 Power 750 Express servers, each with 4 CPUs, each of those having 8 cores
— total of 15TB RAM (yep, all of Watson’s data was stored in RAM for rapid search. The human brain’s memory capacity is estimated at between 3 and 6 TB, and not all of that functions like RAM, and it’s implemented in meat.)
— Each of the Power 750 Express servers seems to consume a maximum of 1,949 watts, making a total of 175kw for the whole computer
— There also appears to be a sophisticated system connected to the Jeopardy buzzer but I can’t find power specs for that part.
— IBM estimates that Watson can compute at about 80 teraflops (10^12). The Parberry paper mentions in passing that the human brain operates in the petaflop range (10^15), but at the same time, a brain is not a digital system and so the flop comparison is less meaningful.

To put this in perspective, a conservative upper bound for a human being standing still is at most about 150w — less than 1/10 of 1% of Watson — and the person just holds the buzzer and operates it with a muscular control system.

Each of the servers generates a maximum of 6,649 BTU/hour. Watson overall would generate about 600,000 BTU/hour and require massive amounts of air conditioning. I don’t know a good estimate on heat removal, but it would up Watson’s energy cost significantly.
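A quick arithmetic check on the figures above (all numbers taken from this comment; nothing new is assumed):

```python
# Sanity-check the Watson power and heat figures quoted above.
SERVERS = 90
WATTS_PER_SERVER = 1949          # max draw per Power 750 Express
BTU_PER_SERVER_PER_HOUR = 6649   # max heat output per server

total_watts = SERVERS * WATTS_PER_SERVER          # 175,410 W, i.e. ~175 kW
total_btu_per_hour = SERVERS * BTU_PER_SERVER_PER_HOUR  # 598,410, i.e. ~600,000

# Human standing still: ~150 W. Ratio to Watson:
HUMAN_WATTS = 150
ratio = HUMAN_WATTS / total_watts   # ~0.00086, under 1/10 of 1%
```

The quoted totals check out: 90 × 1,949 W ≈ 175 kW, 90 × 6,649 ≈ 600,000 BTU/hour, and 150 W is indeed less than 0.1% of Watson’s draw.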

I don’t mean to criticize Watson unduly; it certainly is an impressive engineering achievement and has generated a lot of good publicity and public interest in computing. The engineering feat is impressive if for no other reason than that it is the first accomplishment of this scale, and pioneering is always hard… future Watsons will be cheaper, faster, and more effective because of IBM’s great work on this.

But at the same time, the amazing power and storage costs for Watson really kind of water it down for me. I’m not surprised that if you throw power and hardware and memory at a problem, you can use rather straightforward machine learning methods to solve it. I feel similarly about Deep Blue and chess.

A Turing test that would be more impressive to me would be building something like Watson or Deep Blue that is not allowed to consume more power than an average human, and has comparable memory and speed. The reason this would be impressive is that in order to build it, you’d have to have some way of representing data and reasoning in the system that is efficient to a similar degree that human minds are. Since this is an important open problem with lots of implications, we should use funding and publicity to drive research organizations like IBM towards that goal. Maybe building Watson is a first step and now the task is to miniaturize Watson, and in doing so, we’ll be forced to learn about efficient brain architectures along the way.

18. sbanerjee says:

When considering the merits of pursuing philosophy, or of incorporating its insights into our pursuit of knowledge, a lot of reasonable arguments are made on both sides. Those arguing against the merits of philosophy say that it is sometimes unclear whether philosophy makes progress [does it actually reach any conclusions, or do philosophers just run in circles?], that there is a lot of disagreement amongst philosophers themselves, that there is a lot of focus and discussion on what philosophers have argued in the past, and of course, that there is the question of whether the questions asked by philosophy are even meaningful or answerable. In response to these arguments, we discussed in class a lot of counterarguments that support the study of philosophy. I want to focus on two specific counterarguments in this response essay, because I believe they lead to a counterargument that has not been discussed.

The specific counterarguments are that scientists constantly take philosophical positions [they just do not admit it, and thus philosophy is an integral part of science] and that philosophy is a pre-science or proto-science [until something actually has a field/science discussing and researching it, it is philosophy]. The interesting relationship between these and the objection that philosophy might consider questions that are not meaningful or answerable is that they present philosophy as a methodology for pursuing new knowledge. While we do not consider eastern philosophy in this class, I would like to comment on how eastern philosophers and theologians have treated philosophy as a methodology for pursuing new knowledge for many centuries now, and in a manner different from that of science. This is another argument for philosophy: it allows for different approaches in the search for knowledge. Its open-ended approach to problems that are not solvable through simple observation has proved very valuable as an alternate way of understanding the universe. Science is a method of understanding the world via observation, but when facing issues where observation is not possible [issues of cosmological scale], eastern schools of thought have used philosophy as the foundation from which to understand the world via introspection. Essentially, what I am trying to express is that philosophy is a general foundation that sparks the search for and pursuit of knowledge, and it has been used by scientists and theologians in very different ways but for the same purpose.

19. Miguel says:

Scientists and mathematicians take implicit philosophical positions all the time, yet they seldom receive training in how to explicitly state and reason about them. This dearth might seem inconsequential, given the explicitly algorithmic nature of (the ideal of) scientific work. However, more often than not, ‘abstract’ differences in philosophical views have a concrete effect on the aim and product of science (e.g. philosophy as a pre-science). This has practical consequences: in particular, one can often gain insight into problematic aspects of theoretical results by examining their underlying philosophical positions, especially their inner difficulties.

Consider, for instance, the seemingly arbitrary polynomial/exponential distinction. An empiricist would justify the distinction on the grounds of its practical validity; a formalist, on the other hand, would emphasize the theoretical niceness of its closure and robustness properties. One could ask, however, why out of other conceivable classes of comparable mathematical convenience, the polynomial/exponential distinction is the one singled out. This objection, however, can be anticipated by examining the philosophical difficulties inherent in formalism: namely, its failure to account for normative differences between equally ‘formal’ constructs. This is one example where understanding the implicit philosophical viewpoints helps clarify theoretical results.

20. cuellar says:

It is very promising to consider complexity theory from the point of view presented in chapter 3 of WPSCACC. As stated there, it occupies a privileged position between practicality and possibility (i.e., computability). It is thus the ideal tool for tackling problems that are broadly considered possible to solve but whose feasibility is hard to discuss. In fact, when talking about complexity theory, the word “feasible” takes on a whole new meaning: polynomial-time computable. A good example of the utility of complexity theory is the discussion about artificial intelligence. While we know a machine that passes the Turing test can be constructed (e.g., with the lookup table), we also know such trivial constructions would be highly impractical, and, obviously, to this date we don’t know of any other practical way to solve the problem. This is where complexity theory becomes very useful, by considering the more lax definition of feasibility. While a lookup-table robot is useless in the search for AI, a polynomial-time robot capable of passing the Turing test is much more promising.

I feel the need to point out an obvious problem at this point. It seems natural to think that a program capable of simulating the human mind would be of very high complexity and, even if it runs in polynomial time, might quickly require more memory than is available in the observable universe. Sure, polynomial programs have generally turned out to be efficient, but it seems to me that this is not necessarily the case for the harder problems out there. Moreover, that might even be the reason why we haven’t been able to solve those problems in the first place. It’s not hard to see that there ARE polynomial-time problems that, with reasonable inputs, are not physically possible to compute. I can’t think of any ‘natural’ problem that lies in such a category, but if I had to bet I would say AI could be one of them.

There are, of course, some points in favour of complexity theory. In the past, problems that seemed extremely hard have been proven to be polynomial-time computable and, moreover, easy to compute in practice. Similarly, other seemingly ‘simple’ problems, such as integer factoring, are not known to be polynomial-time computable and are very hard in general. But what I’m arguing above is that, as our understanding of computability advances, we will eventually find natural problems that are polynomial-time computable but not in any practical sense. It might be the problems we consider today, or something we will be interested in many years from now, but it is worth being aware of the possibility.
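The worry about polynomial-but-infeasible problems can be made concrete with a toy calculation (the degree 100 and the ~10^80 atom estimate are illustrative choices, not from the comment):

```python
# An algorithm running in time n^100 is polynomial, yet already at
# n = 10 it needs 10^100 steps -- far more than the roughly 10^80
# atoms commonly estimated to exist in the observable universe.

def steps_needed(n, degree=100):
    """Step count of a hypothetical degree-100 polynomial algorithm."""
    return n ** degree

ATOMS_IN_OBSERVABLE_UNIVERSE = 10 ** 80  # common order-of-magnitude estimate

infeasible = steps_needed(10) > ATOMS_IN_OBSERVABLE_UNIVERSE  # True
```

So “polynomial-time” and “practically feasible” come apart as soon as the degree (or the hidden constant) is large, which is exactly the gap the comment is pointing at.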

• Scott says:

(1) We don’t know today that factoring is not in P — that’s “merely” a widely-believed conjecture (and indeed, if P=NP then it would be false).

(2) The argument that most AI proponents would give for why the Turing Test should not only be passable in polynomial time, but passable with “reasonable” computational resources (i.e., resources that would fit in the observable universe), is that the BRAIN ITSELF manages to pass the test using what seems like reasonable computational resources! (By many estimates, digital computers have already surpassed the brain in raw computing power, or at any rate will do so soon.) You might disagree with that argument, but I think you at least need to address it.

21. Remote_Follower_2 says:

Hi Scott,

This is a question about Neha’s Comment. Of course, you are not obliged to answer (but I would like an answer from someone). You mentioned that

an immediate problem is that it’s undecidable whether a given program belongs to L or not

and then you say in your demonstration of L’s failure to capture every computable function the following

suppose we also demand that there exist an algorithm to decide whether or not a given program belongs to L

I just wanted to clarify: such an algorithm does not exist, since the membership problem for L is undecidable, and what you are doing is just making a “bold” assumption and seeing where it leads. If I got the preceding statement right (and I do not think I did), then it seems you are saying that even with this “power-up”, the language L is still powerless to capture all the computable functions.

Could you (or anyone else) clarify what all I messed up above?

Disclaimer : I too, am a remote follower.

22. nemion says:

The first lecture was interesting because of the discussion about the utility of studying philosophy. As a scientist, the answer I prefer to that dilemma is that the study of philosophy extends the boundaries of our experimental grounds by allowing us to perform thought experiments that would otherwise be impossible. As pointed out in class, this has produced amazing new revolutions in science, whenever a field that starts as a philosophical one branches into the scientific arena.

Furthermore, I found our discussion about computability extremely interesting. Is the way we define computability the right one? The thought that the universe with which an observer can interact contains bounded information is also extremely powerful. I was not aware of this principle of the finiteness of information in a closed volume, nor of the finiteness of the piece of the universe we can interact with. Both are amazingly interesting topics.

I can think of a myriad of interesting related questions: boundedness of knowledge? is the universe a computational machine? is it a deterministic one? does there exist randomness? from an insider’s perspective? from an outsider’s perspective? what type of computational machines are we? what type of computational machine is our brain?