| Navigation |
| The Institute
|
| People
|
| Activities
|
| Laboratories
|
| Publications
|
| Positions
|
| Grad Program - New!
|
|
|
 |
| Street Address |
475 Wes Graham Way
Waterloo, ON
|
|
| Contact |
200 University Ave. W.
Waterloo, ON N2L 3G1
P: +1 (519) 888-4021
F: +1 (519) 888-7610
|
|
|
|
|
Quantum Computing Since Democritus (fall 2006)
|
|
| Code: PHYS771 |
Semester/Year Offered:
Fall 2006 |
| Instructors:Dr. Scott Aaronson |
| Location: TBA |
Time: |
| Website: http://www.scottaaronson.com/democritus/ |
Calendar Description:
This course tries to connect quantum computing to the wider intellectual world. We'll start out with various scientific, mathematical, or philosophical problems that predate quantum computing: for example, the measurement problem, P versus NP, the existence of secure cryptography, the Humean problem of induction, or the possibility of closed timelike curves. We'll then examine in what ways, if any, quantum computing affects how we should think about the problem. To keep things grounded, each session will end with a concrete puzzle that students will be expected to have thought about (if not solved) by the next session. The class format will strongly encourage participation, discussion, and debate. |
Lectures:
| Topic |
Hours |
Notes |
Atoms and the VoidAdministrivia
Problem sets. We'll have a few problem sets -- they're useful for you, and also they give me feedback about how much you're understanding. But here's the rule: you won't have to solve all the problems. You're allowed to write down, "yeah, Problem 4's a real stumper. I thought about it, I don't know the answer. Here are some easier problems that I can solve." I'll evaluate that the same way I'd evaluate a research paper that said a similar thing. On the other hand, if you have no idea how to solve the problem but you pretend to know -- for example, if you write gibberish that goes on and on and on, hoping that something vaguely resembling an answer will be buried in there just by chance -- that counts negatively. You'd do much better by leaving the question blank.
Grades. I should tell you that I hate grades, I hate GPA's. I'll tell you why. Some people think grades are a bad idea because they reduce a complicated human being to a number. That's not what I think. I think, if you're going to reduce a human being to a number, at least do it in a statistically sensible way! "This person scores one standard deviation above the mean on this task within this population." An A-: what does that mean? It means the teacher decided to you give you an A-. Because maybe he said, well, this student got a B+ according to my arbitrary made-up algorithm that's different from everyone else's, but B+ is really on the borderline of A-, so I'll make it an A-. And these are math PhD's. And they don't notice the problem of infinite regress. And of course, for a less likable student, the B+ gets just as arbitrarily downgraded to a B.
All the same, I suppose I have to give you grades. So what can I say? If you show up, participate, do the scribe notes, do the problem sets, you'll do fine. If you want an A or A+, you might have to argue and disagree with me. (I mean, intelligently disagree with me.)
Alright. So why Democritus? First of all, who was Democritus? He was this ancient Greek dude. He was born around 450BC in Abdera, which was sort of this podunk town. where people from Athens said that even the air causes stupidity. He was a disciple of Leucippus, according to my source, which is Wikipedia. He's called a "pre-Socratic," even though actually he was a contemporary of Socrates. That gives you a sense of how important he's considered: "Yeah, the pre-Socratics -- maybe stick 'em in somewhere in the first week of class." (Incidentally, there's a story that Democritus journeyed to Athens to meet Socrates, but then was too shy to introduce himself.)
Almost none of Democritus's writings survive. (Some of them apparently survived into the Middle Ages, but they're lost now.) What we know about him is mostly due to the fact that other philosophers, like Aristotle, brought him up in order to criticize him.
So, what were the ideas they criticized? Democritus thought the whole universe is composed of atoms in a void, constantly moving around according to determinate, understandable laws. These atoms can hit each other and bounce off, and they can stick together to make bigger things. They can have different sizes, weights, and shapes -- maybe some are spheres, some are cylinders, whatever. On the other hand, Democritus says that properties like color and taste are not intrinsic to atoms, but instead emerge out of the interactions of many atoms. For if the atoms that made up the ocean were "intrinsically blue," then how could they form the white froth on waves?
Remember, this is 400BC. So far we're batting pretty well.
Why does Democritus think there are these atoms surrounded by void? He gives a few arguments, one of which can be paraphrased as follows (following Carl Sagan). Suppose we have an apple, and suppose the apple's not made of atoms but is instead this continuous, hard stuff. And suppose we take a knife and cut the apple into two pieces. It's clear that the points on one side go into the first piece and the points on the other side go into the second piece, but what about the points exactly on the boundary? Do they "disappear"? Do they get duplicated? Does the symmetry get broken? None of these possibilities seem particularly elegant.
Incidentally, some of you might know that there's a debate raging even today between atomists and anti-atomists. This time, the headquarters of the atomist side aren't in Abdera; they're a mile down the train tracks from here, in a certain sleek black building. At issue in this debate is whether space and time themselves are made up of indivisible atoms, at the Planck scale of 10-33 centimeters or 10-43 seconds. Ironically, the physicists have almost no experimental evidence to go on, and are basically in the same situation that Democritus was in 2400 years ago. If you want an ignorant, uninformed layperson's opinion, my money is on the atomist side. And the arguments I'd use are not entirely different from the ones Democritus used: mostly they hinge on inherent mathematical difficulties with the continuum.
One passage of Democritus that does survive is a dialogue between the intellect and the senses. The intellect starts out, saying: "By convention there is sweetness, by convention bitterness, by convention color, in reality only atoms and the void." In my book, this one line already puts Democritus shoulder-to-shoulder with Plato, Aristotle, or any other ancient philosopher you care to name. But the dialogue doesn't stop there. The senses respond, saying: "Foolish intellect! Do you seek to overthrow us, while it is from us that you take your evidence?"
I first came across this dialogue in a book by Schrödinger. Ah, Schrödinger! -- you see we're inching toward the "quantum computing" in the course title. We're gonna get there, don't worry about that.
But why would Schrödinger be interested in this dialogue? Well, Schrödinger was interested in a lot of things. He was not an intellectual monogamist (or really any kind of monogamist). But one reason he might've been interested is a certain equation he was involved with, which you've probably heard about:
i dψ/dt = Hψ
(Did I get it right, Ray?)
Actually, let me write it in a more correct form:
|ψt+1> = U |ψt>
What is this equation? Well, maybe you have to add a few details to it -- like the physics -- but once you do, it describes the evolution of a quantum pure state. For any isolated region of the universe that you want to consider, this equation describes the evolution in time of the state of that region, which we represent as a normalized linear combination -- a superposition -- of all the possible configurations of elementary particles in that region. So you can think of this equation as the sophisticated, modern version of Democritus's "atoms and the void." And as we all know, it does pretty well at the atoms and the void part.
The part where it maybe doesn't do so well is the "from us you take your evidence" part. Where's the "us"? Remember, the equation describes a superposition over all possible configurations of particles. So, I don't know -- are you in superposition? I don't feel like I am!
Incidentally, one thing I'm not going to do in this class is try to sell you on some favorite interpretation of quantum mechanics. You're free to believe any interpretation your conscience dictates. (What's my own view? Well, I agree with every interpretation to the extent it says there's a problem, and disagree with every interpretation to the extent it claims to have solved the problem!)
Anyway, just like we can classify religions as monotheistic and polytheistic, we can classify interpretations of quantum mechanics by where they come down on the "putting-yourself-in-coherent-superposition" issue. On the one side, we've got the interpretations that enthusiastically sweep the issue under the rug: Copenhagen and its Bayesian and epistemic grandchildren. In these interpretations, you've got your quantum system, you've got your measuring device, and there's a line between them. Sure, the line can shift from one experiment to the next, but for any given experiment, it's gotta be somewhere. In principle, you can even imagine putting other people on the quantum side, but you yourself are always on the classical side. Why? Because a quantum state is just a representation of your knowledge -- and you, by definition, are a classical being.
But what if you want to apply quantum mechanics to the whole universe, including yourself? The answer, in the epistemic-type interpretations, is simply that you don't ask that sort of question! Incidentally, that was Bohr's all-time favorite philosophical move, his WWF piledrive: "You're not allowed to ask such a question!"
On the other side we've got the interpretations that do try in different ways to make sense of putting yourself in superposition: many-worlds, Bohmian mechanics, etc.
Now, to hardheaded problem-solvers like ourselves, this might seem like a big dispute over words -- why bother? I actually agree with that: if it were just a dispute over words, then we shouldn't bother! But as David Deutsch pointed out in the late 1970's, we can conceive of experiments that would differentiate the first type of interpretation from the second type. The simplest experiment would just be to put yourself in coherent superposition and see what happens! Or if that's too dangerous, put someone else in coherent superposition. The point being that, if human beings were regularly being put into superposition, then the whole business of drawing a line between "classical observers" and the rest of the universe would become untenable.
But alright -- human brains are wet, goopy, sloppy things, and maybe we won't be able to maintain them in coherent superposition for 500 million years. So what's the next best thing? Well, we could try to put a computer in superposition. The more sophisticated the computer was -- the more it resembled something like a brain, like ourselves -- the further up we would have pushed the 'line' between quantum and classical. You can see how it's only a miniscule step from here to quantum computing.
I'd like to draw a more general lesson here. What's the point of talking about philosophical questions? Because we're going to be doing a fair bit of it in this class -- I mean, of philosophical bullshitting. Well, there's a standard answer, and that's that philosophy is an intellectual clean-up job -- the janitors who come in after the scientists have made a mess, to try and pick up the pieces. So on this view, philosophers should sit in their armchairs waiting for something surprising to happen in science -- like quantum mechanics, like the Bell inequality, like Gödel's Theorem -- and then swoop in like vultures and say, ah, this is what it really meant.
Well, on its face, that seems sort of boring. But as you get more accustomed to this sort of work, I think what you'll find is ... it's still boring!
Like most of you, I'm interested in results -- in finding solutions to nontrivial, well-defined open problems. So, what's the role of philosophy in that? I want to suggest a more exalted role than intellectual janitor: philosophy can be a scout. It can be an explorer -- mapping out intellectual terrain for science to later move in on, and build condominiums on or whatever. Not every branch of science was "scouted out ahead of time" by philosophy, but some of them were. And in recent history, I think quantum computing is really the poster child here. It's fine to tell people to "Shut up and calculate," but the question is, what should they calculate? At least here at IQC, the sorts of things we like to calculate -- capacities of quantum channels, error probabilities of quantum algorithms, etc. -- are things people would never have thought to calculate if not for philosophy.
Alright, I promised you a puzzle. Earlier I mentioned inherent mathematical difficulties with the continuum, so I've got a puzzle somewhat related to that. If it's too easy, let me know and I'll give you a harder one.
You know the real line, right? Suppose we want a union of open intervals that covers every rational point. Question: does the sum of the lengths of the intervals have to be infinite? One would certainly think so! After all, there are rational numbers pretty much everywhere!
[Richard Cleve immediately solves the puzzle.]
Alright, I guess that was too easy.
[Solution: Not only can the sum of the lengths of the intervals be finite, it can be arbitrarily close to zero! Simply enumerate the rational numbers, r0, r1, etc. Then put an interval of size ε/2i around ri for every i.]
Here's a harder one: we have the unit square, [0,1]2. Consider a function S, which maps every real number x in [0,1] to a countable subset S(x) of [0,1]. Can we choose S so that, for every (x,y) in [0,1]2, either y is in S(x) or x is in S(y)? | 1.5 | | SetsPHYS771 Lecture 2: Sets
Thursday's class started out with a brief presentation by Rahul Jain about atomist ideas in Jainism (circa 500BC). It seems the Jain (the ancient ones, not Rahul) were barking up more or less the same tree as Democritus, but their ideas (like many of the pre-Socratics') were mixed with generous helpings of mysticism.
I mentioned another example of East-West convergence: apparently, several ancient cultures independently came up with the same proof that A = π r2. It's obvious that the area of a circle should go like the radius2; the question is why the constant of proportionality (π) should be the same one that relates circumference to diameter. Proof by pizza: Cut a circle of radius r into thin pizza slices, and then "Sicilianize" (i.e. stack the slices into a rectangle of height r and length π r). QED.
One thing I forgot to share with you on Tuesday was a quote from Democritus:
"I would rather discover a single cause than become king of the Persians."
Something to keep in mind when you consider those job offers from Microsoft or Google...
Today we're gonna talk about sets. What will these sets contain? Other sets! Like a bunch of cardboard boxes that you open only to find more cardboard boxes, and so on all the way down.
You might ask, "how is this relevant to a class on quantum computing?" I can give three answers:
1. When I gave that puzzle on Tuesday (which by the way, we're going to "answer" today), some of you asked what "countable" means. OK, dude. Math is the foundation of all human thought, and set theory -- countable, uncountable, etc. -- that's the foundation of math. So even if this class was about Sanskrit literature, it should still probably start with set theory.
2. I have a hidden agenda: I'm told we have some physicists here, and I intend to browbeat you into thinking like mathematicians. I mean, what you do in the lab is your own business, but now you're in theorem country.
3. There actually is a tenuous connection between quantum computing and set theory, which I'll touch on in the next lecture. To give a sneak preview, the connection is that quantum mechanics applied to finite-dimensional systems (like qubits) seems like an interesting "intermediate" case between a continuous and a discrete theory. That is, it involves quantities (the amplitudes) that vary continuously, but that are not directly observable. In this way, it seems to avoid the "paradoxes" associated with the continuum in a way that other continuous physical theories do not. But what are those paradoxes? Well, welcome to my haunted house horror tour of the continuous and the transfinite...
So let's start with the empty set and see how far we get.
THE EMPTY SET.
Any questions so far?
Actually, before we talk about sets, we need a language for talking about sets. The language that Frege, Russell, and others developed is called first-order logic. It includes Boolean connectives (and, or, not), the equals sign, parentheses, variables, predicates, quantifiers ("there exists" and "for all") -- and that's about it. So for example, here are the Peano axioms for the nonnegative integers (where S(x) is the successor function, intuitively S(x)=x+1, and I'm assuming functions have already been defined):
* Zero Exists: There exists a z such that for all x, S(x) is not equal to z.
* Every Integer Has At Most One Predecessor: If S(x)=S(y) then x=y.
The nonnegative integers themselves are called a model for the axioms (though interestingly, they're not the only model).
Writing down these axioms seems like pointless hairsplitting -- and indeed, as someone pointed out in class, there's an obvious chicken-and-egg problem. How can we state axioms that will "put the integers on a more secure foundation," when the very symbols and so on that we're using to write down the axioms presuppose that we already know what the integers are? Well, precisely because of this point, I don't think that axioms and formal logic can be used to "place arithmetic on a more secure foundation" (whatever that would mean). But this stuff is still extremely interesting for at least three reasons:
1. The situation will change once we start talking not about integers, but about different sizes of infinity. There, writing down axioms and working out their consequences is pretty much all we have to go on!
2. Once we've formalized everything, we can then program a computer to reason for us:
* Premise 1: For all x, if A(x) is true then B(x) is true.
* Premise 2: There exists an x such that A(x) is true.
* Conclusion: There exists an x such that B(x) is true.
Well, you get the idea. The point is that deriving the conclusion from the premises is purely a syntactic operation -- one that doesn't require any understanding of what the statements mean.
3. Besides having a computer find proofs for us, we can also treat proofs themselves as mathematical objects, which opens the way to metamathematics.
Anyway, enough pussyfooting around. Let's see some axioms for set theory. (I'll state the axioms in English; converting them to first-order logic is left as an "exercise for the reader.")
* Empty Set: There exists an empty set.
* Extensionality: If two sets have the same members then they're equal.
* Pairing: For all sets x,y there exists a set {x,y}.
* Union: For all sets x, there exists a set equal to the union of all sets in x.
* Existence of Infinite Sets: There exists a set x that contains the empty set and that contains y union {y} for every y in x.
* Power Set: For all sets x there exists a set consisting of the subsets of x.
* Replacement (for every function A): For all sets x, there exists a set {A(y) | y in x}.
* Foundation: All nonempty sets x have a member y such that for all z, either z is not in x or z is not in y. (This is a technical axiom, whose point is to rule out sets like {{{{...}}}}.)
These axioms -- called the Zermelo-Fraenkel axioms -- are the foundation for basically all of math. So I thought you should see them at least once in your life.
Alright, one of the most basic questions we can ask about a set is, how big is it? What's its size, its cardinality? You might say, just count how many elements it has. But what if there are infinitely many? Are there more integers than even integers? This brings us to Georg Cantor (1845-1918), and the first of his several enormous contributions to human knowledge. He says, two sets have the same cardinality if and only if their elements can be put in one-to-one correspondence. Period. And if, whenever you try to pair off the elements, one set always has elements left over, the set with the elements left over is the bigger set.
What possible cardinalities are there? Of course there are finite ones, and then there's the first infinite cardinality, the cardinality of the integers, which Cantor called ("Aleph-Zero"). The rational numbers have the same cardinality , a fact that's also expressed by saying that the rational numbers are "countable" (i.e., can be placed in one-to-one correspondence with the integers).
What's the proof that the rational numbers are countable? You haven't seen it before? Oh, alright. First list all the rational numbers where the sum of the numerator and the denominator is 2. Then list all the rational numbers where the sum of the numerator and the denominator is 3. And so on. It's clear that every rational number will eventually appear in this list. Hence there's only a countable infinity of them. QED.
But Cantor's biggest contribution was to show that not every infinity is countable -- so for example, the infinity of real numbers is greater than the infinity of integers. More generally, just as there are infinitely many numbers, there are also infinitely many infinities.
You haven't seen the proof of that either? Alright, alright. Let's say you have an infinite set A. We'll show how to produce another infinite set, B, which is even bigger than A. This B will simply be the set of all subsets of A (which is guaranteed to exist by the Zermelo-Fraenkel axioms). How do we know B is bigger than A? Well, suppose we could pair off every element a in A with an element f(a) in B, in such a way that no elements of B were left over. Then we could define a new subset S of A, consisting of every a that's not contained in f(a). Notice that this S can't have been paired off with any a in A -- since otherwise, a would be contained in f(a) if and only if it wasn't contained in f(a), contradiction. Therefore B is larger than A, and we've ended up with a bigger infinity than the one we started with.
This is certainly one of the four or five greatest proofs in all of math -- again, good to see at least once in your life.
Besides cardinal numbers, it's also useful to discuss ordinal numbers. Rather than defining these, it's easier to just illustrate them. We start with the natural numbers:
1, 2, 3, ...
Then we say, let's define something that's greater than every natural number:
ω
What comes after omega?
ω+1, ω+2, ...
Now, what comes after all of these?
2ω
Alright, we get the idea:
3ω, 4ω, ...
Alright, we get the idea:
ω2, ω3, ...
Alright, we get the idea:
ωω, ωω^ω, ...
We could go on for quite a while! Basically, for any set of ordinal numbers (finite or infinite), we stipulate that there's a first ordinal number that comes after everything in that set.
The set of ordinal numbers has the important property of being well-ordered, which means that every subset has a minimum element.
Now, here's something interesting. All of the ordinal numbers I've listed have a special property, which is that they have at most countably many predecessors (i.e., at most of them). What if we consider the set of all ordinals with at most countably many predecessors? Well, that set also has a successor, call it α. But does α itself have predecessors? Certainly not, since otherwise α wouldn't be the successor to the set; it would be in the set! The set of predecessors of α has the next possible cardinality, which is called .
What this sort of argument proves is that the set of cardinalities is itself well-ordered. After the infinity of the integers, there's a "next bigger infinity," and a "next bigger infinity after that," and so on. You never see an infinite decreasing sequence of infinities, as you do with the real numbers.
So, starting from (the cardinality of the integers), we've seen two different ways to produce "bigger infinities than infinity." One of those ways yields the cardinality of sets of integers (or equivalently, the cardinality of real numbers), which we denote 2. The other way yields . Is 2 equal to ? Or to put it another way: is there any infinity of intermediate size between the infinity of the integers and the infinity of the reals?
(Note: No sooner had I revealed that there were more reals than integers, than a student actually asked this. He claimed never to have heard of the question before; he thought he was just asking for a technical clarification.)
Well, the question of whether there are any "intermediate" infinities between the integers and the reals was David Hilbert's first problem in his famous 1900 address. It stood as one of the great math problems for over half a century, until it was finally "solved" (in a rather disappointing way, as you'll see).
Cantor himself believed there were no intermediate infinities, and called this conjecture the Continuum Hypothesis. Cantor was extremely frustrated with himself for not being able to prove it.
Besides the Continuum Hypothesis, there's another statement about these infinite sets that no one could prove or disprove from the Zermelo-Fraenkel axioms. This statement is the infamous Axiom of Choice. It says that, if you have a (possibly infinite) set of sets, then it's possible to form a new set by choosing one item from each set. Sound reasonable? Well, if you accept it, you also have to accept that there's a way to cut a solid sphere into a finite number of pieces, and then rearrange those pieces into another solid sphere a thousand times its size. (That's the "Banach-Tarski paradox." Admittedly, the "pieces" are a bit hard to cut out with a knife...)
Why does the Axiom of Choice have such dramatic consequences? Basically, because it asserts that certain sets exist, but without giving any rule for forming those sets. As Bertrand Russell put it: "To choose one sock from each of infinitely many pairs of socks requires the Axiom of Choice, but for shoes the Axiom is not needed." (What's the difference?)
The Axiom of Choice turns out to be equivalent to the statement that every set can be well-ordered: in other words, the elements of any set can be paired off with the ordinals 1, 2, ..., ω, ω+1, ... 2ω, 3ω, ... If you think (for example) about the set of real numbers, this seems far from obvious.
It's easy to see that well-ordering implies the Axiom of Choice: just well-order the whole infinity of socks, then choose the sock from each pair that comes first in the ordering.
Do you want to see the other direction: why the Axiom of Choice implies that every set can be well-ordered? Yes?
OK! We have a set A that we want to well-order. For every proper subset B of A, we'll use the Axiom of Choice to pick an element f(B) in A\B. Now we can start well-ordering A, as follows: first let s0 = f({}), then let s1 = f({s0}), s2 = f({s0,s1}), and so on.
Can this process go on forever? No, it can't. For if it did, then by a process of "transfinite induction," we could stuff arbitrarily large infinite cardinalities into A. And while admittedly A is infinite, it has at most a fixed infinite size! So the process has to stop somewhere. But where? At a proper subset B of A? No, it can't do that either -- since if it did, then we'd just continue the process by adding f(B). So the only place it can stop is A itself. Therefore A can be well-ordered.
OK, should we come back to the puzzle from Tuesday? We have a box, [0,1]2. To each real number x in [0,1], we associate a countable subset S(x) of [0,1]. Now, can we choose S in such a way that for every (x,y) pair, either y is in S(x) or x is in S(y)? What do you think?
I'll give you two answers: that it isn't possible, and that it is possible. Which answer do you want to see first?
Alright, we'll start with why it isn't possible. For this I'll assume that the Continuum Hypothesis is false. Then there's some subset A of [0,1] that has cardinality . Let B be the union of S(x) over all x in A. Then B also has cardinality . So, since we assumed that is less than 2, there must be some y not in B. Now observe that there are real numbers x in A, but none of them satisfy y in S(x), and only < of them can satisfy x in S(y).
Now let's see why it is possible. For this I want to assume both the Axiom of Choice and the Continuum Hypothesis. By the Axiom of Choice, we can well-order the real numbers in [0,1]. Also, by the Continuum Hypothesis, there are only of these real numbers -- which means that in our well-ordering, every number has at most predecessors. So put y in S(x) if and only if y<=x, where <= means with respect to the well-ordering (not the usual ordering on real numbers). Then for every (x,y), clearly either y is in S(x) or x is in S(y).
Today's puzzle is about the power of self-esteem and positive thinking. Is there any theorem that you can only prove by assuming as an axiom that the theorem can be proved?
| 1.5 | | Gödel, Turing, and FriendsOn Thursday, I probably should've told you explicitly that I was compressing a whole math course into one lecture. On the one hand, that means I don't really expect you to have understood everything. On the other hand, to the extent you did understand -- hey! You got a whole math course in one lecture! You're welcome.
But I do realize that in the last lecture, I went too fast in some places. In particular, I wrote an example of logical inference on the board. The example was, if all A's are B's, and there is an A, then there is a B. I'm told that the physicists were having trouble with that?
Hey, I'm just ribbin' ya. If you haven't seen this way of thinking before, then you haven't seen it. But maybe, for the benefit of the physicists, we should go over the basic rules of logic?
* Propositional Tautologies: A or not A, not(A and not A), etc. are valid.
* Modus Ponens: If A is valid and A implies B is valid then B is valid.
* Equality Rules: x=x, x=y implies y=x, x=y and y=z implies x=z, and x=y implies f(x)=f(y) are all valid.
* Change of Variables: Changing variable names leaves a statement valid.
* Quantifier Elimination: If For all x, A(x) is valid, then A(y) is valid for any y.
* Quantifier Addition: If A(y) is valid where y is an unrestricted variable, then For all x, A(x) is valid.
* Quantifier Rules: If Not(For all x, A(x)) is valid, then There exists an x such that Not(A(x)) is valid. Etc.
There's an amazing result called Gödel's Completeness Theorem, which says that these rules are all you ever need. In other words: if, starting from some set of axioms, you can't derive a contradiction using these rules, then the axioms must have a model (i.e., they must be consistent). Conversely, if the axioms are inconsistent, then the inconsistency can be proven using these rules alone.
Think about what that means. It means that Fermat's Last Theorem, the Poincaré Conjecture, or any other mathematical achievement you care to name can be proved by starting from the axioms for set theory, and then applying these piddling little rules over and over again. Probably 300 million times, but still...
(How does Gödel prove the Completeness Theorem? The proof has been described as "extracting semantics from syntax." We simply "cook up objects to order" as the axioms request them! And if we ever run into an inconsistency, that can only be because there was an inconsistency in the original axioms.)
One immediate consequence of the Completeness Theorem is the Löwenheim-Skolem Theorem: every set of axioms has a model of at most countable cardinality. (Note: One of the best predictors of success in mathematical logic is having an umlaut in your name.) Why? Because the process of "cooking up objects to order as the axioms request them" can only go on for a countably infinite number of steps!
It's a shame that, after proving his Completeness Theorem, Gödel never really did anything else of note. [Pause for comic effect] Well, alright, I guess a year later he proved the Incompleteness Theorem. See, the Completeness Theorem was his Master's thesis, and the Incompleteness Theorem was his PhD thesis. Apparently, one of his PhD examiners didn't want to give him a degree because the PhD thesis was "too similar to the Master's thesis."
The Incompleteness Theorem says that, given any consistent, computable set of axioms, there's a true statement about the integers that can never be proved from those axioms. Here consistent means that you can't derive a contradiction, while computable means that either there are finitely many axioms, or else if there are infinitely many, at least there's an algorithm to generate all the axioms.
(If we didn't have the computability requirement, then we could simply take our "axioms" to consist of all true statements about the integers! In practice, that isn't a very useful set of axioms.)
But wait! Doesn't the Incompleteness Theorem contradict the Completeness Theorem, which says that any statement that's entailed by the axioms can be proved from the axioms? Hold that question; we're gonna clear it up later.
First, though, let's see how the Incompleteness Theorem is proved. People always say, "the proof of the Incompleteness Theorem was a technical tour de force, it took 30 pages, it requires an elaborate construction involving prime numbers," etc. Unbelievably, 80 years after Gödel, that's still how the proof is presented in math classes!
Alright, should I let you in on a secret? The proof of the Incompleteness Theorem is about two lines. The caveat is that, to give the two-line proof, you first need the concept of a computer.
When I was in junior high, I had a friend who was really good at math, but maybe not so good at programming. He wanted to write a program using arrays, but he didn't know what an array was. So what did he do? He associated each element of the array with a unique prime number, then he multiplied them all together; then, whenever he wanted to read something out of the array, he factored the product. (If he was programming a quantum computer, maybe that wouldn't be quite so bad!) Anyway, what my friend did, that's basically what Gödel did. He made up an elaborate hack in order to program without programming.
Turing Machines
OK, time to bring Mr. T. on the scene. How many of you have seen Turing machines before? About three-quarters of you? I'll go pretty quickly then.
In 1936, the word "computer" meant a person (usually a woman) whose job was to compute with pencil and paper. Turing wanted to show that, in principle, such a "computer" could be simulated by a machine. What would the machine look like? Well, it would have to able to write down its calculations somewhere. Since we don't really care about handwriting, font size, etc., it's easiest to imagine that the calculations are written on a sheet of paper divided into squares, with one symbol per square, and a finite number of possible symbols. Traditionally paper has two dimensions, but without loss of generality we can imagine a long, one-dimensional paper tape. How long? For the time being, we'll assume as long as we need.
What can the machine do? Well, clearly it has to be able to read symbols off the tape and modify them based on what it reads. We'll assume for simplicity that the machine reads only one symbol at a time. But in that case, it had better be able to move back and forth on the tape. It would also be nice if, once it's computed an answer, the machine can halt! But at any time, how does the machine decide which things to do? According to Turing, this decision should depend only on two pieces of information: (1) the symbol currently being read, and (2) the machine's current "internal configuration" or "state." Based on its internal state and the symbol currently being read, the machine should (1) write a new symbol in the current square, (2) move backwards or forwards one square, and (3) switch to a new state or halt.
Finally, since we want this machine to be physically realizable, the number of possible internal states should be finite. These are the only requirements.
Turing's first result is the existence of a "universal" machine: a machine whose job is to simulate any other machine described via symbols on the tape. In other words, universal programmable computers can exist. You don't have to build one machine for email, another for playing DVD's, another for Tomb Raider, and so on: you can build a single machine that simulates any of the other machines, by running different programs stored in memory. This result is actually a lemma, which Turing uses to prove his "real" result.
So what's the real result? It's that there's a basic problem, called the halting problem, that no program can ever solve. The halting problem is this: we're given a program, and we want to decide if it ever halts. Of course we can run the program for a while, but what if the program hasn't halted after a million years? At what point should we give up?
One piece of evidence that this problem might be hard is that, if we could solve it, then we could also solve many famous unsolved math problems. For example, Goldbach's Conjecture says that every even number 4 or greater can be written as a sum of two primes. Now, we can easily write a program that tests 4, 6, 8, and so on, halting only if it finds a number that can't be written as a sum of two primes. Then deciding whether that program ever halts is equivalent to deciding the truth of Goldbach's Conjecture.
But can we prove there's no program to solve the halting problem? This is what Turing does. His key idea is not even to try to analyze the internal dynamics of such a program, supposing it existed. Instead he simply says, suppose by way of contradiction that such a program P exists. Then we can modify P to produce a new program P' that does the following. Given another program Q as input, P'
1. runs forever if Q halts given its own code as input, or
2. halts if Q runs forever given its own code as input.
Now we just feed P' its own code as input. By the conditions above, P' will run forever if it halts, or halt if it runs forever. Therefore P' -- and by implication P -- can't have existed in the first place.
As I said, once you have Turing's results, Gödel's results fall out for free as a bonus. Why? Well, suppose the Incompleteness Theorem was false -- that is, there existed a consistent, computable proof system F from which any statement about integers could be either proved or disproved. Then given a computer program, we could simply search through every possible proof in F, until we found either a proof that the program halts or a proof that it doesn't halt. (This is possible because the statement that a particular computer program halts is ultimately just a statement about integers.) But this would give us an algorithm to solve the halting problem, which we already know is impossible. Therefore F can't exist.
By thinking more carefully, we can actually squeeze out a stronger result. Let P be a program that, given as input another program Q, tries to decide whether Q halts by the strategy above (i.e., searching through every possible proof and disproof that Q halts in some formal system F). Then as in Turing's proof, suppose we modify P to produce a new program P' that
1. runs forever if Q is proved to halt given its own code as input, or
2. halts if Q is proved to run forever given its own code as input.
Now suppose we feed P' its own code as input. Then we know that P' will run forever, without ever discovering a proof or disproof that it halts. For P' finds a proof that it halts, then it will run forever, and if it finds a proof that it runs forever, then it will halt, which is a contradiction.
But there's an obvious paradox: why isn't the above argument, itself, a proof that P' will run forever given its own code as input? And why won't P' discover this proof that it runs forever -- and therefore halt, and therefore run forever, and therefore halt, etc.?
The answer is that, in "proving" that P' runs forever, we made a hidden assumption: namely that the proof system F is consistent. If F was inconsistent, then there could perfectly well be a proof that P' halts, even if the reality was that P' ran forever.
But this means that, if F could prove that F was consistent, then F could also prove that P' ran forever -- thereby bringing back the above contradiction. The only possible conclusion is that if F is consistent, then F can't prove its own consistency. This result is sometimes called Gödel's Second Incompleteness Theorem.
The Second Incompleteness Theorem establishes what we maybe should have expected all along: that the only mathematical theories pompous enough to prove their own consistency, are the ones that don't have any consistency to brag about! If we want to prove that a theory F is consistent, then we can only do it within a more powerful theory -- a trivial example being F+Con(F) (F plus the axiom that F is consistent). But then how do we know that F+Con(F) is itself consistent? Well, we can only prove that in a still stronger theory: F+Con(F)+Con(F+Con(F)) )(F+Con(F) plus the axiom that F+Con(F) is consistent). And so on infinitely. (Indeed, even beyond infinitely, into the countable ordinals.)
To take a concrete example: the Second Incompleteness Theorem tells us that the most popular axiom system for the integers, Peano Arithmetic, can't prove its own consistency. Or in symbols, PA can't prove Con(PA). If we want to prove Con(PA), then we need to move to a stronger axiom system, such as ZF (the Zermelo-Fraenkel axioms for set theory). In ZF we can prove Con(PA) pretty easily, by using the Axiom of Infinity to conjure up an infinite set that then serves as a model for PA.
On the other hand, again by the Second Incompleteness Theorem, ZF can't prove its own consistency. If we want to prove Con(ZF), the simplest way to do it is to posit the existence of infinities bigger than anything that can be defined in ZF. Such infinities are called "large cardinals." (When set theorists say large, they mean large.) Once again, we can prove the consistency of ZF in ZF+LC (where LC is the axiom that large cardinals exist). But if we want to prove that ZF+LC is itself consistent, then we need a still more powerful theory, such as one with even bigger infinities.
A quick question to test your understanding: while we can't prove in PA that Con(PA), can we least prove in PA that Con(PA) implies Con(ZF)?
No, we can't. For then we could also prove in ZF that Con(PA) implies Con(ZF). But since ZF can prove Con(PA), this would mean that ZF can prove Con(ZF), which contradicts the Second Incompleteness Theorem.
I promised to explain why the Incompleteness Theorem doesn't contradict the Completeness Theorem. The easiest way to do this is probably through an example. Consider the "self-hating theory" PA+Not(Con(PA)), or Peano Arithmetic plus the assertion of its own inconsistency. We know that if PA is consistent, then this strange theory must be consistent as well -- since otherwise PA would prove its own consistency, which the Incompleteness Theorem doesn't allow. It follows, by the Completeness Theorem, that PA+Not(Con(PA)) must have a model. But what could such a model possibly look like? In particular, what you happen if, within that model, you just asked to see the proof that PA was inconsistent?
I'll tell you what would happen: the axioms would tell you that proof of PA's inconsistency is encoded by a positive integer X. And then you would say, "but what is X?" And the axioms would say, "X." And you would say, "But what is X, as an ordinary positive integer?"
"No, no, no! Talk to the axioms."
"Alright, is X greater or less than 10500,000?"
"Greater." (The axioms aren't stupid: they know that if they said "smaller", then you could simply try every smaller number and verify that none of them encode a proof of PA's inconsistency.)
"Alright then, what's X+1?"
"Y."
And so on. The axioms will keep cooking up fictitious numbers to satisfy your requests, and assuming that PA itself is consistent, you'll never be able to trap them in an inconsistency. The point of the Completeness Theorem is that the whole infinite set of fictitious numbers the axioms cook up will constitute a model for PA -- just not the usual model (i.e., the ordinary positive integers)! If we insist on talking about the usual model, then we switch from the domain of the Completeness Theorem to the domain of the Incompleteness Theorem.
Do you remember the puzzle from Thursday? The puzzle was whether there's any theorem that can only be proved by assuming as an axiom that it can be proved. In other words, does "just believing in yourself" make any formal difference in mathematics? We're now in a position to answer that question.
Let's suppose, for concreteness, that the theorem we want to prove is the Riemann Hypothesis (RH), and the formal system we want to prove it in is Zermelo-Fraenkel set theory (ZF). Suppose we can prove in ZF that, if ZF proves RH, then RH is true. Then taking the contrapositive, we can also prove in ZF that if RH is false, then ZF does not prove RH. In other words, we can prove in ZF+not(RH) that not(RH) is perfectly consistent with ZF. But this means that the theory ZF+not(RH) proves its own consistency -- and this, by Gödel, means that ZF+not(RH) is inconsistent. But saying that ZF+not(RH) is inconsistent is equivalent to saying that RH is a theorem of ZF. Therefore we've proved RH. In general we find that, if a statement can be proved by assuming as an axiom that it's provable, then it can also be proved without assuming that axiom. This result is known as Löb's Theorem (again with the umlauts), though personally I think that a better name would be the "You-Had-The-Mojo-All-Along Theorem."
Oh, you remember on Thursday we talked about the Axiom of Choice and the Continuum Hypothesis? Natural statements about the continuum that, since the continuum is such a well-defined mathematical entity, must certainly be either true or false? So, how did those things ever get decided? Well, Gödel proved in 1939 that assuming the Axiom of Choice (AC) or the Continuum Hypothesis (CH) can never lead to an inconsistency. In other words, if the theories ZF+AC or ZF+CH were inconsistent, that could only be because ZF itself was inconsistent.
This raised an obvious question: can we also consistently assume that AC and CH are false? Gödel worked on this problem but wasn't able to answer it. Finally Paul Cohen gave an affirmative answer in 1963, by inventing a new technique called "forcing." (For that, he won the only Fields Medal that's ever been given for set theory and the foundations of math.)
So, we now know that the usual axioms of mathematics don't decide the Axiom of Choice and the Continuum Hypothesis one way or another. You're free to believe both, neither, or one and not the other without fear of contradiction. And sure enough, opinion among mathematicians about AC and CH remains divided to this day, with many interesting arguments for and against (which we unfortunately don't have time to explore the details of).
Let me end with a possibly-surprising observation: the independence of AC and CH from ZF set theory is itself a theorem of Peano Arithmetic. For, ultimately, Gödel and Cohen's consistency theorems boil down to combinatorial assertions about manipulations of first-order sentences -- which can in principle be proven directly, without ever thinking about the transfinite sets that those sentences purport to describe. This provides a nice illustration of what, to me, is the central philosophical question underlying this whole business: do we ever really talk about the continuum, or do we only ever talk about finite sequences of symbols that talk about the continuum?
Bonus Addendum
What does any of this have to do with quantum mechanics? I will now attempt the heroic task of making a connection. What I've tried to impress on you is that there are profound difficulties if we want to assume the world is continuous. Take this pen, for example: how many different positions can I put it in on the surface of the table? ? More than ? Less than ? We don't want the answers to "physics" questions to depend on the axioms of set theory!
Ah, but you say my question is physically meaningless, since the pen's position could never actually be measured to infinite precision? Sure -- but the point is, you need a physical theory to tell you that!
Of course, quantum mechanics gets its very name from the fact that a lot of the observables in the theory, like energy levels, are discrete -- "quantized." This seems paradoxical, since one of the criticisms that computer scientists level against quantum computing is that, as they see it, it's a continuous model of computation!
My own view is that quantum mechanics, like classical probability theory, should be seen as somehow "intermediate" between a continuous and discrete theory. (Here I'm assuming that the Hilbert space or probability space are finite-dimensional.) What I mean is that, while there are continuous parameters (the probabilities or amplitudes respectively), those parameters are not directly observable, and that has the effect of "shielding" us from the bizarro universe of the Axiom of Choice and the Continuum Hypothesis. We don't need a detailed physical theory to tell us that whether amplitudes are rational or irrational, whether there are more or less than possible amplitudes, etc., are physically meaningless questions. This follows directly from the fact that, if we wanted to learn an amplitude exactly, then (even assuming no error!) we would need to measure the appropriate state infinitely many times.
Homework Assignment
Let BB(n), or the "nth Busy Beaver number," be the maximum number of steps that an n-state Turing machine can make on an initially blank tape before halting. (Here the maximum is over all n-state Turing machines that eventually halt.)
1. Prove that BB(n) grows faster than any computable function.
2. Let S = 1/BB(1) + 1/BB(2) + 1/BB(3) + ...
Is S a computable real number? In other words, is there an algorithm that, given as input a positive integer k, outputs a rational number S' such that |S-S'|<1/k?
| 1.5 | | Minds and MachinesToday we're going to launch into something I know you've all been waiting for: a philosophical foodfight about minds, machines, and intelligence!
First, though, let's finish talking about computability. One concept we'll need again and again in this class is that of an oracle. The idea is a pretty obvious one: we assume we have a "black box" or "oracle" that immediately solves some hard computational problem, and then see what the consequences are! (When I was a freshman, I once started talking to my professor about the consequences of a hypothetical "NP-completeness fairy": a being that would instantly tell you whether a given Boolean formula was satisfiable or not. The professor had to correct me: they're not called "fairies"; they're called "oracles." Much more professional!)
Oracles were apparently first studied by Turing, in his 1938 PhD thesis. Obviously, anyone who could write a whole thesis about these fictitious entities would have to be an extremely pure theorist, someone who wouldn't be caught dead doing anything relevant. This was certainly true in Turing's case -- indeed he spent the years after his PhD, from 1939 to 1943, studying certain abstruse symmetry transformations on a 26-letter alphabet.
Anyway, we say that problem A is Turing-reducible to problem B, if A is solvable by a Turing machine given an oracle for B. In other words, "A is no harder than B": if we had a hypothetical device to solve B, then we could also solve A. Two problems are Turing-equivalent if each is Turing-reducible to the other. So for example, the problem of whether a statement can be proved from the axioms of set theory is Turing-equivalent to the halting problem: if you can solve one, you can solve the other.
Now, a Turing-degree is the set of all problems that are Turing-equivalent to a given problem. What are some examples of Turing-degrees? Well, we've already seen two examples: (1) the set of computable problems, and (2) the set of problems that are Turing-equivalent to the halting problem. Saying that these Turing-degrees aren't equal is just another way of saying that the halting problem isn't solvable.
Are there any Turing-degrees above these two? In other words, is there any problem even harder than the halting problem? Well, consider the following "super halting problem": given a Turing machine with an oracle for the halting problem, decide if it halts! Can we prove that this super halting problem is unsolvable, even given an oracle for the ordinary halting problem? Yes, we can! We simply take Turing's original proof that the halting problem is unsolvable, and "shift everything up a level" by giving all the machines an oracle for the halting problem. Everything in the proof goes through as before, a fact we express by saying that the proof "relativizes."
Here's a subtler question: is there any problem of intermediate difficulty between the computable problems and the halting problem? This question was first asked by Emil Post in 1944, and was finally answered in 1956, by Richard Friedberg in the US and (independently) A. A. Muchnik in the USSR. The answer is yes. Indeed, Friedberg and Muchnik actually proved a stronger result: that there are two problems A and B, both of which are solvable given an oracle for the halting problem, but neither of which is solvable given an oracle for the other. These problems are constructed via an infinite process whose purpose is to kill off every Turing machine that might reduce A to B or B to A. Unfortunately, the resulting problems are extremely contrived; they don't look like anything that might arise in practice. And even today, we don't have a single example of a "natural" problem with intermediate Turing degree.
Since Friedberg and Muchnik's breakthrough, the structure of the Turing degrees has been studied in more detail than you can possibly imagine. Here's one of the simplest questions: if two problems A and B are both reducible to the halting problem, then must there be a problem C that's reducible to A and B, such that any problem that's reducible to both A and B is also reducible to C? Hey, whatever floats your boat! But this is the point where some of us say, maybe we should move on to the next topic... (Incidentally, the answer to the question is no.)
Alright, the main philosophical idea underlying computability is what's called the Church-Turing Thesis. It's named after Turing and his adviser Alonzo Church, even though what they themselves believed about "their" thesis is open to dispute! Basically, the thesis is that any function "naturally to be regarded as computable" is computable by a Turing machine. Or in other words, any "reasonable" model of computation will give you either the same set of computable functions as the Turing machine model, or else a proper subset.
Already there's an obvious question: what sort of claim is this? Is it an empirical claim, about which functions can be computed in physical reality? Is it a definitional claim, about the meaning of the word "computable"? Is it a little of both?
Well, whatever it is, the Church-Turing Thesis can only be regarded as extremely successful, as theses go. As you know -- and as we'll discuss later -- quantum computing presents a serious challenge to the so-called "Extended" Church-Turing Thesis: that any function naturally to be regarded as efficiently computable is efficiently computable by a Turing machine. But in my view, so far there hasn't been any serious challenge to the original Church-Turing Thesis -- neither as a claim about physical reality, nor as a definition of ‘computable.'
There have been plenty of non-serious challenges to the Church-Turing Thesis. In fact there are whole conferences and journals devoted to these challenges -- google "hypercomputation." I've read some of this stuff, and it's mostly along the lines of, well, suppose you could do the first step of a computation in one second, the next step in a half second, the next step in a quarter second, the next step in an eighth second, and so on. Then in two seconds you'll have done an infinite amount of computation! Well, as stated it sounds a bit silly, so maybe sex it up by throwing in a black hole or something. How could the hidebound Turing reactionaries possibly object? (It reminds me of the joke about the supercomputer that was so fast, it could do an infinite loop in 2.5 seconds.)
We should immediately be skeptical that, if Nature was going to give us these vast computational powers, she would do so in a way that's so mundane, so uninteresting. Without making us sweat or anything. But admittedly, to really see why the hypercomputing proposals fail, you need the entropy bounds of Bekenstein, Bousso, and others -- which are among the few things the physicists think they know about quantum gravity, and which hopefully we'll say something about later in the course. So the Church-Turing Thesis -- even its original, non-extended version -- really is connected to some of the deepest questions in physics. But in my opinion, neither quantum computing, nor analog computing, nor anything else has mounted a serious challenge to that thesis in the seventy years since it was formulated.
If we interpret the Church-Turing Thesis as a claim about physical reality, then it should encompass everything in that reality, including the goopy neural nets between your respective ears. This leads us, of course, straight into the cratered intellectual battlefield that I promised to lead you into.
As a historical remark, it's interesting that the possibility of thinking machines isn't something that occurred to people gradually, after they'd already been using computers for decades. Instead it occurred to them immediately, the minute they started talking about computers themselves. People like Leibniz and Babbage and Lovelace and Turing and von Neumann understood from the beginning that a computer wouldn't just be another steam engine or toaster -- that, because of the property of universality (whether or not they called it that), it's difficult even to talk about computers without also talking about ourselves.
So, I asked you to read Turing's second famous paper, Computing Machinery and Intelligence. Reactions?
What's the main idea of this paper? As I read it, it's a plea against meat chauvinism. Sure, Turing makes some scientific arguments, some mathematical arguments, some epistemological arguments. But beneath everything else is a moral argument. Namely: if a computer interacted with us in a way that was indistinguishable from a human, then of course we could say the computer wasn't "really" thinking, that it was just a simulation. But on the same grounds, we could also say that other people aren't really thinking, that they merely act as if they're thinking. So what is it that entitles us to go through such intellectual acrobatics in the one case but not the other?
If you'll allow me to editorialize (as if I ever do otherwise...), this moral question, this question of double standards, is really where Searle, Penrose, and every other "strong AI skeptic" comes up empty for me. One can indeed give weighty and compelling arguments against the possibility of thinking machines. The only problem with these arguments is that they're also arguments against the possibility of thinking brains!
So for example: one popular argument is that, if a computer appears to be intelligent, that's merely a reflection of the intelligence of the humans who programmed it. But what if humans' intelligence is just a reflection of the billion-year evolutionary process that gave rise to it? What frustrates me every time I read the AI skeptics is their failure to consider these parallels honestly. The "qualia" and "aboutness" of other people is simply taken for granted. It's only the qualia of machines that's ever in question.
But perhaps a skeptic could retort: I believe other people think because I know I think, and other people look sort of similar to me -- they've also got five fingers, hair in their armpits, etc. But a robot looks different -- it's made of metal, it's got an antenna, it lumbers across the room, etc. So even if the robot acts like it's thinking, who knows? But if I accept this argument, why not go further? Why can't I say, I accept that white people think, but those blacks and Asians, who knows about them? They look too dissimilar from me.
In my view, one can divide everything that's been said about artificial intelligence into two categories: the 70% that's somewhere in Turing's paper from 1950, and the 30% that's emerged from a half-century of research since then.
So, after 56 years, there are some things we can say that would've surprised Alan Turing. What are those things? Well, one of them is how little progress has been made, compared to what he expected! Do you remember, Turing made a falsifiable prediction?
I believe that in about fifty years' time it will be possible, to programme computers, with a storage capacity of about 109, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.
How well has his prediction fared? First let's note that the prediction about computers themselves was damn good. Turing predicted that in 50 years' time (i.e., 2000), we'd be programming computers with a storage capacity of about 109 (i.e., one gig).
But what about programming the computers to pass the imitation game? How well has Turing's prediction fared there?
Well, some of you might have heard of a program called ELIZA, written by Joseph Weizenbaum in 1966. This program simulates a psychotherapist who keeps spitting back whatever you said. The amazing thing Weizenbaum found is that many people will spill their hearts out to this program! And sometimes, if you then tell them they were talking to a program (and an extremely simple one at that), they won't believe you.
A few years ago, someone had the brilliant idea to take the original ELIZA program and let it loose in AOL chat rooms to see what happened. If you go to fury.com/aoliza, you can see some hilarious (fortunately anonymized) examples of people trying to seduce the program, telling it about their infidelities, etc. Here's one of my favorite exchanges, from a guy who (before moving on to amorous solicitations) had told the program that he planned to finish his B of A and then "move into corporate business alliance with Starbucks":
Guy: I think that u are getting off on playing with my mind
ELIZA: your mind?
Guy: frustrating me by not joining in my fantasy
ELIZA: your fantasy?
Guy: about having mind blowing sex with u
ELIZA: does that have anything to do with the fact that your b of a and then you will move into corporate business alliance with starbucks?
So this is about the state of the art in terms of man-machine repartee. It seems one actually needs to revise the Turing Test, to say that, if we want to verify intelligence in a computer, then we need some minimal level of intelligence in the human interrogator.
Question from the floor: But isn't the problem that these guys assumed they were talking to a human? To satisfy the conditions of the Turing Test, wouldn't we need to tell them they were trying to distinguish a human from a computer?
Response: Excellent question! Yes, for exactly that reason, this isn't a real Turing Test; it's merely amusing. However, for the past 15 years a guy named Hugh Loebner has been running a contest that's closer to what Turing had in mind. Here the testers are told that they're trying to distinguish humans from computers -- yet many of the transcripts have been just as depressing, both from the standpoint of machine intelligence and from that of human intelligence. (E.g., a woman who tried to converse intelligently about Shakespeare got classified as a computer, since "no human would know that much about Shakespeare...")
Another question from the floor: What if we had a computer doing the interrogation instead of a human?
Response: Another excellent question! As it turns out, that's not at all a hypothetical situation. A few days ago, a guy named Luis von Ahn won a MacArthur award for (among other things) his work on CAPTCHA's, which are those tests that websites use to distinguish legitimate users from spambots. I'm sure you've encountered them -- you know, the things where you see those weird curvy letters that you have to retype. The key property of these tests is that a computer should be able to generate and grade them, but not pass them! (A lot like professors making up a midterm...) Only humans should be able to pass the tests. So basically, these tests capitalize on the failures of AI. (Well, they also capitalize on the computational hardness of inverting one-way functions, which we'll get to later.)
One interesting aspect of CAPTCHA's is that they've already led to an arms race between the CAPTCHA programmers and the AI programmers. When I was at Berkeley, some of my fellow grad students wrote a program that broke a CAPTCHA called Gimpy maybe 30% of the time. So then the CAPTCHA's have to be made harder, and then the AI people get back to work, and so on. Who will win?
You see: every time you set up a Yahoo Mail account, you're directly confronting age-old mysteries about what it means to be human...
Despite what I said about the Turing Test, there have been some dramatic successes of AI. We all know about Kasparov and Deep Blue. Maybe less well-known is that, in 1996, a program called Otter was used to solve a 60-year-old open problem in algebra called the Robbins Conjecture, which Tarski and other famous mathematicians had worked on. (Apparently, for decades Tarski would give the problem to his best students. Then, eventually, he started giving it to his worst students...) The problem is easy to state: given the three axioms
* A or (B or C) = (A or B) or C
* A or B = B or A
* Not(Not(A or B) or Not(A or Not(B))) = A,
can one derive as a consequence that Not(Not(A)) = A?
Let me stress that this was not a case like Appel and Haken's proof of the Four-Color Theorem, where the computer's role was basically to check thousands of cases. In this case, the proof was 17 lines long. A human could check the proof by hand, and say, yeah, I could've come up with that. (In principle!)
What else? Arguably there's a pretty sophisticated AI system that almost all of you used this morning and will use many more times today. What is it? Right, Google.
You can look at any of these examples -- Deep Blue, the Robbins conjecture, Google -- and say, that's not really AI. That's just massive search, helped along by clever programming. Now, this kind of talk drives AI researchers up a wall. They say: if you told someone in the sixties that in 30 years we'd be able to beat the world grandmaster at chess, and asked if that would count as AI, they'd say, of course it's AI! But now that we know how to do it, now it's no longer AI. Now it's just search. (Philosophers have a similar complaint: as soon as a branch of philosophy leads to anything concrete, it's no longer called philosophy! It's called math or science.)
There's another thing we appreciate now that people in Turing's time didn't really appreciate. This is that, in trying to write programs to simulate human intelligence, we're competing against a billion years of evolution. And that's damn hard. One counterintuitive consequence is that it's much easier to program a computer to beat Gary Kasparov at chess, than to program a computer to recognize faces under varied lighting conditions. Often the hardest tasks for AI are the ones that are trivial for a 5-year-old -- since those are the ones that are so hardwired by evolution that we don't even think about them.
In the last fifty years, have there been any new insights about the Turing Test itself? In my opinion, no. There has, on the other hand, been a non-insight, which is called Searle's Chinese Room. This is supposed to be an argument that even a computer that did pass the Turing Test wouldn't be intelligent. The way it goes is, let's say you don't speak Chinese. (Debbie isn't here today, so I think that's a safe assumption.) You sit in a room, and someone passes you paper slips through a hole in the wall with questions written in Chinese, and you're able to answer the questions (again in Chinese) just by consulting a rule book. In this case, you might be carrying out an intelligent Chinese conversation, yet by assumption, you don't understand a word of Chinese! Therefore symbol-manipulation can't produce understanding.
So, class, how might a strong AI proponent respond to this argument? Duh: you might not understand Chinese, but the rule book does! Or if you like, understanding Chinese is an emergent property of the system consisting of you and the rule book, in the same sense that understanding English is an emergent property of the neurons in your brain. Like many other thought experiments, the Chinese Room gets its mileage from a deceptive choice of imagery -- and more to the point, from ignoring computational complexity. We're invited to imagine someone pushing around slips of paper with zero understanding or insight -- much like the doofus freshmen who write (a+b)2=a2+b2 on their math tests. But how many slips of paper are we talking about? How big would the rule book have to be, and how quickly would you have to consult it, to carry out an intelligent Chinese conversation in anything resembling real time? If each page of the rule book corresponded to one neuron of (say) Debbie's brain, then probably we'd be talking about a "rule book" at least the size of the Earth, its pages searchable by a swarm of robots traveling at close to the speed of light. When you put it that way, maybe it's not so hard to imagine that this enormous Chinese-speaking entity -- this dian nao -- that we've brought into being might have something we'd be prepared to call understanding or insight.
Of course, everyone who talks about this stuff is really tiptoeing around the hoary question of consciousness. See, consciousness has this weird dual property, that on the one hand, it's arguably the most mysterious thing we know about, and the other hand, not only are we directly aware of it, but in some sense it's the only thing we're directly aware of. You know, cogito ergo sum and all that. So to give an example, I might be mistaken about Richard's shirt being blue -- I might be hallucinating or whatever -- but I really can't be mistaken about my perceiving it as blue. (Or if I could, then we get an infinite regress.)
Question from the floor: What about optical illusions? You might know that a dot isn't moving, yet still perceive it as moving...
Response: There's no contradiction between the dot not moving (and my knowing that it isn't moving), and my being certain that I perceive it as moving. What I perceive and what's out there are two different things (even if I know they're different).
Now, is there anything else that also produces the feeling of absolute certainty? Right -- math! Incidentally, I think this similarity between math and subjective experience might go a long away toward explaining mathematicians' "quasi-mystical" tendencies. (I can already hear Greg Kuperberg wincing. Wince, Greg, wince!) This is a good thing for physicists to understand: when you're talking to a mathematician, you might not be talking to someone who fears the real world and who's therefore retreated into intellectual masturbation. You might be talking to someone for whom the real world was never especially real to begin with! I mean, to come back to something we mentioned earlier: why did many mathematicians look askance at the computer proof of the Four-Color Theorem? Sure, the computer might have made a mistake, but humans make plenty of mistakes too!
What it boils down to, I think, is that there is a sense in which the Four-Color Theorem has been proved, and there's another sense in which many mathematicians understand proof, and those two senses aren't the same. For many mathematicians, a statement isn't proved when a physical process (which might be a classical computation, a quantum computation, an interactive protocol, or something else) terminates saying that it's been proved -- however good the reasons might be to believe that physical process is reliable. Rather, the statement is proved when they (the mathematicians) feel that their minds can directly perceive its truth.
Of course, it's hard to discuss these things directly. But what I'm trying to point out is that many people's "anti-robot animus" is probably a combination of two ingredients:
1. the directly-experienced certainty that they're conscious -- that they perceive colors, sounds, positive integers, etc., regardless of whether anyone else does, and
2. the belief that if they were just a computation, then they could not be conscious in this way.
For example, I think Penrose's objections to strong AI derive from these two ingredients. I think his arguments about Gödel's Theorem are window dressing added later.
For people who think this way (as even I do, at least in certain moods), granting consciousness to a robot seems strangely equivalent to denying that one is conscious oneself. Is there any respectable way out this dilemma -- or in other words, any way out that doesn't rely on a meatist double standard, with one rule for ourselves and a different rule for robots?
My own favorite way out is one that's been advocated by the philosopher David Chalmers. Basically, what Chalmers proposes is a "philosophical NP-completeness reduction": a reduction of one mystery to another. He says that if computers someday pass the Turing Test, then we'll be compelled to regard them as conscious. And as for how they could be conscious, we'll understand that just as well and as poorly as we understand how a bundle of neurons could be conscious. Yes, it's mysterious, but the one mystery doesn't seem so different from the other.
Today's Puzzles
* [The well-defined puzzle] Can we assume without loss of generality that a computer program has access to its own source code?
* [The vague, ill-defined puzzle] If that which before the 1800's was called water turned out to be CH4 instead of H2O, would it still be water, or would it be something else?
Answers to Homework
Recall that BB(n), or the "nth Busy Beaver number," is the largest number of steps that an n-state Turing machine can make on an initially blank tape before halting.
The first problem was to prove that BB(n) grows faster than any computable function. Did people get this one? Excellent!
Yeah, suppose there were a computable function f(n) such that f(n)>BB(n) for every n. Then given an n-state Turing machine M, we could first compute f(n), then simulate M for up to f(n) steps. If M hasn't halted by then, then we know it never will halt, since f(n) is greater than the maximum number of steps any n-state machine could make. But this gives us a way to solve the halting problem, which we already know is impossible. Therefore the function f doesn't exist.
So the BB(n) function grows really, really, really fast. (In case you're curious, here are the first few values, insofar as they've been computed by people with too much free time: BB(1)=1, BB(2)=6, BB(3)=21, BB(4)=107, BB(5)>=47,176,870.)
The second problem was whether S = 1/BB(1) + 1/BB(2) + 1/BB(3) + ... is a computable real number. In other words, is there an algorithm that given an integer k, outputs a rational number S' such that |S-S'| < 1/k?
People had more trouble with this one? Alright, let's see the answer. The answer is no -- it isn't computable. For suppose it were computable; then we'll give an algorithm to compute BB(n) itself, which we know is impossible.
Assume by induction that we've already computed BB(1), BB(2), ..., BB(n-1). Then consider the sum of the "higher-order terms":
Sn = 1/BB(n) + 1/BB(n+1) + ...
If S is computable, then Sn must be computable as well. But this means we can approximate Sn within 1/2, 1/4, 1/8, and so on, until the interval that we've bounded Sn in no longer contains 0. When that happens, we get an upper bound on 1/Sn. But since 1/BB(n+1), 1/BB(n+2), and so on are so much smaller than 1/BB(n), any upper bound on 1/Sn immediately yields an upper bound on BB(n) as well. But once we have an upper bound on BB(n), we can then compute BB(n) itself, by simply simulating all n-state Turing machines. So assuming we could compute S, we could also compute BB(n), which we already know is impossible. Therefore S is not computable.
| 1.5 | | PaleocomplexityBy any objective standard, the theory of computational complexity ranks as one of the greatest intellectual achievements of humankind -- along with fire, the wheel, and computability theory. That it isn't taught in high schools is really just an accident of history. In any case, we'll certainly need complexity theory for everything else we're going to do in this course, which is why the next five or six lectures will be devoted to it. So before we dive in, let's step back and pontificate about where we're going.
What I've been trying to do is show you the conceptual underpinnings of the universe, before quantum mechanics comes on the scene. The amazing thing about quantum mechanics is that, despite being a grubby empirical discovery, it changes some of the underpinnings! Others it doesn't change, and others it's not so clear whether it changes them or not. But if we want to debate how things are changed by quantum mechanics, then we'd better understand what they looked like before quantum mechanics.
It's useful to divide complexity theory into historical epochs:
* 1950's: Late Turingzoic
* 1960's: Dawn of the Asymptotic Age
* 1971: The Cook-Levin Asteroid; extinction of the Diagonalosaurs
* Early 1970's: The Karpian Explosion
* 1978: Early Cryptozoic
* 1980's: Randomaceous Era
* 1993: Eruption of Mt. Razborudich; extinction of the Combinataurs
* 1994: Invasion of the Quantodactyls
* Mid-1990's to present: Derandomaceous Era
This lecture will be about "paleocomplexity": complexity in the age before P, NP, and NP-completeness, when Diagonalosaurs ruled the earth. Then Lecture 6 will cover the Karpian Explosion, Lecture 7 the Randomaceous Era, Lecture 8 the Early Cryptozoic, and Lecture 9 the Invasion of the Quantodactyls.
We talked on Thursday about computability theory. We saw how certain problems are uncomputable -- like, given a statement about positive integers, is it true or false? (If we could solve that, then we could solve the halting problem, which we already know is impossible.)
But now let's suppose we're given a statement about real numbers -- for example,
For all real x,y, (x+y)2=x2+2xy+y2
-- and we want to know if it's true or false. In this case, it turns out that there is a decision procedure -- this was proved by Tarski in the 1930's, at least when the statement only involves addition, multiplication, comparisons, the constants 0 and 1, and universal and existential quantifiers (no exponentials or trig functions).
Intuitively, if all our variables range over real numbers instead of integers, then everything is forced to be smooth and continuous, and there's no way to build up Gödel sentences like "this sentence can't be proved."
(If we throw in the exponential function, then apparently there's still no way to encode Gödel sentences, modulo an unsolved problem in analysis. But if we throw in the exponential function and switch from real numbers to complex numbers, then we're again able to encode Gödel sentences -- and the theory goes back to being undecidable! Can you guess why? Well, once we have complex numbers, we can force a number n to be an integer, by saying that we want e2πin to equal 1. So we're then back to where we were with integers.)
Anyway, the attitude back then was, OK, we found an algorithm to decide the truth or falsehood of any sentence about real numbers! We can go home! Problem solved!
Trouble is, if you worked out how many steps that algorithm took to decide the truth of a sentence with n symbols, it grew like an enormous stack of exponentials: 2^2^...2^n. So I was reading in a biography of Tarski that, when actual computers came on the scene in the 1950's, one of the first things anyone thought to do was to implement Tarski's algorithm for deciding statements about the real numbers. And it was hopeless -- indeed, it would've been hopeless even on the computers of today! On the computers of the 1950's, it was hopelesshopeless^...^hopeless.
So, these days we talk about complexity. (Or at least most of us do.) The idea is, you impose an upper bound on how much of some resource your computer can use. The most obvious resources are (1) amount of time and (2) amount of memory, but many others can be defined. (Indeed, if you visit the Complexity Zoo, you'll find several hundred of them.)
One of the very first insights is, if you ask how much can be computed in 10 million steps, or 20 billion bits of memory, you won't get anywhere. Your theory of computing will be at the mercy of arbitrary choices about the underlying model. In other words, you won't be doing theoretical computer science at all: you'll be doing architecture, which is an endlessly-fascinating, non-dreary, non-boring topic in its own right, but not our topic.
So instead you have to ask a looser question: how much can be computed in an amount of time that grows linearly (or quadratically, or logarithmically) with the problem size? Asking this sort of question lets you ignore constant factors.
So, we define TIME(f(n)) to be the class of problems for which every instance of size n is solvable (by some "reference" computer) in an amount of time that grows like a constant times f(n). Likewise, SPACE(f(n)) is the class of problems solvable using an amount of space (i.e., bits of memory) that grows like a constant times f(n).
What can we say? Well, for every function f(n), TIME(f(n)) is contained in SPACE(f(n)). Why? Because a Turing machine can access at most one memory location per time step.
What else? Presumably you agree that TIME(n2) is contained in TIME(n3). Here's a question: is it strictly contained? In other words, can you solve more problems in n3 time than in n2 time? (Here the choice of the exponents 3 and 2 is obviously essential. Asking whether you can solve more problems in n4 time than n3 time would just be ridiculous!)
Seriously, it turns out that you can solve more problems in n3 time than in n2 time. This is a consequence of a fundamental result called the Time Hierarchy Theorem, which was proven by Hartmanis and Stearns in the mid-1960's and later rewarded with a Turing Award. (Not to diminish their contribution, but back then Turing Awards were hanging pretty low on the tree! Of course you had to know to be looking for them, which not many people did.)
Let's see how the proof goes. We need to find a problem that's solvable in n3 time but not n2 time. What will this problem be? It'll be the simplest thing you could imagine: a time-bounded analogue of Turing's halting problem.
Given a Turing machine M, does M halt in at most n2.5 steps? (Here n2.5 is just some function between n2 and n3.)
Clearly we can solve the above problem in n3 steps, by simulating M for n2.5 steps and seeing whether it halts or not. (Indeed, we can solve the problem in something like n2.5 log n steps. We always need some overhead when running a simulation, but the overhead can be made extremely small.)
But now suppose there were a program P to solve the problem in n2 steps. We'll derive a contradiction. By using P as a subroutine, clearly we could produce a new program P' with the following behavior. Given a program M as input, P'
1. runs forever if M halts in at most n2.5 steps given its own code as input, or
2. halts in n2.5 steps if M runs for more than n2.5 steps given its own code as input.
Furthermore, P' does all of this in at most n2.5 steps (indeed, n2 steps plus some overhead).
Now what do we do? Duh, we feed P' its own code as input! This gives us a contradiction, which implies that P can never have existed in the first place.
Obviously I was joking when I said the choice of n3 versus n2 was essential. We can substitute n17 versus n16, 3n versus 2n, etc. But there's actually an interesting question here: can we substitute any functions f and g such that f grows significantly faster than g? The surprising answer is no! The function g needs a property called time-constructibility, which means (basically) that there's some program that halts in g(n) steps given n as input. Without this property, the program P' wouldn't know how many steps to simulate M for, and the argument wouldn't go through.
Now, every function you'll ever encounter in civilian life will be time-constructible. But in the early 1970's, complexity theorists made up some bizarre, rapidly-growing functions that aren't. And for these functions, you really can get arbitrarily large gaps in the complexity hierarchy! So for example, there's a function f such that TIME(f(n))=TIME(2f(n)). (Duuuuude. To those who doubt that complexity is better than cannabis, I rest my case.)
Anyway, completely analogous to the Time Hierarchy Theorem is the Space Hierarchy Theorem, which says there's a problem solvable with n3 bits of memory that's not solvable with n2 bits of memory.
Alright, next question: in computer science, we're usually interested in the fastest algorithm to solve a given problem. But is it clear that every problem has a fastest algorithm? Or could there be a problem that admits an infinite sequence of algorithms, with each one faster than the last but slower than some other algorithm?
Contrary to what you might think, this is not just a theoretical armchair question: it's a concrete, down-to-earth armchair question! As an example, consider the problem of multiplying two n-by-n matrices. The obvious algorithm takes O(n3) time. In 1968 Strassen gave a more complicated algorithm that takes O(n2.78) time. Improvements followed, culminating in an O(n2.376) algorithm of Coppersmith and Winograd. But is that the end of the line? Might there be an algorithm to multiply matrices in n2 time? Here's a weirder possibility: could it be that for every ε>0, there exists an algorithm to multiply n-by-n matrices in time O(n2+ε), but as ε approaches 0, these algorithms become more and more complicated without end?
See, some of this paleocomplexity stuff is actually nontrivial! (T-Rex might've been a dinosaur, but it still had pretty sharp teeth!) In this case, a 1967 result called the Blum Speedup Theorem says that there really are problems that admit no fastest algorithm. Not only that: there exists a problem P such that for every function f, if P has an O(f(n)) algorithm then it also has an O(log f(n)) algorithm!
Ray Laflamme: I wouldn't know how to begin to prove that...
Neither would I! So let's see how it goes. Let t(n) be a complexity bound. Our goal is to define a function f, from integers to {0,1}, such that if f can be computed in O(t(n)) steps, then it can also be computed in O(t(n-i)) steps for any positive integer i. Taking t to be sufficiently large then gives us as dramatic a speedup as we want: for example, if we set t(n):=2t(n-1), then certainly t(n-1)=O(log t(n)).
Let M1,M2,... be an enumeration of Turing machines. Then let Si = {M1,...,Mi} be the set consisting of the first i machines. Here's what we do: given an integer n as input, we loop over all i from 1 to n. In the ith iteration, we simulate every machine in Si that wasn't "cancelled" in iterations 1 to i-1. Let Mj be the first such machine that halts in at most t(n-i) steps (or if none of them do, then choose Mj arbitrarily). Then we define f(i) to be 1 if Mj outputs 0, or 0 if Mj outputs 1. (In other words, we cause Mj to fail at computing f(i).) We also "cancel" Mj, meaning that Mj doesn't need to be simulated in any later iteration. This defines the function f.
Certainly f(n) can be computed in O(n2 t(n)) steps, by simply simulating the entire iterative procedure above. The key observation is this: for any integer i, if we hardwire the outcomes of iterations 1 to i into our simulation algorithm (i.e. tell the algorithm which Mj's get cancelled in those iterations), then we can skip iterations 1 to i, and proceed immediately to iteration i+1. Furthermore, assuming we start from iteration i+1, we can compute f(n) in only O(n2 t(n-i)) steps, instead of O(n2 t(n)) steps. So the more information we "precompute," the faster the algorithm will run on sufficiently large inputs n.
To turn this idea into a proof, the main thing one needs to show is that simulating the iterative procedure is pretty much the only way to compute f: or more precisely, any algorithm to compute f needs at least t(n-i) steps for some i. This then implies that f has no fastest algorithm.
Ray: I get it...
Others: Huh?
Puzzle 1 From Last Week
Can we assume, without loss of generality, that a computer program has access to its own code? As a simple example, is there a program that prints itself as output?
The answer is yes: there are such programs. In fact, there have even been competitions to write the shortest self-printing program. At the IOCCC (the International Obfuscated C Code Contest), this competition was won some years ago by an extremely short program. Can you guess how long it was: 30 characters? 10? 5?
The winning program had zero characters. (Think about it!) Admittedly, a blank file is not exactly a kosher C program, but apparently some compilers will compile it to a program that does nothing.
Alright, alright, but what if we want a nontrivial self-printing program? In that case, the standard trick is to do something like the following (which you can translate into your favorite programming language):
Print the following twice, the second time in quotes.
"Print the following twice, the second time in quotes."
In general, if you want a program to have access to its own source code, the trick is to divide the program into three parts: (1) a part that actually does something useful (this is optional), (2) a "replicator," and (3) a string to be replicated. The string to be replicated should consist of the complete code of the program, including the replicator. (In other words, it should consist of parts (1) and (2).) Then by running the replicator twice, we get a spanking-new copy of parts (1), (2), and (3).
This idea was elaborated by von Neumann in the early 1950's. Shortly afterward, two guys (I think their names were Crick and Watson) found a physical system that actually obeys these rules. You and I, along with all living things on Earth, are basically walking computer programs with the semantics
Build a baby that acts on the following instructions, and also contains a copy of those instructions in its reproductive organs.
"Build a baby that acts on the following instructions, and also contains a copy of those instructions in its reproductive organs."
Puzzle 2 From Last Week
If water weren't H2O, would it still be water?
Yeah, this isn't really a well-defined question: it all boils down to what we mean by the word water. Is water a "predicate": if x is clear and wet and drinkable and tasteless and freezable to ice, etc. ... then x is water? On this view, what water "is" is determined by sitting in our armchairs and listing necessary and sufficient conditions for something to be water. We then venture out into the world, and anything that meets the conditions is water by definition. This was the view of Frege and Russell, and it implies that anything with the "intuitive properties" of water is water, whether or not it's H2O.
The other view, famously associated with Saul Kripke, is that the word water "rigidly designates" a particular substance (H2O). On this view, we now know that when the Greeks and Babylonians talked about water, they were really talking about H2O, even though they didn't realize it. Interestingly, "water = H2O" is thus a necessary truth that was discovered by empirical observation. Something with the same properties as water but a different chemical structure would not be water.
Kripke argues that, if you accept this "rigid designator" view, then there's an implication for the mind-body problem.
The idea is this: the reductionist dream would be to explain consciousness in terms of neural firings, in the same way that science explained water as being H2O. But Kripke says there's a disanalogy between these two cases. In the case of water, we can at least talk coherently about a hypothetical substance that feels like water, tastes like water, etc., but isn't H2O and therefore isn't water. But suppose we discovered that pain is always associated with the firings of certain nerves called C-fibers. Could we then say that pain is C-fiber firings? Well, if something felt like pain but had a different neurobiological origin, would we say that it felt like pain but wasn't pain? Presumably we wouldn't. Anything that feels like pain is pain, by definition! Because of this difference, Kripke thinks that we can't explain pain as "being" C-fiber firings, in the same sense that we can explain water as "being" H2O.
Some of you look bored. Dude -- this is considered one of the greatest philosophical insights of the last four decades! I'm serious! Well, I guess if you don't find it interesting, philosophy is not the field for you. | 1.5 | | P, NP, and FriendsWe've seen that if we want to make progress in complexity, then we need to talk about asymptotics: not which problems can be solved in 10,000 steps, but for which problems can instances of size n be solved in cn2 steps as n goes to infinity? We met TIME(f(n)), the class of all problems solvable in O(f(n)) steps, and SPACE(f(n)), the class of all problems solvable using O(f(n)) bits of memory.
But if we really want to make progress, then it's useful to take an even coarser-grained view: one where we distinguish between polynomial and exponential time, but not between O(n2) and O(n3) time. From this remove, we think of any polynomial bound as "fast," and any exponential bound as "slow."
Now, I realize people will immediately object: what if a problem is solvable in polynomial time, but the polynomial is n50000? Or what if a problem takes exponential time, but the exponential is 1.00000001n? My answer is pragmatic: if cases like that regularly arose in practice, then it would've turned out that we were using the wrong abstraction. But so far, it seems like we're using the right abstraction. Of the big problems solvable in polynomial time -- matching, linear programming, primality testing, etc. -- most of them really do have practical algorithms. And of the big problems that we think take exponential time -- theorem-proving, circuit minimization, etc. -- most of them really don't have practical algorithms. So, that's the empirical skeleton holding up our fat and muscle.
Petting Zoo
It's now time to meet the most basic complexity classes -- the sheep and goats of the Complexity Zoo.
* P is the class of problems solvable by a Turing machine in polynomial time. In other words, P is the union, over all positive integers k, of TIME(nk). (Note that by "problem," we'll always mean decision problem: a problem where the inputs are n-bit strings and the outputs are either yes or no.)
* PSPACE is the class of problems solvable in polynomial space (but unlimited time). In other words, it's the union over all integers k of SPACE(nk).
* EXP is the class of problems solvable in exponential time. In other words, it's the union over all integers k of TIME(2n^k).
Certainly P is contained in PSPACE. I claim that PSPACE is contained in EXP. Why?
Right: a machine with nk bits of memory can only go through 2n^k different configurations, before it either halts or else gets stuck in an infinite loop.
Now, NP is the class of problems for which, if the answer is yes, then there's a polynomial-size proof of that fact that you can check in polynomial time. (The NP stands for "Nondeterministic Polynomial," in case you were wondering.) I could get more technical, but it's easiest to give an example: say I give you a 10,000-digit number, and I ask whether it has a divisor ending in 3. Well, answering that question might take a Long, Long TimeTM. But if your grad student finds such a divisor for you, then you can easily check that it works: you don't need to trust your student (always a plus).
I claim that NP is contained in PSPACE. Why?
Right: in polynomial space, you can loop over all possible nk-bit proofs and check them one by one. If the answer is "yes," then one of the proofs will work, while if the answer is "no," then none of them will work.
Certainly P is contained in NP: if you can answer a question yourself, then someone else can convince you that the answer is yes (if it is yes) without even telling you anything.
One interesting little puzzle, which I'll leave as a homework exercise for Tuesday, is whether P equals NP. In other words, if you can recognize an answer efficiently, can you also find one efficiently?
Oh, alright, you don't have to solve this one by Tuesday. You can have till Thursday.
Look, this question of whether P=NP, what can I say? People like to describe it as "probably the central unsolved problem of theoretical computer science." That's a comical understatement. P vs. NP is one of the deepest questions that human beings have ever asked.
And not only that: it's one of the seven million-dollar prize problems of the Clay Math Institute! What an honor! Imagine: our mathematician friends have decided that P vs. NP is as important as the Hodge Conjecture, or even Navier-Stokes existence and smoothness! (Apparently they weren't going to include it, until they asked around to make sure it was important enough.)
Dude. One way to measure P vs. NP's importance is this. If NP problems were feasible, then mathematical creativity could be automated. The ability to check a proof would entail the ability to find one. Every Apple II, every Commodore, would have the reasoning power of Archimedes or Gauss. So by just programming your computer and letting it run, presumably you could immediately solve not only P vs. NP, but also the other six Clay problems. (Or five, now that Poincaré is down.)
But if that's the case, then why isn't it obvious that P doesn't equal NP? Surely God wouldn't be so benign as to grant us these extravagant powers! Surely our physicist-intuition tells us that brute-force search is unavoidable! (Leonid Levin told me that Feynman -- the king (or maybe court jester) of physicist-intuition -- had trouble even being convinced that P vs. NP was an open problem!)
Well, we certainly believe P≠NP. Indeed, we don't even believe there's a general way to solve NP problems that's dramatically better than brute-force search through every possibility. But if you want to understand why it's so hard to prove these things, let me tell you something.
Let's say you're given an N-digit number, but instead of factoring it, you just want to know if it's prime or composite.
Or let's say you're given a list of freshmen, together with which ones are willing to room with which other ones, and you want to pair off as many willing roommates as you can.
Or let's say you're given two DNA sequences, and you want to know how many insertions and deletions are needed to convert the one sequence to the other.
Surely these are fine examples of the sort of exponentially-hard NP problem we were talking about! Surely they, too, require brute-force search!
Except they don't. As it turns out, all of these problems have clever polynomial-time algorithms. The central challenge any P≠NP proof will have to overcome is to separate the NP problems that really are hard from the ones that merely look hard. I'm not just making a philosophical point. While there have been dozens of purported P≠NP proofs over the years, almost all of them could be rejected immediately for the simple reason that, if they worked, then they would rule out polynomial-time algorithms that we already know to exist.
So to summarize, there are problems like primality testing and pairing off roommates, for which computer scientists (often after decades of work) have been able to devise polynomial-time algorithms. But then there are other problems, like proving theorems, for which we don't know of any algorithm fundamentally better than brute-force search. But is that all we can say -- that we have a bunch of these NP problems, and some of them seem easy and others seem hard?
As it turns out, we can say something much more interesting than that. We can say that almost all of the "hard" problems are the same "hard" problem in different guises -- in the sense that, if we had a polynomial-time algorithm for any one of them, then we'd also have polynomial-time algorithms for all the rest. This is the upshot of the theory of NP-completeness, which was created in the early 1970's by Cook, Karp, and Levin.
The way it goes is, we define a problem B to be "NP-hard" if any NP problem can be efficiently reduced to B. What the hell does that mean? It means that, if we had an oracle to immediately solve problem B, then we could solve any NP problem in polynomial time.
That gives one notion of reduction, which is called Cook reduction. There's also a weaker notion of reduction, which is called Karp reduction. In a Karp reduction from problem A to problem B, we insist that there should be a polynomial-time algorithm that transforms any instance of A to an instance of B having the same answer.
What's the difference between Cook and Karp?
Right: with a Cook reduction, in solving problem A we get to call the oracle for problem B more than once. We can even call the oracle adaptively -- that is, in ways that depend on the outcomes of the previous calls. A Karp reduction is weaker in that we don't allow ourselves these liberties. Perhaps surprisingly, almost every reduction we know of is a Karp reduction -- the full power of Cook reductions is rarely needed in practice.
Now, we say a problem is NP-complete if it's both NP-hard and in NP. In other words, NP-complete problems are the "hardest" problems in NP: the problems that single-handedly capture the difficulty of every other NP problem. As a first question, is it obvious that NP-complete problems even exist?
I claim that it is obvious. Why?
Well, consider the following problem, called DUH: we're given a polynomial-time Turing machine M, and we want to know if there exists an nk-bit input string that causes M to accept. I claim that any instance of any NP problem can be converted, in polynomial time, into a DUH instance having the same answer. Why? Well, DUH! Because that's what it means for a problem to be in NP!
The discovery of Cook, Karp, and Levin was not that there exist NP-complete problems -- that's obvious -- but rather that many natural problems are NP-complete.
The king of these natural NP-complete problems is called 3-Satisfiability, or 3SAT. (How do I know it's the king? Because it appeared on the TV show "NUMB3RS.") Here we're given n Boolean variables, x1,...,xn, as well as a set of logical constraints called clauses that relate at most three variables each:
x2 or x5 or not(x6)
not(x2) or x4
not(x4) or not(x5) or x6
...
Then the question is whether there's some way to set the variables x1,...,xn to TRUE or FALSE, in such a way that every clause is "satisfied" (that is, every clause evaluates to TRUE).
It's obvious that 3SAT is in NP. Why? Right: Because if someone gives you a setting of x1,...,xn that works, it's easy to check that it works!
Our goal is to prove that 3SAT is NP-complete. What will that take? Well, we need to show that if we had an oracle for 3SAT, then we could use it to solve not only 3SAT in polynomial time, but any NP problem whatsoever. That seems like a tall order! Yet in retrospect, you'll see that it's almost a triviality.
The proof has two steps. Step 1 is to show that, if we could solve 3SAT, then we could solve a more "general" problem called CircuitSAT. Step 2 is to show that, if we could solve CircuitSAT, then we could solve any NP problem.
In CircuitSAT, we're given a Boolean circuit and ... wait, do we have any engineers in this class? We do? Then listen up: in computer science, a "circuit" never has loops! Nor does it have resistors or diodes or anything weird like that. For us, a circuit is just an object where you start with n Boolean variables x1,...,xn, and then you can repeatedly define a new variable that's equal to the AND, OR, or NOT of variables that you've previously defined. Like so:
xn+1 := x3 or xn
xn+2 := not(xn+1)
xn+3 := x1 and xn+2
...
We designate the last variable in the list as the circuit's "output." Then the goal, in CircuitSAT, is to decide whether there's a setting of x1,...,xn such that the output is TRUE.
I claim that, if we could solve 3SAT, then we could also solve CircuitSAT. Why?
Well, all we need to do is notice that every CircuitSAT instance is really a 3SAT instance in disguise! Every time we compute an AND, OR, or NOT, we're relating one new variable to one or two old variables. And any such relationship can be expressed by a set of clauses involving at most three variables each. So for example,
xn+1 := x3 or xn
becomes
xn+1 or not(x3)
xn+1 or not(xn)
not(xn+1) or x3 or xn.
So, that was Step 1. Step 2 is to show that, if we can solve CircuitSAT, then we can solve any NP problem.
Alright, so consider some instance of an NP problem. Then by the definition of NP, there's a polynomial-time Turing machine M such that the answer is "yes" if and only if there's a polynomial-size witness string w that causes M to accept.
Now, given this Turing machine M, our goal is to create a circuit that "mimics" M. In other words, we want there to exist a setting of the circuit's input variables that makes it evaluate to TRUE, if and only if there exists a string w that causes M to accept.
How do we achieve that? Simple: by defining a whole shitload of variables! We'll have a variable that equals TRUE if and only if the 37th bit of M's tape is set to '1' at the 42nd time step. We'll have another variable that equals TRUE if and only if the 14th bit is set to '1' at the 52nd time step. We'll have another variable that equals TRUE if and only if M's tape head is in the 15th internal state and the 74th tape position at the 33rd time step. Well, you get the idea.
Then, having written down this shitload of variables, we write down a shitload of logical relations between them. If the 17th bit of the tape is '0' at the 22nd time step, and the tape head is nowhere near the 17th bit at that time, then the 17th bit will still be '0' at the 23rd time step. If the tape head is in internal state 5 at the 44th time step, and it's reading a '1' at that time step, and internal state 5 transitions to internal state 7 on reading a '1', then the tape head will be in internal state 7 at the 45th time step. And so on, and so on. The only variables that are left unrestricted are the ones that constitute the string w at the first time step.
The key point is that, while this is a very large shitload of variables and relations, it's still only a polynomial shitload. We therefore get a polynomially-large CircuitSAT instance, which is satisfiable if and only if there exists a w that causes M to accept.
[Devin, our resident engineer, is laughing hysterically]
Scott: What?
Devin: Nothing! That is just so simple in the basic idea, and so messy in the details...
Scott: Exactly. You've understood completely.
We've just proved the celebrated Cook-Levin Theorem: that 3SAT is NP-complete. This theorem can be thought of as the "initial infection" of the NP-completeness virus. Since then, the virus has spread to thousands of other problems. What I mean is this: if you want to prove that your favorite problem is NP-complete, all you have to do is prove it's as hard as some other problem that's already been proved NP-complete. (Well, you also have to prove that it's in NP, but that's usually trivial.) So there's a rich-get-richer effect: the more problems that have already been proved NP-complete, the easier it is to induct a new problem into the club. Indeed, proving problems NP-complete had become so routine by the 80's or 90's, and people had gotten so good at it, that (with rare exceptions) STOC and FOCS stopped publishing yet more NP-completeness proofs.
I'll just give you a tiny sampling of some of the earliest problems that were proved NP-complete:
* Map Colorability: Given a map, can you color every country red, green, or blue, in such a way that no two neighboring countries are colored the same? (Interestingly, if you're only allowed to use two colors, then it's easy to decide whether or not such a coloring is possible -- why? On the other hand, if you're allowed four colors, then it always is possible, at least for maps drawn in the plane -- that's the famous Four-Color Theorem. So then the problem is again easy. Only with three colors is the problem NP-complete.)
* Clique: Given a set of N high-school students, together with which ones will sit at a cafeteria table with which other ones, is there a "clique" of N/3 students who will all sit at a table with each other?
* Packing: Given a set of boxes of specified dimensions, can you fit them into the trunk of your car?
Etc., etc., etc.
To reiterate: although these problems might look unrelated, they're actually the same problem in different costumes. if any one of them has an efficient solution, then all of them do, and P=NP. If any one of them doesn't have an efficient solution, then none of them do, and P≠NP. To prove P=NP, it's enough to show that some NP-complete problem (no matter which one) has an efficient solution. To prove P≠NP, it's enough to show that some NP-complete problem has no efficient solution. One for all and all for one.
So, there are the P problems, and then there are the NP-complete problems. Is there anything in between? (You should be used to this sort of "intermediate" question by now -- we saw it both in set theory and in computability theory!)
If P=NP, then NP-complete problems are P problems, so obviously the answer is no.
But what if P≠NP? In that case, a beautiful result called Ladner's Theorem says that there must be "intermediate" problems between P and NP-complete: in other words, problems that are in NP, but neither NP-complete nor solvable in polynomial time.
How would we create such an intermediate problem? Well, I'll give you the idea. The first step is to define an extremely slow-growing function t. Then given a 3SAT instance F of size n, the problem will be to decide whether F is satisfiable and t(n) is odd. In other words: if t(n) is odd then solve the 3SAT problem, while if t(n) is even then always output "no."
If you think about what we're doing, we're alternating long stretches of an NP-complete problem with long stretches of nothing! Intuitively, each stretch of 3SAT should kill off another polynomial-time algorithm for our problem, where we use the assumption that P≠NP. Likewise, each stretch of nothing should kill off another NP-completeness reduction, where we again use the assumption that P≠NP. This ensures that the problem is neither in P nor NP-complete. The main technical trick is to make the stretches get longer at an exponential rate. That way, given an input of size n, we can simulate the whole iterative process up to n in time polynomial in n. That ensures that the problem is still in NP.
Besides P and NP, another major complexity class is coNP: the "complement" of NP. A problem is in coNP if a "no" answer can be checked in polynomial-time. To every NP-complete problem, there's a corresponding coNP-complete problem. We've got unsatisfiability, map non-colorability, etc.
Now, why would anyone bother to define such a stupid thing? Because then we can ask a new question: does NP equal coNP? In other words: if a Boolean formula is unsatisfiable, is there at least a short proof that it's unsatisfiable, even if finding the proof would take exponential time? Once again, the answer is that we don't know.
Certainly if P=NP then NP=coNP. (Why?) On the other hand, the other direction isn't known: it could be that P≠NP but still NP=coNP. So if proving P≠NP is too easy, you can instead try to prove NP≠coNP!
This seems like a good time to mention a special complexity class, a class we quantum computing people know and love: NP intersect coNP.
This is the class for which either a yes answer or a no answer has an efficiently checkable proof. As an example, consider the problem of factoring an integer into primes. Over the course of my life, I must've met at least two dozen people who "knew" that factoring is NP-complete, and therefore that Shor's algorithm -- since it lets us factor on a quantum computer -- also lets us solve NP-complete problems on a quantum computer. Often these people were supremely confident of their "knowledge."
But if you think about it for two seconds, you'll realize that factoring has profound differences from the known NP-complete problems. If I give you a Boolean formula, it might have zero satisfying truth assignments, it might have one, or it might have 10 trillion. You simply don't know, a priori. But if I give you a 5000-digit integer, you probably won't know its factorization into primes, but you'll know that has one and only one factorization. (I think some guy named Euclid proved that a while ago.) This already tells us that factoring is somehow "special": that, unlike what we believe about the NP-complete problems, factoring has some structure that algorithms could try to exploit. And indeed, algorithms do exploit it: we know of a classical algorithm called the Number Field Sieve, which factors an n-bit integer in roughly 2n^1/3 steps, compared to the ~2n/2 steps that would be needed for trying all possible divisors. (Why only ~2n/2 steps, instead of ~2n?) And of course, we know of Shor's algorithm, which factors an n-bit integer in ~n2 steps on a quantum computer: that is, in quantum polynomial time. Contrary to popular belief, we don't know of a quantum algorithm to solve NP-complete problems in polynomial time. If such an algorithm existed, it would have to be dramatically different from Shor's algorithm.
But can we pinpoint just how factoring differs from the known NP-complete problems, in terms of complexity theory? Yes, we can. First of all, in order to make factoring a decision (yes-or-no) problem, we need to ask something like this: given a positive integer N, does N have a prime factor whose last digit is 7? I claim that this problem is not merely in NP, but in NP intersect coNP. Why? Well, suppose someone gives you the prime factorization of N. There's only one of them. So if there is a prime factor whose last digit is 7, then you can verify that, and if there's no prime factor whose last digit is 7, then you can also verify that.
You might say, "but how do I know that I really was given the prime factorization? Sure, if someone gives me a bunch of numbers, I can check that they multiply to N, but how do I know they're prime?" For this you'll have to take on faith something that I told you earlier: that if you just want to know whether a number is prime or composite, and not what its factors are, then you can do that in polynomial time. OK, so if you accept that, then the factoring problem is in NP intersect coNP.
From this we can conclude that, if factoring were NP-complete, then NP would equal coNP. (Why?) Since we don't believe NP=coNP, this gives us a strong indication (though not a proof) that, all those people I told you about notwithstanding, factoring is not NP-complete. If we accept that, then only two possibilities remain: either factoring is in P, or else factoring is one of those "intermediate" problems whose existence is guaranteed by Ladner's Theorem. Most of us incline toward the latter possibility -- though not with as much conviction as we believe P≠NP.
Indeed, for all we know, it could be the case that P = NP intersect coNP but still P≠NP. (This possibility would imply that NP≠coNP.) So, if proving P≠NP and NP≠coNP are both too easy for you, your next challenge can be to prove P ≠ NP intersect coNP!
If P, NP, and coNP aren't enough to rock your world, you can generalize these classes to a giant teetering mess that we computer scientists call the polynomial hierarchy.
Observe that you can put any NP problem instance into the form
Does there exist an n-bit string X such that A(X)=1?
Here A is a function computable in polynomial time.
Likewise, you can put any coNP problem into the form
Does A(X)=1 for every X?
But what happens if you throw in another quantifier, like so?
Does there exist an X such that for every Y, A(X,Y)=1?
For every X, does there exist a Y such that A(X,Y)=1?
Problems like these lead to two new complexity classes, which are called Σ2P and Π2P respectively. Π2P is the "complement" of Σ2P, in the same sense that coNP is the complement of NP. We can also throw in a third quantifier:
Does there exist an X such that for every Y, there exists a Z such that A(X,Y,Z)=1?
For every X, does there exist a Y such that for every Z, A(X,Y,Z)=1?
This gives us Σ3P and Π3P respectively. It should be obvious how to generalize this to ΣkP and ΠkP for any larger k. (As a side note, when k=1, we get Σ1P=NP and Π1P=coNP. Why?) Then taking the union of these classes over all positive integers k gives us the polynomial hierarchy PH.
The polynomial hierarchy really is a substantial generalization of NP and coNP -- in the sense that, even if we had an oracle for NP-complete problems, it's not at all clear how we could use it to solve (say) Σ2P problems. On the other hand, just to complicate matters further, I claim that if P=NP, then the whole polynomial hierarchy would collapse down to P! Why?
Right: if P=NP, then we could take our algorithm for solving NP-complete problems in polynomial time, and modify it to call itself as a subroutine. And that would let us "flatten PH like a steamroller": first simulating NP and coNP, then Σ2P and Π2P, and so on through the entire hierarchy.
Likewise, it's not hard to prove that if NP=coNP, then the entire polynomial hierarchy collapses down to NP (or in other words, to coNP). If Σ2P=Π2P, then the entire polynomial hierarchy collapses down to Σ2P, and so on. If you think about it, this gives us a whole infinite sequence of generalizations of the P≠NP conjecture, each one "harder" to prove than the last. Why do we care about these generalizations? Because often, we're trying to study conjecture BLAH, and we can't prove that BLAH is true, and we can't even prove that if BLAH were false then P would equal NP. But -- and here's the punchline -- we can prove that if BLAH were false, then the polynomial hierarchy would collapse to the second or the third level. And this gives us some sort of evidence that BLAH is true.
Welcome to complexity theory!
Since I talked about how lots of problems have non-obvious polynomial-time algorithms, I thought I should give you at least one example. So, let's do one of the simplest and most elegant in all of computer science -- the so-called Stable Marriage Problem. Have you seen this before? You haven't?
Alright, so we have N men and N women. Our goal is to marry them off. We assume for simplicity that they're all straight. (Marrying off gays and lesbians is technically harder, though also solvable in polynomial time!) We also assume, for simplicity and with much loss of generality, that everyone would rather be married than single.
So, each man ranks the women, in order from his first to last choice. Each woman likewise ranks the men. There are no ties.
Obviously, not every man can marry his first-choice woman, and not every woman can marry her first-choice man. Life sucks that way.
So let's try for something weaker. Given a way of pairing off the men and women, say that it's stable if no man and woman who aren't married to each other both prefer each other to their spouses. In other words, you might despise your husband, but no man who you like better than him likes you better than his wife, so you have no incentive to leave. This is the, um, desirable property that we call "stability."
Now, given the men's and women's stated preferences, our goal as matchmakers is to find a stable way of pairing them off. Matchmaker, matchmaker, make me a match, find me a find, catch me a catch, etc.
First obvious question: does there always exist a stable pairing of men and women? What do you think? Yes? No? As it turns out, the answer is yes, but the easiest way to prove it is just to give an algorithm for finding the pairing!
So let's concentrate on the question of how to find a pairing. In total, there are N! ways of pairing off men with women. For the soon-to-be-newlyweds' sake, hopefully we won't have to search through all of them.
Fortunately we won't. In the early 1960's, Gale and Shapley invented a polynomial-time -- in fact linear-time -- algorithm to solve this problem. And the beautiful thing about this algorithm is, it's exactly what you'd come up with from reading a Victorian romance novel. Later they found out that the same algorithm had been in use since the 1950's -- not to pair off men with women, but to pair off medical-school students with hospitals to do their residencies in. Indeed, hospitals and medical schools are still using a version of the algorithm today.
But back to the men and women. If we want to pair them off by the Gale-Shapley algorithm, then as a first step, we need to break the symmetry between the sexes: which sex "proposes" to the other? This being the early 1960's, you can guess how that question was answered. The men propose to the women.
So, we loop through all the men. The first man proposes to his first-choice woman. She provisionally accepts him. Then the next man proposes to his first-choice woman. She provisionally accepts him, and so on. But what happens when a man proposes to a woman who's already provisionally accepted another man? She chooses the one she prefers, and boots the other one out! Then, the next time we come around to that man in our loop over the men, he'll propose to his second-choice woman. And if she rejects him, then the next time we come around to him he'll propose to his third-choice woman. And so on, until everyone is married off. Pretty simple, huh?
First question: why does this algorithm terminate in linear time?
Right: because each man proposes to a given woman at most once. So the total number of proposals is at most N2, which is just the amount of memory we need to write down the preference lists in the first place.
Second question: when the algorithm does terminate, why is everyone married off?
Right: because if they weren't, then there'd be some woman who'd never been proposed to, and some man who'd never proposed to her. But this is impossible. Eventually the man no one else wants will cave in, and propose to the woman no one else wants.
Third question: why is the pairing produced by this algorithm a stable one?
Right: because if it weren't, then there'd be one married couple (say Bob and Alice), and another married couple (say Charlie and Eve), such that Bob and Eve both prefer each other to their spouses. But in that case, Bob would've proposed to Eve before proposing to Alice. And if Charlie also proposed to Eve, then Eve would've made clear at the time that she preferred Bob. And this gives a contradiction.
In particular we've shown, as promised, that there exists a stable pairing: namely the pairing found by the Gale-Shapley algorithm.
Problem Set
1. We saw that 3SAT is NP-complete. By contrast, it turns out that 2SAT -- the version where we only allow two variables per clause -- is solvable in polynomial time. Explain why.
2. Recall that EXP is the class of problems solvable in exponential time. One can also define NEXP: the class of problems for which a "yes" answer can be verified in exponential time. In other words, NEXP is to EXP as NP is to P. Now, we don't know if P=NP, and we also don't know if EXP=NEXP. But we do know that if P=NP, then EXP=NEXP. Why?
3. Show that P doesn't equal SPACE(n) (the set of problems solvable in linear space). Hint: You don't need to prove that P is not in SPACE(n), or that SPACE(n) is not in P -- only that one or the other is true!
4. Show that if P=NP, then there's a polynomial-time algorithm not only to decide whether a Boolean formula has a satisfying assignment, but to find such an assignment whenever one exists.
5. [Extra credit] Give an explicit algorithm that finds a satisfying assignment whenever one exists, and that runs in polynomial time assuming P=NP. (If there's no satisfying assignment, your algorithm can behave arbitrarily.) In other words, give an algorithm for problem 4 that you could implement and run right now -- without invoking any subroutine that you've assumed to exist but can't actually describe.
| 1.5 | | Randomness | 1.5 | | Randomnessn/a | 1.5 | | Crypton/a | 1.5 | | Quantumn/a | 1.5 | | Penrosen/a | 1.5 | | Hidden Variablesn/a | 1.5 | | Arthur, Merlin, and Friendsn/a | 1.5 | | How Big Are Quantum States?n/a | 1.5 | | Skepticism of Quantum Computingn/a | 1.5 | | Learningn/a | 1.5 | | Interactive Proofsn/a | 1.5 | | Fun With Anthropic Principlesn/a | 1.5 | | Free Willn/a | 1.5 | | Time Traveln/a | 1.5 | | Grand Summationn/a | 1.5 | | Ask me anythingn/a | 1.5 | |
|