The Evolution of Dembski’s Mathematics


In light of William Dembski's recent technical paper (the first of seven!), and the protracted discussion that is taking place in the comments section of this recent post from PZ Myers, this may be a good time to review his various mathematical arguments in favor of ID.

William Dembski began his anti-evolution career by introducing the notion of complex specified information (CSI). CSI was said to be something that might be possessed by an object or event, and such possession was a reliable indicator that intelligent design was implicated in the causal history of the possessor. When this is coupled with the assertion that various biological systems exhibit CSI, we have an argument against evolution. In the remainder of this post I will refer simply to “objects” rather than “objects and events”.

An object was said to be complex if the probability of its occurrence with respect to known natural laws was below a certain, absolute, lower bound. Specification then indicated that the object matched some pattern describable independent of the object in question. Dembski recognized the difficulty in distinguishing the genuine patterns from the “fabrications” (his term), and offered some elaborate statistical arguments that were supposed to help us avoid this dilemma.

Dembski's arguments at this point used very little mathematics. There was much mathematical formalism, certainly, but substantively there was only some elementary probability theory and statistics.

Every aspect of this argument is mistaken, as shown by numerous philosophers and scientists since Dembski first presented his ideas. Dembski's claim that CSI is both well-defined and reliably indicates design is highly dubious, but here I will only address the problems with applying his ideas to biology.

To apply Dembski's method to a particular biological system we must first do a probability calculation to show that the system is complex. As a practical matter it is impossible to carry out such a calculation (at least, not if you want your result to have some connection to reality). There are too many variables to consider. Furthermore, it is not clear what you are finding the probability of, exactly. For example, if you are considering the probability of evolving the flagellum possessed by E. coli, are you interested in obtaining that precise configuration of proteins, any configuration that would make a workable flagellum, or any motility system at all?
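
To see how much the answer depends on which event is being measured, here is a minimal toy sketch. It is not a model of flagellar evolution: the 100-bit strings, the uniform distribution, and the mismatch thresholds are all assumptions invented purely for illustration. It compares the probability of hitting one exact target with the probability of landing anywhere in progressively larger classes of "close enough" outcomes.

```python
from fractions import Fraction
from math import comb


def prob_within_hamming(n: int, r: int) -> Fraction:
    """Exact probability that a uniformly random n-bit string differs
    from one fixed target string in at most r positions."""
    # There are comb(n, d) strings at exactly d mismatches from the target.
    favourable = sum(comb(n, d) for d in range(r + 1))
    return Fraction(favourable, 2 ** n)


# Hypothetical thresholds standing in for "exact configuration",
# "any workable variant", and "anything vaguely similar".
n = 100
for label, r in [("exact target", 0),
                 ("at most 10 mismatches", 10),
                 ("at most 30 mismatches", 30)]:
    print(f"{label:>22}: {float(prob_within_hamming(n, r)):.3e}")
```

Even in this toy setting the three probabilities differ by many orders of magnitude, so the choice of event has to be justified before any resulting number means anything.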

Dembski's writing is conflicted on the subject of whether the system's specification comes before or after the determination that the system is complex. For example, in his infamous explanatory filter it is quite explicit that the determination of the system's complexity comes before the determination that it is specified. In other places, however, specification comes first.

Leaving that aside, the whole notion of specificity is problematic. In arguing that the biological systems he considers are specified, Dembski invariably relies on the system's function. For example, his specification of the bacterial flagellum consists, in its entirety, of the observation that the flagellum is rather similar to an outboard motor. If this is different from looking at a cumulus cloud and seeing a dragon, it is not clear how.

More generally, to distinguish the genuine patterns from the fabrications, we need a fair amount of background information. For example, when we look at Mt. Rushmore and immediately conclude that it was designed, we do so in part because we know what mountains look like when people don't carve faces into them. It is only when compared to those other mountains that we conclude that Mt. Rushmore possesses a design-suggesting pattern. Similarly, one thousand heads in one thousand tosses of a coin suggests skullduggery because we have all flipped coins before and we know that usually we get some random mixture of heads and tails.
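
The arithmetic behind the coin example is easy to check. The sketch below (the function names are illustrative only) computes the exact probability of one thousand heads in one thousand fair tosses and, for comparison, the probability that the number of heads falls in a middling range. Note that any one particular sequence of heads and tails, however random-looking, has exactly the same probability as the all-heads sequence; what our experience with coins supplies is the expectation of a rough mixture.

```python
from fractions import Fraction
from math import comb, log10


def prob_all_heads(n: int) -> Fraction:
    """Exact probability that n fair tosses all come up heads.
    Every other specific sequence of length n has this same probability."""
    return Fraction(1, 2 ** n)


def prob_heads_between(n: int, lo: int, hi: int) -> Fraction:
    """Exact probability that the number of heads in n fair tosses
    lies in the inclusive range [lo, hi]."""
    return Fraction(sum(comb(n, k) for k in range(lo, hi + 1)), 2 ** n)


# The all-heads probability is too small for a float, so report its log.
print(f"log10 P(1000 heads in 1000 tosses) = {-1000 * log10(2):.1f}")
print(f"P(450 to 550 heads in 1000 tosses) = "
      f"{float(prob_heads_between(1000, 450, 550)):.4f}")
```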

That experience is precisely what is lacking in considering biological systems. Consequently, there is no reliable way of distinguishing the genuine patterns from the ones imposed by human imagination. Drawing a vague analogy between the system and some human invention just doesn't cut it.

Dembski presented the idea of CSI in The Design Inference (1998). He described its connection to biology in Intelligent Design (1999). My review of Intelligent Design, originally published in Skeptic magazine, is available here. And that's where things stood until 2002 and the publication of No Free Lunch. Here Dembski introduced two new mathematical arguments.

The first was a specific example: the aforementioned flagellum. Dembski attempted to show rigorously that the flagellum was both complex and specified. His calculation was based on a mountain of false assumptions, one of which merits particular discussion.

Dembski made explicit use of Michael Behe's ideas about irreducible complexity (IC) to rule out the idea that the flagellum evolved gradually in the Darwinian manner. Behe's argument was that if a system is composed of several well-matched, indispensable parts, then it is not plausible to suggest that it evolved gradually by natural selection. The idea was that a system that has no function until all of its parts are in place could not be favored by selection.

In replying to Behe's arguments scientists have made two main points: (1) Behe is wrong as a matter of logic, since the fact that a system is irreducibly complex in the present tells us nothing about possible functional precursors in the past. (2) For certain specific examples used by Behe, such as the blood clotting cascade and the immune system, actually quite a lot is known about their evolution.

But no one responded to Behe by saying that IC systems, indeed, could not evolve gradually, but we need a dubious, back-of-the-envelope probability calculation to convince us that such systems were actually designed. In other words, in basing his calculation on Behe's ideas, Dembski established that his musings about CSI had no significance beyond what Behe had already argued. If Behe is wrong about IC, then Dembski's calculation is obviously based on false assumptions. If Behe is right about IC, then Dembski's calculation is unnecessary.

So Dembski was completely wasting everyone's time when he told us that CSI was relevant to the evolution/creation debate.

The second contribution of No Free Lunch was his invocation of the No Free Lunch Theorems. This represented a ratcheting up of the level of mathematical sophistication needed to understand Dembski's arguments. The No Free Lunch theorems are genuine results in optimization theory, and understanding what they say requires a certain amount of work.

In short, the NFL theorems assert that the performance of any given search algorithm, when averaged over all possible fitness landscapes that it might confront, is no better than blind search. Dembski then argued that the process of sifting random genetic variations through the sieve of natural selection can be viewed as an algorithm for searching the space of possible genotypes. Viewed in this way, the algorithm clearly outperforms blind search. It follows from the NFL theorems that the algorithm must have been rigged with some additional information. But from where did this information come?

Understanding the NFL theorems may require some work, but the basic flaw in Dembski's argument is easy, even trivial, to spot. It is that the NFL theorems only tell us something about the average performance of a search algorithm over all possible fitness landscapes. They tell us nothing about the performance of an algorithm on any given fitness landscape. End of discussion.
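
The difference between average and per-landscape performance can be seen in a toy enumeration. What follows is only a sketch under assumptions of my choosing, not Wolpert and Macready's formal setup: a search space of five points, fitness values in {0, 1, 2}, two deterministic searchers that never revisit a point (the names fixed_order and adaptive are hypothetical), and "best value seen after three evaluations" as the performance measure.

```python
from fractions import Fraction
from itertools import product


def fixed_order(history, n):
    """Visit points 0, 1, 2, ... regardless of the values observed."""
    visited = {x for x, _ in history}
    return next(x for x in range(n) if x not in visited)


def adaptive(history, n):
    """Start in the middle, then move to the unvisited point nearest
    to the best point seen so far (ties go to the smaller index)."""
    if not history:
        return n // 2
    visited = {x for x, _ in history}
    best_x = max(history, key=lambda xy: xy[1])[0]
    return min((x for x in range(n) if x not in visited),
               key=lambda x: (abs(x - best_x), x))


def best_after(algorithm, f, m):
    """Best fitness found after m evaluations of landscape f (a tuple)."""
    history = []
    for _ in range(m):
        x = algorithm(history, len(f))
        history.append((x, f[x]))
    return max(y for _, y in history)


def average_over_all_landscapes(algorithm, n, values, m):
    """Exact mean of best_after over every possible landscape."""
    landscapes = list(product(values, repeat=n))
    total = sum(best_after(algorithm, f, m) for f in landscapes)
    return Fraction(total, len(landscapes))


n, values, m = 5, (0, 1, 2), 3
print("average, fixed_order:", average_over_all_landscapes(fixed_order, n, values, m))
print("average, adaptive:   ", average_over_all_landscapes(adaptive, n, values, m))

# On individual landscapes the two searchers can differ in either direction.
for f in [(0, 0, 1, 2, 0), (2, 0, 1, 0, 0)]:
    print(f, "fixed_order:", best_after(fixed_order, f, m),
          "adaptive:", best_after(adaptive, f, m))
```

Averaged over all 243 landscapes the two searchers tie exactly, yet each one beats the other on a particular landscape. The averaged statement therefore says nothing about how a given algorithm fares on the one landscape it actually faces.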

There are further difficulties, such as the fact that in evolution the fitness landscape coevolves with the organisms residing upon it, but there is no need to get into that here.

My review of No Free Lunch was originally published in the journal Evolution, and is available here.

The technical paper under discussion in PZ Myers' post represents Dembski's first new mathematical argument since No Free Lunch. As pointed out in that post, Cosma Shalizi has done an excellent job of summarizing what is wrong with the paper. Dembski has already responded.

Before making a few remarks about Dembski's reply, let me first say a word or two about qualifications. Prior to this paper, none of Dembski's arguments required any special training in mathematics to understand and refute. CSI was based on nothing more than elementary probability theory. Reading the original Wolpert and Macready papers in which the NFL theorems were presented certainly requires some expertise, but those theorems can be easily explained in non-technical language. Once that is done, the flaw in Dembski's argument should be clear to anyone.

The present paper is different. Understanding its content requires a fair grasp of measure theory, which is a branch of mathematics most people have never heard of, much less know anything about. So to comment effectively on the paper requires considerable mathematical training.

I am a professional mathematician. My research is in combinatorics and number theory. The last time I thought about measure theory was when I took my qualifying exam in analysis as a first-year graduate student. Nonetheless, this was adequate preparation for reading Dembski's paper. You don't need a specialist in measure theory or information theory to understand the actual mathematics in the paper.

What you do need a specialist for is determining how Dembski's claims fit into the bigger picture of current research in the field. As a combinatorialist, I have no idea what sorts of problems information theorists are working on these days (and I don't think they know much about what is going on in my own beloved field of algebraic graph theory). Happily, Cosma Shalizi is an expert in this field, and he did an admirable job of showing that Dembski's paper is mostly devoid of content.

The only thing I would add to Shalizi's comments is the observation that the present paper has absolutely nothing to do with evolution or ID. No doubt that application is coming (six more papers, remember). When it does come, I suspect the problem will turn out to be that whatever probability spaces Dembski defines for the purpose of modelling evolution will turn out to be far too simplistic. Time will tell.

Now, a few words about Dembski's reply to Shalizi's post. At the end of his reply Dembski describes three main contributions of his paper. That is precisely the sort of thing you need an expert to assess, so I will leave that to Shalizi (or anyone else who cares to weigh in).

But the rest of Dembski's response is pure, disingenuous sleaze. Shalizi points out that Dembski's variational information is nothing more than a special case of a definition first proposed by Renyi four decades ago. Dembski replies that he never claimed his definition was new. And that is true. Nowhere in the paper does he claim that his definition is new. But when you consider that the motivation for and derivation of his definition, along with its elementary consequences, occupies nine pages of an eighteen-page paper (the last two of those pages consisting almost entirely of bibliography), you could be forgiven for thinking that Dembski was presenting something new. When you further consider that the paper's abstract lists as one of its main accomplishments the generalization of information to probability distributions...well, you get the idea.

What seems clear is that Dembski had never heard of Renyi's definition prior to writing the paper. If he had, he not only would have cited Renyi's work, he would have made a point of explaining how his work provides some new gloss on Renyi's ideas. Equally clear, as Shalizi points out, is that Dembski has little idea about many of the standard tools of the fields in which he claims to be an expert.

A standard part of doing research is to scour the literature looking for anticipations of your result. You do not first submit the paper, post a draft at your website, then ask for bloggers to look for flaws. Dembski says he thought his result might be old. Fine. Don't submit the paper until that question is resolved.

He then goes on to explain that the present paper is intended as the first chapter of a monograph, and as such it was simply laying down foundational ideas. But the fact remains that Dembski thinks this paper is sufficiently strong on its own to send it to an actual journal. When you submit a paper and the referee points out that it is almost completely devoid of content, you don't reply by saying, “Devoid it may be, but wait until you see the six other papers that are coming down the pipeline.”

Dembski next defends himself from the charge that his paper contains no new mathematics by pointing out that Complexity is not a mathematics journal. This is just sophistry, but at least Dembski is admitting that there is no new mathematics.

The next step is to criticize research mathematicians for being concerned with generalizations of prior work for its own sake, without any regard for possible applications. That's silly, but even sillier is his implication that Shalizi is incapable of understanding that Dembski is interested in applications, as opposed to the mathematical apparatus itself. That's silly partly because Shalizi is a physicist, but also because, as I mentioned, the bulk of the paper is given over to mathematical derivations, not applications. The paper contains nothing but a vague hint of how Dembski intends to apply his ideas.

If Dembski really saw his contribution, as he now claims, as providing a new application of old ideas, then he would have discussed the older applications of those ideas in his paper. In other words, he would have said something like “In the past, Renyi's information measure has found applications X, Y and Z. In the current paper I propose the new application A.” He did not discuss any of the applications Shalizi mentioned because he was not familiar with any of them, and the reason he was not familiar with them is that he is not the expert he claims to be.

14 Comments

Is there anything novel in the paper (I know it’s not a real paper, but just to be simple) at all? Just curious. I have yet to hear an actual theory from the ID people.

I’ve come up with an alternative to “intelligent design” which remedies shortcomings. My theory is “miraculous rearrangement”. A Google search suggests that the concept is new to biology although it has been applied to geology. My theory is more general than ID and doesn’t require a “mysterious designer”. It can explain saltations just as well as ID. The theory is in early stages of development. Perhaps ID supporters can help me crystallize the relationship between ID and “miraculous rearrangement”. In particular I’m interested in methods for distinguishing which of the two hypotheses best describes reality.

In regards to Mt. Rushmore- I maintain that a better example is Vermont’s (late) Old Man in the Mountain.

Was that designed or not?

How do we know that some native american did not improve an odd outcropping to look like his Mother-In-Law?

“I maintain that a better example is Vermont’s (late) Old Man in the Mountain.

Was that designed or not?”

Why focus on the late Old Man in the Mountain?

The new one is just as likely designed by an incredibly powerful designer who works in mysterious ways as the old one, no? The old one was intended by the Great Designer to show us what old humans would look like in profile. The new design is intended to show us what the next dominant race of beings will look like, or maybe to show us what the designer himself looks like. Right?

Prove me wrong.

Jason,

For us mere mortals, what is a Kantorovich-Wasserstein metric?

MB

Um, Vermont’s “Old Man” had to be designed - since it exists only in some imaginations.

(Of course, the purported false positive is really New Hampshire’s icon, immortalized on its license plate.)

To Michael Buratovich: Both L. Kantorovich and L. Wasserstein are (were?) Soviet mathematicians, Kantorovich quite famous, Wasserstein less famous. The metric in question was introduced by Kantorovich in 1942 in a short article in Doklady Akademii Nauk (in Russian). Wasserstein’s further elaboration came in 1969, also in Russian. I wouldn’t even start an attempt to explain it in a few words to those without a special background (more so because I am far from being versed in that matter well enough beyond a very general idea). It has to do with a bunch of other non-trivial concepts like weak topology, etc. There is, as far as I know, a reasonably good explication of it in English in a book by I. Vaida, Theory of Statistical Inference and Information, Kluwer 1989. Maybe somebody else will endeavor to explain it in detail and in simple terms? Anyway, all Dembski’s math seems to be utterly irrelevant to ID.

A nit:

Wasn’t the old man in the mountain in New Hampshire?

Antonio

Perakh Wrote:

Both L. Kantorovich and L. Wasserstein are (were?) Soviet mathematicians, Kantorovich quite famous, Wasserstein less famous.

Kantorovich won the Nobel prize in Economics; he was the advisor of one of my mentors at UCSB, Dr. Rachev, who is very accomplished in his own right (in fact, he makes the anti-Dembski contributors here look like academic bottom-feeders).

Maybe that's why so many seem to appreciate your presence and contributions here: to make us ‘bottom feeders’ look better :-)

Arrgh! Yes the OMITM was in New Hampshire. I knew it was on the license plate and did not think that “Live Free or Die” was there too, so I put it in Vermont.

Sorry about that.

Michael Buratovich Wrote:

For us mere mortals, what is a Kantorovich-Wasserstein metric?

1. Informally, a metric is just a way to measure the “distance” between two points in a set. There’s basically no end to the sets mathematicians want to have metrics (ways to measure distance) for.

2. Suppose we have a set of points and a way to measure the distances between two such points.

3. Suppose, further, that we have a set of probability measures, each of which assigns probabilities to the points.

4. Now we want to take the way to measure the distance between points, and use it to construct a way to measure the distance between probability measures. The Kantorovich-Wasserstein metric is what you get if you do the construction in a particular way.
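
Steps 1 through 4 can be made concrete in the simplest setting. The "particular way" is usually described as the least total mass-times-distance needed to move one distribution onto the other. The sketch below assumes the points lie on the real line with the ordinary distance |x - y| and uses the order-1 version of the metric, which in that case reduces to the area between the two cumulative distribution functions. The function name and the example distributions are illustrative only.

```python
def wasserstein_1d(points, p, q):
    """Order-1 Kantorovich-Wasserstein distance between two probability
    vectors p and q placed on the same strictly increasing list of
    points on the real line."""
    if not (len(points) == len(p) == len(q)):
        raise ValueError("points, p and q must have the same length")
    if any(b <= a for a, b in zip(points, points[1:])):
        raise ValueError("points must be strictly increasing")
    if abs(sum(p) - 1) > 1e-9 or abs(sum(q) - 1) > 1e-9:
        raise ValueError("p and q must each sum to 1")

    total = 0.0
    cdf_gap = 0.0  # running difference between the two CDFs
    for i in range(len(points) - 1):
        cdf_gap += p[i] - q[i]
        # |cdf_gap| is the mass that must cross the gap to the next point.
        total += abs(cdf_gap) * (points[i + 1] - points[i])
    return total


points = [0, 1, 2]
# All mass at 0 versus all mass at 2: everything moves a distance of 2.
print(wasserstein_1d(points, [1, 0, 0], [0, 0, 1]))
# Shifting half the mass from 0 to 2 costs 0.5 * 2.
print(wasserstein_1d(points, [0.5, 0.5, 0], [0, 0.5, 0.5]))
```

Unlike measures that only compare probabilities point by point, this distance grows when the same mass has to travel farther, which is how the underlying metric on points enters the picture.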

Since O’Brien claims to be a student of Rachev, and the latter (as O’Brien asserts) was a student of Kantorovich, perhaps O’Brien will do all of us, lightweights and bottom feeders (these, I believe, were his characterizations) a great favor by explaining in a clear but mathematically rigorous manner the essence of Kantorovich-Wasserstein metric as distinct from other versions of metrics. Just for change - instead of indulging in spitting out snide remarks. We are waiting.

Mark Perakh Wrote:

Since O’Brien claims to be a student of Rachev, and the latter (as O’Brien asserts) was a student of Kantorovich, perhaps O’Brien will do all of us, lightweights and bottom feeders (these, I believe, were his characterizations) a great favor by explaining in a clear but mathematically rigorous manner the essence of Kantorovich-Wasserstein metric as distinct from other versions of metrics. Just for change - instead of indulging in spitting out snide remarks. We are waiting.

Once I get settled here at my new university and have the time I will give it a shot.

About this Entry

This page contains a single entry by Jason Rosenhouse published on August 12, 2004 12:39 AM.
