Icons of ID: Probability as information

In this episode of Icons of ID I will take a quick look at how the definition of information used by ID proponents is nothing more than an argument from probability. When ID proponents claim that chance and regularity cannot create complex specified information (CSI), all they are saying is that such pathways, as far as we know, are improbable. If a probable pathway is found, the measure of information, which is confusingly tied to probability, decreases. In fact, I argue that intelligent designers similarly cannot generate complex specified information, since the probability of an intelligent designer designing is close to 1.

Information and probability

Elsberry and Wilkins on CSI

Then again, the choice of the term “complex specified information” is itself extremely problematic, since for Dembski “complex” means neither “complicated” as in ordinary speech, nor “high Kolmogorov complexity” as understood by algorithmic information theorists. Instead, Dembski uses “complex” as a synonym for “improbable”.

So how does Dembski define information? In No Free Lunch he defines the measure of information in an event X of probability P(X) as

        I(X) = -log2 P(X)

So in other words, information is the negative log of the probability. But what probability is this? Others have shown how Dembski is unclear on this issue and often moves between uniform probabilities and actual probabilities, whichever seems better at the time. Dembski mentions in NFL that this measure of information is similar to Shannon information. In fact Shannon’s entropy is the average of Dembski’s information measure. This confusion about information and entropy is not limited to Dembski’s writings, however, so let’s look at Shannon entropy and information in more detail.

Claude Shannon: A mathematical theory of communication

In 1948 Shannon published his seminal paper “A Mathematical Theory of Communication”.

Shannon shows that the logarithm is the natural choice for expressing the concept of information. Entropy, a weighted measure of information, is basically the expected value of the information present. In other words:

If there are n messages x1, …, xn with probabilities p(x1), …, p(xn), then the Shannon entropy of this set is defined as:

        H = -SUM_i p(xi) log2 p(xi)

or in other words, the probability-weighted average of -log2 p(xi) over all messages.

Entropy is maximum when all values are equiprobable.
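
To make this concrete, here is a minimal Python sketch (not part of the original post; the example distributions are arbitrary) that computes the Shannon entropy of a few four-message sets and confirms that the uniform distribution has the largest entropy:

        import math

        def shannon_entropy(probs):
            """H = sum of -p * log2(p), skipping zero-probability outcomes."""
            return sum(-p * math.log2(p) for p in probs if p > 0)

        print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: the maximum for four messages (log2 4)
        print(shannon_entropy([0.70, 0.15, 0.10, 0.05]))  # about 1.32 bits
        print(shannon_entropy([1.0, 0.0, 0.0, 0.0]))      # 0.0 bits: a certain message carries no entropy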

Information is defined as

        IS(X) = Hmax - H(X)

Information in the Shannon sense is defined as the change in entropy before and after a particular event has taken place. Shannon information, also known as surprise, is any form of data which is not already known. In fact, when rare events occur, they generate a lot of information.
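
The “surprise” reading can be illustrated with another small Python sketch of my own (the probabilities are arbitrary choices, not from the post): the surprisal -log2 p grows as an outcome becomes rarer, and the probability-weighted average of the surprisals is exactly the Shannon entropy, which is the sense in which Dembski’s measure is an ingredient of entropy rather than of Shannon information:

        import math

        def surprisal(p):
            """Self-information -log2(p) of a single outcome with probability p, in bits."""
            return -math.log2(p)

        print(surprisal(0.5))    # 1 bit: a fair coin flip
        print(surprisal(0.01))   # about 6.64 bits: rare events are very 'surprising'
        print(surprisal(0.999))  # about 0.0014 bits: an almost certain event tells us little

        # The probability-weighted average of the surprisals is the Shannon entropy.
        probs = [0.5, 0.25, 0.125, 0.125]
        print(sum(p * surprisal(p) for p in probs))  # 1.75 bits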

Tom Schneider has some good resources on this topic.

So what we have learned so far is that Dembski’s information measure is nothing more than a measure of probability, and that it is similar to Shannon’s entropy measure, not to Shannon’s information measure.

But the choice of the term ‘information’ is quite unfortunate, since the quantity has more similarity to entropy than to Shannon information.

So let’s try to understand why Dembski argues that regularity and chance cannot create CSI. The answer is simple: If such processes have a high probability of being successful, their Dembski information measure will be low.

But the same problem applies to intelligent designers. Given a particular ‘intelligently designed’ event, its probability is high and thus its information is low. In other words, according to Dembski’s own measure, nothing can create CSI other than pure chance.

Not much of a useful tool, then, and the poor choice of the term ‘information’ has caused much unnecessary confusion, when in fact all Dembski was doing was repeating the age-old creationist argument that evolution or abiogenesis is improbable.

TalkOrigins has some good FAQs on what’s wrong with these arguments.

It seems that ID is not only theoretically flawed in its claims but also empirically flawed in that it has failed to be scientifically relevant. To these flaws we can add Dembski’s probability-based arguments, made all the more confusing by his use of the term ‘information’ for a quantity that behaves more like entropy. By Dembski’s own measure the intelligent designer is just as powerless to create CSI as natural processes are; or, put the other way around, an intelligent designer is exactly as capable of creating CSI as regular processes are.

Tom Schneider attracted Dembski’s ire for showing how the simple processes of variation and selection can actually increase the information in a genome.

Dembski’s complexity measures have many problems.

Surprisingly, various ID proponents, such as Fred Heeren, seem to have taken Dembski’s claims all too seriously.

Heeren quotes another unsupported, and in fact falsified, claim by Dembski:

William Dembski puts it this way: “Specified complexity powerfully extends the usual mathematical theory of information, known as Shannon information. Shannon’s theory dealt only with complexity, which can be due to random processes as well as to intelligent design. The addition of specification to complexity, however, is like a vise that grabs only things due to intelligence. Indeed, all the empirical evidence confirms that the only known cause of specified complexity is intelligence.”

Careless use of terminology, contradictory statements and examples, confusing usage of terms, and inflated claims all seem to have made the design inference ‘quite problematic’.

Branden Fitelson

Understanding what “regularity,” “chance,” and “design” mean in Dembski’s framework is made more difficult by some of his examples. Dembski discusses a teacher who finds that the essays submitted by two students are nearly identical (46). One hypothesis is that the students produced their work independently; a second hypothesis asserts that there was plagiarism. Dembski treats the hypothesis of independent origination as a Chance hypothesis and the plagiarism hypothesis as an instance of Design. Yet, both describe the matching papers as issuing from intelligent agency, as Dembski points out (47). Dembski says that context influences how a hypothesis gets classified (46). How context induces the classification that Dembski suggests remains a mystery.

Elsberry and Shallit have written an excellent paper, “Information Theory, Evolutionary Computation, and Dembski’s ‘Complex Specified Information’”. They address Dembski’s fallacious reliability claims, present the differences between rarefied design and ordinary design, and discuss the problems with apparent versus actual complex specified information (CSI).

Intelligent design advocate William Dembski has introduced a measure of information called “complex specified information”, or CSI. He claims that CSI is a reliable marker of design by intelligent agents. He puts forth a “Law of Conservation of Information” which states that chance and natural laws are incapable of generating CSI. In particular, CSI cannot be generated by evolutionary computation. Dembski asserts that CSI is present in intelligent causes and in the flagellum of Escherichia coli, and concludes that neither have natural explanations. In this paper we examine Dembski’s claims, point out significant errors in his reasoning, and conclude that there is no reason to accept his assertions.

31 Comments

Dembski has backed off his “Law of Conservation of Information”. Immunoglobulin genes (for example) are information-creating machines. Dembski recognizes this and now claims that natural systems can’t create “complex information”. The boundary between “simple information” and “complex information” is vague. The phrase “complex specified information” returns zero (of 13 million) articles in Science Citation Index.

I’m really torn by your post. On the one hand, I think the very idea that anyone spends time refuting people like Dembski is a non-productive expenditure of intellectual capital. On the other hand, if crap like this isn’t refuted, it grows a life of its own. On the third hand, the people who buy into this crap in the first place aren’t going to ever read or hear the argument against, and if they do, they’ll accept Dembski’s explanations, and it still takes on a life of its own.

Pre-‘urban legends’ die even harder than the new ones.

I agree, and I am struggling with these issues as well. I have found from past experience that, although having correct data available may not convince committed creationists, it may prompt some to investigate further. As such, I believe that presenting the arguments for why the ID approach(es) do not work in an accessible manner is important.

Dembski’s put up another essay on his site. Just giving you guys the heads up. Enjoy!

Information as a Measure of Variation By William Dembski

http://www.designinference.com/docu[…]ormation.pdf

Comment #4663

Posted by rick pietz on July 7, 2004 05:35 PM

I’m really torn by your post. On the one hand, I think the very idea that anyone spends time refuting people like Dembski is a non-productive expenditure of intellectual capital. On the other hand, if crap like this isn’t refuted, it grows a life of its own. On the third hand, the people who buy into this crap in the first place aren’t going to ever read or hear the argument against, and if they do, they’ll accept Dembski’s explanations, and it still takes on a life of its own.

I think that’s an important debate we need to have more of. I don’t think it’s clear that smart, educated people arguing with willfully-ignorant creationists is a productive use of time and energy. I think the basic reasons to argue are 1) it might do some good to the creationist 2) bystanders can be informed of how stupid the antievolution arguments are. In my opinion, 1 is not an efficient use of energy because creationists can argue the sky is plaid for decades. You can’t argue with blind faith. 2 might have some merit, but any reasonable person can already see that only a tiny fringe of scientists pretend it could be wrong. Further, I’m partial to the Gould (or is it Dawkins?) argument that arguing with them gives people the misleading impression that it’s an open scientific question. I think there’s an outside chance that ridiculing people who make insulting, ridiculous arguments (“all scientists are lying, they really believe in creationism”) might at least embarrass them into shutting up.

It’s endlessly funny to me that lots of Cold Fusion papers were worth publishing, and no ID ones are. The IDiots can’t even meet the bar of basic competence Cold Fusion research met.

PvM Wrote:

So how does Dembski define information?

        I(X) = -log2 P(X)

This definition is not peculiar to Dembski. I have seen several texts on information theory which refer to the quantity -log2 P(X) as the amount of information one obtains when one learns that the event X has occurred. It is sometimes called the self-information of X. It is true, as Dembski notes in his latest opus, that the mathematical development of information theory makes very little direct use of this quantity. In textbooks its role seems to be confined to motivating the definition of entropy, and many (perhaps most) don’t even mention it.

Some writers on information theory (Tom Schneider, for one) object to this terminology and prefer to call the quantity -log2 P(X) the surprisal of X instead. However, such writers have formed a very small minority of the authors of the literature on information theory that I have read. I consider the objections to the usage favoured by Dembski to be unreasonable. In fact, it’s a usage I prefer myself.

Information is defined as

        IS(X) = Hmax - H(X)

Information in the Shannon sense is defined as the change in entropy before and after a particular event has taken place.

Defined by whom? In the first place, if you want to follow Schneider’s terminology, this should be Hbefore - Hafter, where Hafter is the conditional entropy of the quantities still unknown after the event has occurred, given the quantities whose values became known during the event. The first term of this expression will only be equal to Hmax in the special case when the distribution of the quantities unknown before the event has taken place is uniform. But more to the point is the fact that this definition carries no more weight of authority behind it than the one which Dembski uses. So criticising him for using one and not the other seems to me to be completely unwarranted.
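
To make the Hbefore - Hafter point concrete, here is a small Python sketch (an illustration of my own; the die-and-parity setup is not from the thread): learning the parity of a fair die roll lowers the entropy from log2 6 to log2 3, so the information gained is 1 bit, and Hbefore happens to equal Hmax only because the die is uniform.

        import math

        # Before: a fair six-sided die, uniform over {1, ..., 6}.
        H_before = math.log2(6)    # about 2.585 bits (equals Hmax only because the die is uniform)

        # After learning the roll is even, three equally likely outcomes {2, 4, 6} remain.
        H_after = math.log2(3)     # about 1.585 bits: the conditional entropy given the parity

        print(H_before - H_after)  # 1.0 bit gained, exactly the entropy of the parity itself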

By conflating information with probability, Dembski has introduced quite a difficulty, namely that information is not what we commonly consider it to mean. Rather than information being a measure of ‘surprise’, information becomes very similar to the concept of entropy. Because of his usage of probability as an information measure, he is faced with the problem that neither regularity/chance nor intelligent designers can create complex specified information. The definition for information I chose is indeed for a uniform distribution, which is not a bad assumption for initially random distributions and matches Shannon’s usage of these concepts.

See for instance Randomness, Order and Replication

But you are right, the definition can easily be generalized further. In Dembski’s latest opus he is somewhat more careful in his definitions, but his usage of ‘self-information’, or probability, for information generates a lot of confusion and seems self-contradictory.

Pim van Meurs Wrote:

By conflating information with probability, Dembski has introduced quite a difficulty, namely that information is not what we commonly consider it to mean.

Dembski does not conflate information with probability (not, at least, in any of his writings that I have seen). A logarithm of a probability, which, as you have noted, is what Dembski uses as his definition of information, is not the same thing as a probability. As far as I am aware, Dembski has nowhere conflated these two things. As I also pointed out in my previous post, the definition is one that is commonly used in textbooks on information theory, so the way Dembski uses it seems perfectly reasonable to me.

The one objection I would have to his introduction of the term into The Design Inference, say, is that it is completely gratuitous. The technical arguments in The Design Inference are all about statistical inference and have very little to do with information theory. The one concept from information theory that Dembski introduces—namely, the definition of information—is completely irrelevant to his arguments.

As to the matter of the definition’s not corresponding to “what we commonly consider” “information” to mean, that happens to be a feature of every definition of information used in information theory. The definitions are chosen so that the quantities used as measures of the “amounts of information” produced by message sources will have some of the more obvious properties we intuitively expect them to. As a result, however, we find that they must also have other properties which some people find quite counterintuitive. If you have any objections on this score, you need to take them up with the founders of information theory. Dembski had nothing whatever to do with it.

Rather than information being a measure of ‘surprise’, information becomes very similar to the concept of entropy.

I’m afraid I don’t understand what your objection is here. All writers on information theory whom I have ever seen use the term “surprise” or “surprisal” in a technical sense have always used it to refer to the log of probability. So if you think information should be a “measure of ‘surprise’” (as you appear to do here), then Dembski’s choice of the log of probability is precisely what you want.

The definition for information I chose is indeed for a uniform distribution, which is not a bad assumption for initially random distributions and matches Shannon’s usage of these concepts.

See for instance Randomness, Order and Replication

But the quantity R = logτ - H which Prof Lee calls “Shannon information” in this presentation has nothing whatever to do with “change in entropy before and after a particular event has taken place”. In fact, neither does it correspond to anything which Shannon ever gave the name “information” to, and Lee should be shot for calling it “Shannon information”. The quantity logτ appearing in the expression is simply the maximum average rate per symbol at which a source with an alphabet of τ symbols can produce information. The quantity H is the entropy rate per symbol of the source under consideration, or, in other words, the actual average rate per symbol at which it produces information.

As Lee notes in a little inset box on slide 18, Shannon called the quantity R/logτ the redundancy of the source. He had very good reasons for choosing this terminology, and it’s a much better choice than what Lee has decided to replace it with.
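
For concreteness, here is a small Python sketch (my own illustration, with an arbitrarily chosen source, not code from the comment) of the quantities under discussion: R = log2 τ - H is the gap between the maximum possible rate and the actual entropy rate of the source, and R / log2 τ is the redundancy.

        import math

        def entropy(probs):
            return sum(-p * math.log2(p) for p in probs if p > 0)

        # A memoryless source over a four-symbol alphabet (tau = 4) with a skewed distribution.
        probs = [0.5, 0.25, 0.125, 0.125]
        tau = len(probs)

        H = entropy(probs)              # 1.75 bits per symbol: the actual rate of the source
        R = math.log2(tau) - H          # 0.25 bits per symbol of unused capacity
        print(R / math.log2(tau))       # 0.125: the fraction Shannon called the redundancy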

Erik 12345 Wrote:

Pim van Meurs wrote in the blog entry above:

“ Dembski mentions in NFL that this measure of information is similar to Shannon information. In fact Shannon’s entropy is the average of Dembski’s information measure.”

About Dembski’s definition, David Wilson commented:

“This definition is not peculiar to Dembski. I have seen several texts on information theory which refer to the quantity -log2 P(X) as the amount of information one obtains when one learns that the event X has occurred. It is sometimes called the self-information of X. It is true, as Dembski notes in his latest opus, that the mathematical development of information theory makes very little direct use of this quantity. In textbooks its role seems to be confined to motivating the definition of entropy, and many (perhaps most) don’t even mention it.”

The first statement is, for reasons to be explained below, mathematically ill-defined and the second is partially, but not completely, wrong.

Since you quote two statements by Pim van Meurs, and several by me, it’s not at all clear to me which two of them you are referring to. I assume that “The first” refers to the following statement of Mr van Meurs’s:

In fact Shannon’s entropy is the average of Dembski’s information measure.

It is true that, when taken in isolation, this sentence is ambiguous. However, the fault here is not Mr van Meurs’s, but mine. In extracting the quotation from his original post, I lifted it out of a context where it was unambiguous and accurate. My apologies for the confusion.

Erik 12345 Wrote:

Now, it is true that both Dembski and some information theory textbooks define “information” and “self-information”/”surprisal”, respectively, in an event A as

-log(Pr(A))

The difference between the definitions is in the restrictions on the A’s we are allowed to plug into the formula. This is a difference that most people probably don’t think too much about, but it is a very important difference nonetheless. Dembski allows us to plug in any event A. Information theory textbooks, on the other hand, require us to first partition the possible outcomes into non-overlapping events A1, A2, …, An. The events we are allowed to plug into the formula for “self-information”/”surprisal” are then restricted to one of these partitions.

I categorically dispute these last two statements. For the purposes of defining “information”/”self-information”/”surprisal” or whatever else you want to call -log(Pr(A)), no restriction on the event A is necessary other than that it belong to the domain of definition of the probability measure Pr, and I have never come across any information theory text which imposes one. If you think otherwise, I invite you to cite such a text.

Erik 12345 Wrote:

This restriction is crucial, because without it, it is meaningless to speak about average “self-information”/”surprisal” (note that Dembski makes no such restriction and it is consequently meaningless to speak about the average of his information). We can meaningfully average over all outcomes or over all partitions, but not over all events.

I assume that by “over all partitions” here you really meant “over all events in a partition”. It makes no more sense to average over all partitions than it does to average over all events.

Yes, of course, if you’re going to talk about an average of self-information then you will have to specify the set of events over which you’re taking the average. But I’m afraid I can see no reason why this should require you to place any restrictions at all on the events for which you are allowed to define self-information.

Until the recent appearance of his latest opus Dembski has never, as far as I know, tried to take an average of his “information”, so he has never needed to specify a partition over which to take it. In his latest paper, just referred to, where he does take such an average, it is, as far as I can see, perfectly well-defined.

Erik 12345 Wrote:

… Dembski’s definition is NOT the same as the “self-information”/”surprisal” introduced in some presentations of information theory.

I am afraid I am still mystified as to why you think this. Here is Dembski’s definition of “measure of information” as given on page 127 of No Free Lunch:

Thus, we define the measure of information in an event of probability p as -log2 p.

Please tell me how this differs in any essential respect from the definition of self-information as given, for instance, in the entry on Information Theory in the Encyclopedic Dictionary of Mathematics published by MIT press.

I Wrote:

“In fact Shannon’s entropy is the average of Dembski’s information measure.”

It is true that, when taken in isolation, this sentence is ambiguous. However, the fault here is not Mr van Meurs’s, but mine. In extracting the quotation from his original post, I lifted it out of a context where it was unambiguous and accurate. My apologies for the confusion.

Oops. On rereading my original post, I find I have not quoted this text at all. I had actually snipped it out of my quotation before posting, so I don’t need to assume any blame for whatever problems Erik 12345 has with it.

Given that the statement appears a couple of paragraphs before Mr van Meurs gives the equations that make sense of it, I would replace “the average” with “an average”. But apart from that, I can’t see anything much wrong with it.

David Wilson Wrote:

I categorically dispute these last two statements. For the purposes of defining “information”/”self-information”/”surprisal” or whatever else you want to call -log(Pr(A)), no restriction on the event A is necessary other than that it belong to the domain of definition of the probability measure Pr, and I have never come across any information theory text which imposes one. If you think otherwise, I invite you to cite such a text.

My local library is closed until Monday, so I can’t cite any textbook right now. What I can do is recall how logarithms of probabilities arise in information theory (by which I mean communication theory):

The goal of information theory is to take a probabilistic description of data and design a way to encode the data so that it can be reliably and efficiently transmitted, stored and decoded. In the simplest case, we have a partition {A1,A2,…,An} of the possible outcomes, and the objective is to associate a codeword with each Ak. This encoding should satisfy the following demands:

(i) Each codeword must be unique. (ii) No codeword can be a prefix of any other. (iii) The average codeword length should be minimal, subject to the constraints (i) & (ii).

If we denote the number of binary digits in the codeword assigned to Ak by L(Ak), then the constraints (i) & (ii) turn out to be equivalent to the constraint

(*) SUM 2^-L(Ak) ≤ 1.

Minimizing the average codeword length

L(A1) Pr(A1) + L(A2) Pr(A2) + … + L(An) Pr(An)

subject to (*), but ignoring the fact that the codeword lengths are integers, one finds that the optimal code should assign codewords with lengths

L(Ak) = -log(Pr(Ak)).

With this choice, the average codeword length is exactly the Shannon entropy of the partition. Since codeword lengths are necessarily integers, the Shannon entropy is really only a lower limit on the average codeword length, and the real integer-valued optimal codeword lengths can be quite different from -log(Pr(Ak)). However, there always exists a nearly optimal code that assigns no more than -log(Pr(Ak))+1 binary digits to Ak. This is how logarithms arise in information theory.
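
As a rough check of this derivation, here is a small Python sketch (an illustration with an arbitrary distribution, not code from the comment): assigning each event a codeword of length ceil(-log2 Pr(Ak)) satisfies the constraint (*), and the resulting average length lies between the entropy H and H + 1.

        import math

        probs = [0.4, 0.3, 0.2, 0.1]                        # Pr(A1), ..., Pr(An) for a partition

        ideal   = [-math.log2(p) for p in probs]            # the non-integer 'ideal' lengths
        lengths = [math.ceil(l) for l in ideal]             # a realizable choice: ceil(-log2 Pr(Ak))

        kraft   = sum(2 ** -L for L in lengths)             # must be <= 1 for a prefix code to exist
        avg_len = sum(p * L for p, L in zip(probs, lengths))
        H       = sum(p * l for p, l in zip(probs, ideal))  # the Shannon entropy of the partition

        print(kraft)        # 0.6875 <= 1, so codewords with these lengths can be found
        print(H, avg_len)   # about 1.85 bits vs 2.4 binary digits: H <= average length < H + 1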

In misguided attempts to present the above derivation more pedagogically, some authors have named the quantity -log(Pr(Ak)) “self-information” or “surprisal”. And the operational meaning of the Shannon entropy of a partition—the very reason for its introduction in information theory—has been obscured by the omission of the above optimization problem and its solution. Only in the context of the above optimization problem is it legitimate to attach the unit “binary digits” to the logarithms -log(Pr(Ak)), because only in this context do the logarithms represent codeword lengths.

It should be clear from all this that “self-information”/”surprisal” must be defined relative to a partition and that it would lose all its meaning if this relativization was dropped. We may infer from the fact that “self-information”/”surprisal” is given the unit “binary digits” or “bits” that it is restricted to a partition. (It may be remarked that Dembski also likes to give his logarithms the unit “bits”, but since Dembski clearly does not restrict attention to a partition, we should instead infer that he is unaware of the necessity of such a restriction or that he uses the word “bits” for purely rhetorical purposes.)

David Wilson Wrote:

I assume that by “over all partitions” here you really meant “over all events in a partition”. It makes no more sense to average over all partitions than it does to average over all events.

Yes. English is not my native language and for some reason I have great difficulty remembering whether “a partition” is a set of mutually exhaustive and disjoint events or an element of this set.

David Wilson Wrote:

Yes, of course, if you’re going to talk about an average of self-information then you will have to specify the set of events over which you’re taking the average. But I’m afraid I can see no reason why this should require you to place any restrictions at all on the events for which you are allowed to define self-information.

Reason #1: The “self-information”/”surprisal” is introduced for the single purpose of being averaged. Reason #2: Only with the restriction to events from a given partition is it legitimate to attach the unit “binary digits” to the “self-information”/”surprisal”. In information theory, that is the unit attached to it.

David Wilson Wrote:

Until the recent appearance of his latest opus Dembski has never, as far as I know, tried to take an average of his “information”, so he has never needed to specify a partition over which to take it. In his latest paper, just referred to, where he does take such an average, it is, as far as I can see, perfectly well-defined.

I agree that Dembski hasn’t made that mistake. He doesn’t take such an average in his paper on “variational information” — the average he does take (an average of the squared ratio of two probability distributions/measures) is, as you note, well-defined, though. The unit of “generalized binary digits”, on the other hand, is pretty bogus.

David Wilson Wrote:

Please tell me how this differs in any essential respect from the definition of self-information as given, for instance, in the entry on Information Theory in the Encyclopedic Dictionary of Mathematics published by MIT press.

Quote the definition and tell me what unit, if any, is attached to the quantity and if codes/codewords are discussed. Then I’ll be able to tell you if it is different from Dembski’s definition (which is inspired by the philosophical literature, where one is uninterested in good codes and interested in the properties of “information”, such as additivity for independent events, and its relation to knowledge).

For what it’s worth, here’s a link to something I wrote on the subject a while ago:

http://www.talkorigins.org/design/f[…]nfl/#shannon

I should add that I don’t claim to be any expert on this subject.

Dr. 12345: There is an evil undocumented feature in the preview script that can completely change the meaning of a comment.

Yes. I’ve noticed that. I think the evil lurks in the “less than” sign. I think it works if you don’t preview, but once you do, it aborts everything that follows.

I Wrote:

Erik 12345 wrote:

“… Information theory textbooks, on the other hand, require us to first partition the possible outcomes into non-overlapping events A1, A2, …, An. The events we are allowed to plug into the formula for “self-information”/”surprisal” are then restricted to one of these partitions.”

I categorically dispute these last two statements. For the purposes of defining “information”/”self-information”/”surprisal” or whatever else you want to call -log(Pr(A)), no restriction on the event A is necessary other than that it belong to the domain of definition of the probability measure Pr, and I have never come across any information theory text which imposes one. If you think otherwise, I invite you to cite such a text.

My memory played me false here, and I’ll have to eat some humble pie. On browsing through a selection of expositions on information theory I find that there are indeed plenty of them which do only define “self-information” for events restricted to lie within particular partitions, including the Encyclopedic Dictionary of Mathematics, which I had firmly believed did not. It would be a slight exaggeration to say that they “imposed a restriction” on the definition, in the sense of sermonising against any generalisation of it to cover all events, as Erik has done. Nevertheless, it is still a fact that they did not find it necessary to adopt that generalisation themselves.

That said, the assertion of Erik’s which I have requoted above is also false. I was easily able to find plenty of texts on information theory which did not “require us to first partition the possible outcomes into non-overlapping events A1, A2, … , An.” These texts did give definitions which allowed any event to be plugged into the formula. I’ll give a list of some of these texts later on tonight (Oz time) and quote one of the definitions for Erik’s analysis of how it differs (or not) from Dembski’s.

David Wilson Wrote:

Here is the list of texts:

Amiel Feinstein, Foundations of Information Theory, McGraw-Hill, New York, 1958, p. 2

Norman Abramson, Information Theory and Coding, McGraw-Hill, New York, 1963, p. 12

J. Aczél and Z. Daróczy, Measures of Information and Their Characterizations, Academic Press, New York, 1975, p. 71ff

Richard W. Hamming, Coding and Information Theory, 2nd edn., Prentice-Hall, Englewood Cliffs, N.J., 1986, p. 104ff

Klaus Krippendorff, Information Theory: Structural Models for Qualitative Data, Sage, Beverly Hills, 1986, p. 14

John B. Anderson and Seshadri Mohan, Source and Channel Coding: An Algorithmic Approach, Kluwer Academic Publishers, Boston, 1991, p. 5

Michael A. Nielsen and Isaac L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, 2000, pp. 501-502

Feinstein’s text does not belong in your list, since he explicitly restricts attention to partitions.

Abramson’s text, however, does belong in your list. I find his distinction between a “bit” and a “binit” unclear. He writes: “We note, also, that if P(E) = 1/2, then I(E) = 1 bit. That is, one bit is the amount of information we obtain when one of the possible equally likely alternatives is specified. Such a situation may occur when one flips a coin or examines the output of a binary communication system.” (p. 13, emphasis in original)

I find it tempting to interpret this as saying that the event E is restricted to one of the events in a partition of the sample space into two halves. However, all things considered, I’m inclined to interpret Abramson’s introduction of I(E) as not relativized on a partition of the sample space. I suppose that Abramson can avoid my factual objection (e.g. that there can be infinitely many events of probability 1/2 and that it is absurd to simultaneously associate each of these with its own binary digit) by distinguishing a binary unit (a “bit” in Abramson’s terminology) from a binary digit (a “binit” in Abramson’s terminology). This would avoid a factual error at the price of irrelevance, for what good is a binary unit in those situations when it doesn’t correspond to a binary digit? Communication theory is about binary digits. Abramson, like many other textbook authors (e.g. Cover & Thomas write something to the effect that it is irresistible to play around with axioms for “information measures” and take on faith that what results is actually useful), does make clear that the considerations discussed during the introduction of the definitions are unrelated to the actual justification for the definitions.

To be clear, I concede that some textbooks on information theory do not relativize what we here call “self-information” to a choice of partition. What I don’t agree with is that this is a good idea or that those who follow it should not be criticized. It is perverse to introduce definitions via discussions that are irrelevant to the subject at hand. And one should distinguish between stuff that is introduced only as a result of these perversions and stuff that is actually of use in information theory.

David Wilson Wrote:

These questions are obviously rhetorical, but I’m not sure I understand what you’re driving at. If you’re suggesting that my comment about “sermonising” was gratuitous, then you’re right, and I apologise. “Post in haste, repent at leisure” as they say.

There is no need for apologies, as my question was neither rhetorical nor an irrational reaction to the word “sermonise”.

I don’t know if you agree with that the quantity -log(Pr(E)) is of relevance only when E is taken from some partition of the sample space. By asking what you make of the absence of sermons against, say, definitions of -log(x) as the “information” associated with the positive real number x (whether a probability or not), I am simply trying to get you to either embrace the absurdity or commit to some relevance demands (which will make it hard to retain -log(Pr(E)) for arbitrary events, since it is quite irrelevant).

David Wilson Wrote:

But it seems to me that Feinstein himself has already assumed that his I-function can take probabilities of arbitrary events (at least of ones of non-zero probability) as arguments. He states explicitly that I( ) is defined over the interval (0, 1].

Arbitrary probabilities of events from a partition, not probabilities of arbitrary events!

David Wilson Wrote:

If I(px) is only defined for events belonging to a partition, how does he manage to prove that I( ) must be proportional to a negative logarithm? As far as I am aware this can’t be done without somewhere using the identity I(px∩y) = I(px) + I(py) for some independent events x, y of positive probability less than 1. That identity does not make sense unless I( ) is at least defined for the probabilities of the three events x, y, and x∩y, which cannot possibly form a partition.

Your identity is not used by Feinstein. He uses the standard axioms, where the Shannon entropy of the partition {x1,…,xn} is expressed recursively like this:

H(x1,…,xn) = H(x1, x1’) + (1-px1)H(x2,…,xn | x1’),

where x1’ is the complement of x1 (the union of x2,…,xn) and the second term is the entropy of a partition of a smaller probability space obtained by conditioning on the event x1’. This, of course, imposes demands on the I-function. You could still point out that expanding the r.h.s. gives an expression involving the I-function of x1, x2, …, and x1’. These events could not possibly be part of the same partition, but the point is that the I-function is evaluated only as a means to determine entropies of partitions. The recursive formula relates the Shannon entropies of different partitions and therefore also the I-functions for events from different partitions. But this does not prevent the I-function from being relativized on a fixed, but arbitrary, partition any more than it prevents the Shannon entropy from being relativized in the same way.
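
A quick numerical check of this recursion, using an arbitrary probability vector of my own choosing (a sketch, not from the comment): conditioning on x1’ renormalizes the remaining probabilities, and the two sides of the recursion agree.

        import math

        def H(probs):
            """Shannon entropy of a probability vector."""
            return sum(-p * math.log2(p) for p in probs if p > 0)

        p = [0.5, 0.25, 0.125, 0.125]          # probabilities of x1, x2, x3, x4
        p1 = p[0]
        cond = [q / (1 - p1) for q in p[1:]]   # distribution of x2, ..., xn conditioned on x1'

        lhs = H(p)
        rhs = H([p1, 1 - p1]) + (1 - p1) * H(cond)
        print(lhs, rhs)                        # both 1.75 bits, as the recursion requires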

Dembski, in contrast, has introduced his “information” via the philosopher’s argument that -log(Pr(E)) is the only continuous function that is decreasing in Pr(E) and additive for independent events. That is, Dembski does not express his requirements of “information” in terms of the Shannon entropy of a partition.

David Wilson Wrote:

Chacun à son goût. I happen to like this approach. I certainly think it helped me gain a better understanding of why something like mutual information, for instance, should be defined in the way it is. Nevertheless, the only purpose it ever serves in communication theory seems to be pedagogical. So if it really is a large contributing cause to misunderstandings that would certainly be a good reason to avoid using it for pedagogical purposes.

Which genuine insights about mutual information did you gain from this approach that would have required a larger effort if you had only had access to the communication theoretic context (the average number of binary digits by which the optimal average codeword length can be reduced by exploiting that the value of another stochastic variable is known) and the statistical context (a viable test statistic in tests for independence)?

It probably isn’t a cause for misunderstandings among those who read textbooks themselves, but it is, I infer, a contributing cause of misleading popular and interdisciplinary presentations. At the popular level, there seems to be a widespread misconception that you’re doing information theory simply by computing logarithms of probabilities. At the interdisciplinary level, there seems to be a lack of interest in distinguishing the use of Shannon entropies and Kullback-Leibler divergences for data coding, for obtaining Bayesian a priori probabilities via the Maximum Entropy Principle, and as test statistics in non-Bayesian hypothesis testing. The functions used in information theory seem to have acquired an air of authority that makes them the unquestioned choice even in applications outside the field for which they were introduced.

David Wilson Wrote:

However, the fact remains that the definition Dembski has used for “information” is well-established in the information theory literature and as far as I can see he has accurately reproduced it. My opinion remains that criticism of him merely for doing this is unreasonable.

My opinion is that the reproduction of discussions, similar to those which some authors include for purely pedagogical reasons, as if they were the actual basis for information theory is evidence of a lack of understanding that is a legitimate target for criticism. This applies especially to mathematicians like Dembski, who should be able to figure out that the criterion for doing information theory has little to do with log-transforming probabilities per se and everything to do with data encoding.

Russel Wrote:

Dr. 12345: There is an evil undocumented feature in the preview script that can completely change the meaning of a comment.

Yes. I’ve noticed that. I think the evil lurks in the “less than” sign. I think it works if you don’t preview, but once you do, it aborts everything that follows.

Yeah, I reached the same conclusion.

Just for the record, to aid those who evaluate my posts by weighing my authority against the authority of my opponents, I’ll note that I don’t have a PhD in any field.

David Wilson Wrote:

Incidentally, his “variational information” is not “additive” as he claims on page 9. His alleged proof contains a couple of errors. I posted a counterexample to the claim on talk.origins a couple of weeks ago. Here it is in a slightly more readable form.

I understand that the Radon-Nikodym derivatives w.r.t. the counting measure c are

dμ/dc = f = (1/4, 1/4, 1/4, 1/4)
dν/dc = g = (1/5, 1/5, 1/5, 2/5)

since summing f(x) and g(x) for all x in an event A gives μ(A) and ν(A), respectively. This gives me dμ/dν as f(x)/g(x), since summing such terms weighted by ν gives the measure μ. But how did you determine dμ1/dν?
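
A small Python sketch of my own (a check, not part of the exchange) verifying the discrete case being described: for measures given by point masses, the pointwise ratio f/g of the densities with respect to counting measure satisfies the defining property of dμ/dν on every event.

        from fractions import Fraction as F
        from itertools import combinations

        f = [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]  # dmu/dc: mu is uniform on four points
        g = [F(1, 5), F(1, 5), F(1, 5), F(2, 5)]  # dnu/dc

        ratio = [fi / gi for fi, gi in zip(f, g)]  # candidate dmu/dnu: 5/4, 5/4, 5/4, 5/8

        # Check the defining property mu(A) = sum over A of (dmu/dnu) dnu for every event A.
        points = range(4)
        ok = all(
            sum(f[x] for x in A) == sum(ratio[x] * g[x] for x in A)
            for r in range(5)
            for A in combinations(points, r)
        )
        print(ok)  # True: the pointwise ratio f/g really is dmu/dnu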

Anyway, even though I don’t understand all the ingredients in your counterexample, I think your conclusion is right. It seems to me that some modification is required, such as requiring the reference measure ν to factorize in the same manner as μ.

Dr. 12345: “Just for the record, to aid those who evaluate my posts by weighing my authority against the authority of my opponents, I’ll note that I don’t have a PhD in any field.”

Yes, and Dembski has two, which speaks worlds about the significance of “PhD”. I’m going to continue to think of “doctor” in its etymological sense (“teacher”).

I gave up trying to follow the Wilson - 12345 discussion. It’s over my head. But can we summarize for the masses? As I understand it DW said that one or a few or some of the technical indictments of Dembski’s work are unwarranted. Much discussion between DW and E1 later: sort of yes, sort of no.

Big picture now: are there any mathematicians reading this who find Dembski’s arguments, specifically with respect to biology, compelling?

I find his understanding of biology so ludicrous that I’m not strongly motivated to educate myself on the mathematical legerdemain he uses to rationalize it.

(He reminds me of a math prof at my college who had a “mathematical proof” that all numbers are equal to 47 (our school’s numerical mascot). Only, that prof knew it was a joke.)

I don’t think it reflects poorly on the PhD degree. I have known many science PhDs, and they are all very intelligent. Unfortunately, intelligence is not always homogeneously distributed throughout someone’s range of thinking. Some people are smart in everything they do, some are a little smarter in some things than others, and some people are intelligent in some respects and crazy in others. There’s no doubt Kurt Gödel was among the brightest all-time logicians. And bright at math, bright at physics. Einstein enjoyed talking physics with him. Yet he was also somewhat crazy. He died of starvation because he thought everyone was out to poison him. Lots of people are bright at some things, crazy at others. You can have a brilliant mathematician who thinks communism’s a good idea. A brilliant journalist who thinks Sun Myung Moon is the second coming. It seems like especially on religious topics, some bright people can turn off their minds and keep believing nonsense. Like Shermer said, it’s not that smart people are without stupid beliefs, but they’re really good at coming up with justifications.

Steve:

I’m certainly not suggesting that a PhD is negatively correlated with having worthwhile thoughts to share. I am proposing, though, that you can garner any number of degrees without ever having a worthwhile thought to share.

In the case of our ID friends, it may be that there’s some Gödel-like island of competence that I’m not aware of. (Well, rhetoric. I’d have to grant they’re good at that.)

But worthwhile thoughts?

Same difference :-)

(to use an oxymoron, like Loving God (w/r/t the biblical one))

Erik 12345 Wrote:

Feinstein’s text does not belong in your list, since he explicitly restricts attention to partitions.

I disagree. But since there are still plenty of other texts on the list I can’t be bothered arguing the point any further.

I Wrote:

“… It would be a slight exaggeration to say that they “imposed a restriction” on the definition, in the sense of sermonising against any generalisation of it to cover all events, as Erik has done. …”

Erik 12345 replied:

“Did they sermonize against a generalization of the formula -log(Pr(A)) from probabilities of partitions to any positive number (whether probabilities or not)? If not, what do you make of the absence of such sermons?”

Me:

“These questions are obviously rhetorical, but I’m not sure I understand what you’re driving at. If you’re suggesting that my comment about “sermonising” was gratuitous, then you’re right, and I apologise. “Post in haste, repent at leisure” as they say.”

Erik 12345:

“There is no need for apologies, as my question was neither rhetorical nor an irrational reaction to the word “sermonise”.

… By asking what you make of the absence of sermons against, say, definitions of -log(x) as the “information” associated with the positive real number x (whether a probability or not), I am simply trying to get you to either embrace the absurdity or commit to some relevance demands (which will make it hard to retain -log(Pr(E)) for arbitrary events, since it is quite irrelevant).”

Ok, what I was missing was your intention of referring to an association of the word “information” with -log(x). The difference between such an association and that of the word “information” with -log(Pr(E)) is that (as far as I know) no one has ever attempted to make it, and I can see no reason why anyone should want to. I can thus also see no reason why it should even occur to anyone to object to it.

However, as documented by my list of references, there are plenty of writers on information theory who do adopt a definition similar to Abramson’s without imposing any restrictions on the event E, and most of whom attempt to justify it by appealing to conditions they claim one might intuitively expect such a definition to satisfy. I don’t find that absurd at all.

Erik 12345 Wrote:

I don’t know if you agree with that the quantity -log(Pr(E)) is of relevance only when E is taken from some partition of the sample space. …

No, I don’t. But even if I did, any event E always belongs to some partition of the sample space (if nothing else, it belongs to the partition { E, E’}, where E’ is its complement). Moreover, while it only makes sense to take averages of -log(Pr(E)) over partitions of the sample space, there is no reason why such a partition can’t be completely arbitrary. So, if you’re going to give the quantity -log(Pr(E)) a name at all, there seems to me to be no point imposing restrictions on the events E for which the name will be defined.

Erik 12345 Wrote:

Which genuine insights about mutual information did you gain from this approach that would have required a larger effort if you had only had access to the communication theoretic context (the average number of binary digits by which the optimal average codeword length can be reduced by exploiting that the value of another stochastic variable is known) and the statistical context (a viable test statistic in tests for independence)?

That’s impossible for me to judge, since I have never seen an exposition which attempts to motivate the definition of mutual information by appealing to either of the items you mention. In all of the expositions I have seen, the definition is motivated by giving some argument that it represents the average amount of “information” which one random variable provides about another. It is only after the definition has been made that theorems relating it to coding rates are then proved.

I agree however that it is the theorems on coding rates which do provide the ultimate justification for the definition. Heuristic arguments for motivating a definition are all well and good, but unless you can get around to doing something useful with it, they don’t amount to much.

Erik 12345 Wrote:

My opinion is that the reproduction of discussions, similar to those which some authors include for purely pedagogical reasons, as if they were the actual basis for information theory is evidence of a lack of understanding that is a legitimate target for criticism. This applies especially to mathematicians like Dembski, who should be able to figure out that the criterion for doing information theory has little to do with log-transforming probabilities per se and everything to do with data encoding.

I guess we are mostly in agreement on this point. The only use Dembski seems to make of the definition is to wave it around as an excuse for claiming he is doing “information theory”. However, my criticism would not be that he has chosen a wrong definition of “information”, but that he hasn’t done anything useful or interesting with it. I am also very sceptical that he can do anything useful or interesting with it. I am, however, willing to be convinced otherwise.

Erik 12345 Wrote:

I understand that the Radon-Nikodym derivatives w.r.t. the counting measure c are

dμ/dc = f = (1/4, 1/4, 1/4, 1/4)
dν/dc = g = (1/5, 1/5, 1/5, 2/5)

since summing f(x) and g(x) for all x in an event A gives μ(A) and ν(A), respectively. This gives me dμ/dν as f(x)/g(x), since summing such terms weighted by ν gives the measure μ. But how did you determine dμ1/dν?

I was assuming that by dμi/dν Dembski actually meant dμi/dνi, where νi is the restriction of ν to the σ-algebra Ai. The standard definition of Radon-Nikodym derivative requires the two measures to be defined on the same σ-algebra. If μ and ν are defined on the σ-algebra A, and μ is absolutely continuous with respect to ν, the Radon-Nikodym derivative dμ/dν is the unique ν-integrable function which satisfies the equation:

     μ(S) = ∫S dμ/dν dν

for all events S in A. If μ and ν are defined on different σ-algebras, A1 and A2 say, then the above equation only makes sense if S belongs to both A1 and the ν-completion of A2. And then for dμ/dν to be uniquely defined, it has to be measurable with respect to the ν-completion of A1 ∩ A2. In the case of my counterexample, requiring dμi/dν to be Ai-measurable means that dμ1/dν has to be constant on the sets {00,01} and {10,11} while dμ2/dν has to be constant on the sets {00,10} and {01,11}. Applying the above equation for these functions and sets gives dμi/dν(x) = μ(S)/ν(S) for all x in any set S on which dμi/dν is required to be constant.
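
Under this reading, the restricted derivative dμi/dν is constant on the blocks generating Ai, with value μ(S)/ν(S) on each block S. A small Python sketch of my own (using the measures quoted earlier in the thread, not code from the discussion) simply evaluates those constants:

        from fractions import Fraction as F

        mu = {'00': F(1, 4), '01': F(1, 4), '10': F(1, 4), '11': F(1, 4)}
        nu = {'00': F(1, 5), '01': F(1, 5), '10': F(1, 5), '11': F(2, 5)}

        def measure(m, S):
            return sum(m[x] for x in S)

        # Blocks generating A1 (determined by the first bit) and A2 (determined by the second bit).
        blocks_A1 = [('00', '01'), ('10', '11')]
        blocks_A2 = [('00', '10'), ('01', '11')]

        for blocks in (blocks_A1, blocks_A2):
            # dmu_i/dnu is constant on each block S, with the value mu(S)/nu(S).
            print([measure(mu, S) / measure(nu, S) for S in blocks])
        # Both sigma-algebras give the values 5/4 and 5/6 on their two blocks.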

Erik 12345 Wrote:

Anyway, despite that I don’t understand all the ingredients in your counterexample, I think your conclusion is right. It seems to me that some modification is required, such as requiring the reference measure ν to factorize in the same manner as μ.

Yes the result does hold in that case. The proof is not all that difficult, but I have to admit it gave me a lot more trouble than it should have.

About this Entry

This page contains a single entry by PvM published on July 6, 2004 8:50 AM.
