"P Hacking" and natural selection

By Joe Felsenstein

March 17, 2023 17:00 MST

Pigeon Peas, often known by the Filipino Ilocano language word *kardis*

“P hacking” is a form of cheating, whether inadvertent or advertent. You test a bunch of chemicals on rats to see whether they cause, say weight loss. You do a statistical test on each, to see whether there is significant evidence for weight loss. The results are judged by the probability P that evidence this strong would arise by chance, if there were actually no effect. You use a small probability, say 0.05, as a sign that the effect is real. But suppose that you have 100 different chemicals, and do 100 separate tests. If none of these chemicals have any actual effect, what is the chance that none of the tests will have P smaller than 0.05?

That is easy to compute: it is 0.95¹⁰⁰, which is 0.00592. The chance that at least one of the 100 tests shows a value of P below 0.05 is actually 0.99408. If what you do is pick that test (or tests) and write a scientific paper reporting the effectiveness of that chemical, you will be committing P-hacking. If none of the chemicals actually works, over 99% of the time you will be able to announce that you have found a weight-loss drug.

All that is relevant to evaluating the meaning of a mathematical paper in the ID journal BIO-Complexity in 2018. It is by George Montañez, a computer scientist at Harvey Mudd College in Claremont, California. It contains mathematical proofs that concepts of Specified Complexity can be considered in a unified way. The unstated implication is that with natural evolutionary processes, high levels of Specified Complexity cannot be expected. Montañez’s paper can be found here.

Why am I bringing up P-hacking? Not to accuse Montañez of it. In fact, the opposite. It is to point out that natural selection acts like P-hacking. Let me explain, first what Montañez’s paper does, then what it says about demonstrating Design, and finally how natural selection’s “P hacking” makes that difficult:

Montañez’s paperPermalink

Montañez’s paper in BIO-Complexity is a detailed mathematical treatment of how one can make a unified mathematical framework for computing a quantity, Specified Complexity, one which brings together Dembski’s original argument, algorithmic specified complexity, and irreducible complexity arguments. I don’t find any mathematical errors in it. (For those interested: he uses a combination of the Neyman-Pearson Lemma with a distribution constructed from a rescaling of the specification function to have unit area under it. From this he finds a rejection region for the null hypothesis. He covers multiple cases. The likelihood ratio needed in the Neyman-Pearson lemma is the ratio of this artificial distribution to the distribution under the null hypothesis. He calls that ratio the “kardis”, for which see the figure caption above.)

How he suggests demonstrating designPermalink

Montañez does discuss testing Design, which, like all ID advocates, he equates with rejecting the hypothesis that the degree of adaptation, or, more mysteriously, the algorithmic simplicity of something-or-other, can be accounted for by natural processes. (For a lengthy rant by me on why the algorithmic simplicity argument seems to make no sense, see an earlier post by me). For Montañez, natural processes may be rejected if we do not have big enough probabilities of the observed adaptations when natural processes operate. The burden of proof is on the biologists:

Simply put, s is the entry fee for a probabilistic mechanism to even enter the tournament of competing explanations (Montañez, 2018, Section 6.1).

No similar criterion is proposed for Design explanations to enter this tournament. In effect, this is simply the common taunt of Design advocates: if we do not have a probabilistic argument from a detailed model of natural processes, we cannot reject Design Intervention. When evolutionists ask for a similar calculation for the supposed Design processes, this is waved away with the argument that we do not understand that Design is not that kind of hypothesis. In bills recently introduced in U.S. legislatures, they want to allow teaching of how Intelligent Design explains biological adaptations. But Design does not have any such explanation, whatever the legislators think.

Why we do not have a detailed probabilistic model of evolutionPermalink

So Montañez only requires evolutionary biologists to have a “probabilistic mechanism” stated in sufficient detail to actually compute a probability p(x) that the observed adaptations would occur. And in addition, able to compute the probabilities that adaptations that are as good or better will evolve. What would we need to know?

Well, we’d need to know all possible relevant features of the environment, including features of organisms that serve as predators, prey, or competitors. And we’d need to know the entire genetics of our organism, including the functions and interactions of all genes. And we’d need to know the full developmental biology of all relevant phenotypes, the full physiology, and functional ecology. That’s all. Obviously we are nowhere near this.

And what alternative do Montañez and his friends propose? Nothing. Not anything involving any of that biology. The classes in which they would teach their actual theory would be pleasantly short, involving just how to say “Poof!”. Once that is done, it is time for a relaxing recess.

Functional InformationPermalink

One of the distributions that Montañez points to is the one used by people who compute Functional Information. This is not a creationist-and-or-ID computation, it is an attempt to quantitate how much biological information is involved in an adaptation. We start with a uniform distribution over all possible genomes. If the genome is, say, 1,000,000 bases long, there would be 4 possibilities at each of those million sites, for a total of 4^1,000,000 possible sequences. That’s 10⁶⁰²⁰⁶⁰, roughly. We assume that all these are equally probable. One needs to know what fraction of those achieve higher specification, such as higher fitness, than the observed sequence. Once we know that fraction P, we can compute functional information as -log₂(P) bits.

If you were to take a random one of those sequences and mutate it randomly, you get another random sequence. If mutation is the only relevant evolutionary process, the distribution of outcomes in the long run is the nearly the same as randomly choosing samples from all possibilities. If we see in the real organism a reasonble fitness, it would have an implausibly high functional information. Applying Montañez’s formulas we would reject random sampling from all sequences as the source of the observed sequence.

Where P-hacking comes inPermalink

Is there some other, biologically plausible, process that could have produced the observed sequence? There is, of course – natural selection. Instead of meandering mutationally generation after generation, P-hacking occurs. Among the sequences available, the ones that have higher functionality (in this case, higher fitness) are more likely to be chosen. If you did this, your teacher would be upset, because the result would be cheating on the exam. But natural selection does this, not just in each generation, but repeatedly. And the results accumulate. Cumulative selection occurs, simply because each generation is the offspring of the previous one. No elaborate scheme is needed: organisms of higher fitness contribute more genes to the next generation, and future organisms are descended from those.

In Montañez’s paper he says about the equiprobable distribution used in computing Functional Information that

Claiming that a probabilistic process boosts the likelihood of observing some particular result remains without force unless one can provide some actual numbers for this claim, namely, what p(x) is under the proposed distribution (or some reasonable lower bound). Thus the burden of providing such estimates resides with those proposing the probabilistic process as an explanation in the first place.

If the observed fitness of organisms is far higher than it would be for genomes typed on four-key typewriters by monkeys, Montañez implies that we cannot simply invoke cumulative natural selection. We’ve got to supply a detailed probabilistic model. Otherwise he will invoke Design Intervention. Which always results in a probability of 1 for whatever we see. And that is the biggest P-hack of all.