Weasles on Parade

| 159 Comments | 2 TrackBacks

It is now 62 hours since William Dembski posted that the Evolutionary Informatics Lab was going to try to reproduce Dawkins' Weasel program as it was actually written, as opposed to their fantasy version. In that time I've resurrected an elderly program, and several readers have made their own weasels from scratch. Commenter Anders has even made a Python version that puts "freely mutating" and "locking" versions head to head, with great graphs. (Update: Wes Elsberry did a head-to-head comparison last year; see here for his comparison [scroll down]. It differs from my implementation, but the basic message is the same.)

I've gone back and done a head-to-head comparison myself between a program with no "locking" (all letters in any given string have a chance to be mutated) and one with "locking" (where the matching letters are preserved against mutation). Trying to implement "locking" à la Dembski proved too hard: you have to keep indices of the letter locations and keep updating them. It is such a pain in the bottom that I cannot imagine Dawkins even wanting to try to program a "locking" implementation in GBASIC. Remember, Dawkins' weasel was a quick and dirty program bashed out in a short time. To implement "locking" I just kept an unmutated copy of the parent string (after all, in the real world not every offspring has mutations in genes of interest).
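For readers who want to play along, here is a minimal Python sketch of both variants, with "locking" done the lazy way described above (keeping an unmutated copy of the parent in the brood). The alphabet, mutation rate, and offspring number are my assumptions for illustration, not anything from the original GBASIC:

```python
import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def fitness(s):
    # Number of positions that match the target string
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s, rate):
    # Every position has an independent chance of being replaced
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in s)

def weasel(offspring=100, rate=0.05, locked=False):
    parent = "".join(random.choice(ALPHABET) for _ in TARGET)
    generation = 0
    while parent != TARGET:
        generation += 1
        brood = [mutate(parent, rate) for _ in range(offspring)]
        if locked:
            # "Locking" as implemented here: an unmutated copy of the
            # parent stays in the brood, so fitness can never decrease
            brood.append(parent)
        parent = max(brood, key=fitness)
    return generation
```

Calling `weasel(offspring=30)` versus `weasel(offspring=100)` with and without `locked=True` reproduces the comparison in the figure below.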

So what happened?

[Figure: weasel_dynamics.png]

As you can see, they were much the same (mean and standard error of 4 independent runs per condition shown; click to embiggen). "Locked" runs finished earlier on average, but most of the trajectory of the run was determined by mutation supply. Runs done with locked and unlocked versions fell within each other's error bars, whether the offspring number was set at 100 or 30.

This is easy to understand. Early on, virtually any mutation is of big benefit, while later on most mutations are of small benefit. Only in the final stages was there any significant backing and forthing, and then only in the 30-offspring case. In the 100-offspring case, the population was so large that even in the final stages the probability of acquiring a beneficial mutation was high. Wesley Elsberry has done the mathematics explicitly (go to this link and scroll down). Only in the 30-offspring case were the last 10 or so generations of an "unlocked" run spent bouncing between 3 and 1 differences from the target.
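A rough back-of-the-envelope calculation shows why population size only matters at the end. Assuming a 27-character alphabet, a per-position mutation rate, and ignoring simultaneous back-mutations (all simplifications of mine, not the exact model used in the runs), the chance that at least one offspring in a generation fixes one of the remaining wrong letters is:

```python
def p_beneficial(wrong, rate=0.05, alphabet=27, offspring=100):
    # Chance a single offspring corrects at least one of its `wrong`
    # letters, ignoring back-mutation of already-correct letters
    p_one = 1 - (1 - rate / alphabet) ** wrong
    # Chance at least one of the brood improves
    return 1 - (1 - p_one) ** offspring
```

With one letter left wrong and a 5% mutation rate, this gives roughly a 17% chance per generation with 100 offspring but only about 5% with 30, which is why the small-population runs spend their final generations bouncing around.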

So Dembski's claim that a list of every 10th output string shows Dawkins using a "locking" program is nonsense; Dawkins' results are exactly what we would expect to see when sampling sparsely from a high-population, "freely mutating" weasel run.

[Figure: weasel_convergance_time.png]

Slightly more surprisingly, the median time to convergence (when the program finally reached the target string) was not actually statistically significantly different. I suspect that when enough runs are accumulated (say around 20) it will be, but the "locked" runs are only about 20% faster.

So, in summary: whether you "lock" your strings or allow them to mutate freely at all positions, a weasel program will converge on a solution. For most of the run, mutation supply dominates, and whether you lock your strings or not there is rapid accumulation of beneficial mutations. Only at the very end of the run does "locking" matter, and then only for small populations where the probability of a beneficial mutation is low.

But even then, "unlocked", freely mutating programs will converge on a solution in less than a minute (and only about 20% slower than the "locking" programs), whereas simple random sampling would take longer than the lifespan of the Universe to converge.

Still, the point is that against all evidence, Dembski believes that Dawkins wrote an overly complicated program in GBASIC and then reverted to a simple one for a TV show, yet can't be bothered to write the code that would show him wrong (and show that freely mutating programs converge rapidly on the target string). Again, this shows that cdesign proponentsists don't understand the "test your hypothesis" part of science. I'm currently reading Galileo's "Starry Messenger", and the debate over at Uncommon Descent reminds me of Francesco Sizzi, who declared that Galileo couldn't really have seen moons around Jupiter, as metaphysics demands there be only 7 planets.

Now, if only they would put the same degree of effort that they put into critiquing the weasel program (a toy demonstration of selection, of about the same importance as the measuring cylinder/running tap model of drug clearance) into doing some real science.


The freely mutating QBASIC program is here, the “locked” version is here (both write strings to a log file).

Feel free to submit your own weasels, especially ones that compare freely mutating vs "locking" versions of the code; the PT crew could do with a Java applet version if you could do one. A weasel for every 6 hours the Uncommon Descent people do nothing would be great!

See Wesley Elsberries superb Java Weasel as an example!

2 TrackBacks

Ian Musgrave has a brilliant post showing that Dembski's revisiting of the old creationist canard that Dawkins' 1984 Weasel program, designed to show that random variation and selective retention can "evolve" a target phrase, in this case Shakespeare's... Read More

Giving the GIF of Life from Science After Sunclipse on April 15, 2009 10:46 AM

The following is an edited repost of an item I originally wrote last year. Sometimes, my brain needs to take a break from work, from arguing on the Internet and all the rest. On these occasions, I like to take... Read More

159 Comments

Does anyone have the original Apple Basic source code? I found many versions on the web but can’t seem to find the original version.

I still have a functioning Apple ][ computer that I would love to put the source code on.

Weasels are weasley recognizable but stoats are stoatally different

Now we know why Dawkins called it a weasel program. We’re all waiting to see Dembski try to weasel out of this again.

Thanks for the link to the Python version by Anders, I spent some time making pretty much the same thing yesterday out of curiosity (and as a way to teach myself Python :p).

One big caveat, however : if I understand Dawkins’ original description correctly “[…] each position [in each offspring] has a chance of randomly mutating, based upon a set mutation rate.” (from http://www.antievolution.org/featur[…]asel102.html).

Anders’ version, however, seems to randomly pick a single position in each offspring to be mutated. This seems to be a pretty different algorithm, unless I’m completely misunderstanding either the original idea or his code.

This doesn’t really invalidate the conclusions regarding locking / non-locking versions but I thought this was worth pointing out (even though the resulting graphs tracking progression towards maximum fitness look much the same in my version with a possibility of mutation on each position of each offspring).

This may be entirely irrelevant but here’s what I did many years ago:

When playing a computer game, I think it was ‘Shivers’, one of the hurdles was a game consisting of a horizontal row of ‘holes’, with a number of balls that based on certain rules could be flipped back or forth between vacant holes.

After playing for a while I decided I was lost and wanted to restart - but I had not saved the start configuration of balls so I decided to let the computer solve it for me.

So I wrote a simulation program to toss the balls around according to the rules, and stop at the pre-determined target configuration.

I must have use a randomiser of some sort and the program found the solution for me.

When I examined a printout I found many long repeating and re-repeating sequences that I could eliminate until what was left was a straight sequence that I then entered in the game to move the balls from their present position to the desired end positions. Bingo.

If the program’s output could have been linked to the game it would of course have done the job all by itself, with a lot of redundant, insignificant steps in between.

I don’t remember whether I used assembler, Pascal, Qbasic or Foxbase.

Bethor said:

Thanks for the link to the Python version by Anders, I spent some time making pretty much the same thing yesterday out of curiosity (and as a way to teach myself Python :p).

One big caveat, however : if I understand Dawkins’ original description correctly “[…] each position [in each offspring] has a chance of randomly mutating, based upon a set mutation rate.” (from http://www.antievolution.org/featur[…]asel102.html).

Anders’ version, however, seems to randomly pick a single position in each offspring to be mutated. This seems to be a pretty different algorithm, unless I’m completely misunderstanding either the original idea or his code.

[…]

Yep - that’s correct. My python version picks exactly one position in each offspring. I think I’ll change it to the mutation rate version and update the graphs.

Oh - and excellent choice with the python by the way!

Bethor said:

[…]

One big caveat, however : if I understand Dawkins’ original description correctly “[…] each position [in each offspring] has a chance of randomly mutating, based upon a set mutation rate.” (from http://www.antievolution.org/featur[…]asel102.html).

Anders’ version, however, seems to randomly pick a single position in each offspring to be mutated. This seems to be a pretty different algorithm, unless I’m completely misunderstanding either the original idea or his code.

This doesn’t really invalidate the conclusions regarding locking / non-locking versions but I thought this was worth pointing out (even though the resulting graphs tracking progression towards maximum fitness look much the same in my version with a possibility of mutation on each position of each offspring).

OK, I've now modified my programs so they follow Dawkins' original description more closely: in each generation each position in each offspring has some probability of mutating (so offspring can now have zero, one, or more mutations compared to their parent). I've also updated the plots and links to downloadable software.

Incidentally, I might note that with this setup it is possible to set the mutation rate so low that some offspring in each generation will be the same as their parent. In this case fitness will never decrease (the worst case is to stick with the parental genotype thus maintaining fitness).
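The two mutation schemes under discussion can be put side by side like this (a sketch of the difference; neither is Anders' actual code):

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def mutate_one_position(s):
    # Anders' original scheme: exactly one randomly chosen position
    # is replaced (possibly by the same letter)
    i = random.randrange(len(s))
    return s[:i] + random.choice(ALPHABET) + s[i + 1:]

def mutate_per_position(s, rate=0.04):
    # Dawkins-style scheme: every position independently mutates with
    # probability `rate`, so an offspring may carry 0, 1, or many changes
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in s)
```

In the first scheme the number of changes per offspring is at most one; in the second it is binomially distributed, which is what allows unchanged offspring at low mutation rates.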

It’s great that Ian has posted the discussion of Weasel. It is one of the best pedagogical tools for illustrating the effects of selection without requiring an extensive mathematical or physics background.

If I may summarize a few points I made in some of my previous comments on the other thread covering this, those interested in how this exercise can be integrated into a bigger picture might want to try a few tests with their programs.

One. The selection process represented here is only one of literally thousands of ways to cause a random process to converge to an equilibrium state. In this case, the fact that the string has “meaning” as a recognizable sentence is irrelevant. Any target string could be used.

Two. The solutions to equations (e.g., roots that set the equation equal to zero) are mathematical examples of looking for an “equilibrium” condition. Randomly throwing numbers at the equation is highly unlikely to produce a root. However, pulling candidate roots from domains that are restricted according to how close the equation approached zero when a candidate was selected from a prior set produces a sudden and dramatic convergence to “equilibrium” (locating the root).

Three. Physical processes in which matter condenses into self-organized patterns already have a "built-in mechanism" for selecting candidates for the next round of trials; namely, the second law of thermodynamics.

Four. There are some tests that one can do on any selection model to see if it has any characteristics of physical reality. These rely on some general characteristics of physical systems that are not necessarily characteristic of some artificial or mathematical algorithms. If the outcomes of these tests indicate some of the features of real, physical systems, then the algorithm can be placed in a set of candidate algorithms for use in a more sophisticated model.

Five. A caveat in this process is that some algorithms can mimic physical reality closely enough that they are difficult to distinguish from actual physical mechanisms in a stochastic set of tests.

Given these general notions, here is the general recipe for exploring whether or not an algorithm captures any physical reality.

A Something in the model has to be identified as what physicists refer to as a “generalized force” that drives the system toward equilibrium. It could be a “potential energy” or some kind of mechanism that “pushes” the system in the direction of equilibrium.

B The larger this “generalized force”, the more rapidly the system proceeds toward equilibrium. As the system approaches equilibrium, the generalized force diminishes until it becomes zero at equilibrium.

C In any stochastic process, there are fluctuations in the generalized force. These fluctuations generally increase with the magnitude of the generalized force.

D In realistic systems, one expects the ratio of the magnitude of the fluctuations in the generalized force to the magnitude of the generalized force itself to go as the reciprocal of the square root of the number of contributors to the generalized force. (This assumes that the magnitude of the generalized force is linearly proportional to the number of "carriers" of unit quantities of this generalized force.)

E Therefore, in running the model, it is necessary to capture data not only on the magnitude of the generalized force at each point in the system's evolution, but also on how it fluctuates at each point. This means collecting a representative ensemble of the values of the generalized force at each point and calculating its mean and standard deviation.

F Plotting the magnitudes of the fluctuations against the magnitudes of the generalized force itself on a log-log plot will produce a straight line with a slope of negative one-half if the model is behaving anything like physical reality.

There are many other tests one can make that get into the fractal natures of physical systems, but this is a start for those beginning to play around with these kinds of models.

More generally, however, it is probably better to try to build a model from the ground up based on known or suspected physical mechanisms. However, the exercise with Dawkins’ Weasel program is a legitimate form of investigation because it approaches the problem from the perspective of cataloguing candidate models for later use.

The graph shown here plots the probability that a population will have at least one candidate that preserves all the correct characters from the parent string. The graph shows population sizes from 1 to 500, mutation probabilities from 0.0 to 1.0, and is done for the case where the number of correct characters in the parent is 27. Once the population is around fifty, increases in population size make very little difference.

If one runs a “weasel” with a reasonable population size and mutation rate, it is highly likely that there will be some candidate in the population that at least keeps the correct characters from the parent, and without the use of elitism (where an unmutated daughter copy is inserted into the next generation). The breakover point for the mutation rate is where it starts becoming likely that more than one character will be changed per candidate.
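One way to compute the probability Wesley describes, assuming a 27-character alphabet and that a "mutation" may redraw the same letter (both assumptions of mine, not necessarily his exact model):

```python
def p_population_preserves(correct, rate, offspring, alphabet=27):
    # Chance a single offspring keeps all `correct` matched characters:
    # each must either not mutate, or mutate back to the same letter
    p_keep_char = (1 - rate) + rate / alphabet
    p_keep_all = p_keep_char ** correct
    # Chance at least one of `offspring` candidates keeps them all
    return 1 - (1 - p_keep_all) ** offspring
```

Plotting this for populations from 1 to 500 shows the same flattening he describes: past a population of around fifty, extra candidates add almost nothing.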

Anders said: OK, I've now modified my programs so they follow Dawkins' original description more closely: in each generation each position in each offspring has some probability of mutating (so offspring can now have zero, one, or more mutations compared to their parent).

Great ! Thank you :)

I’ve been playing around with my version of this (it beats fixing bugs in my C++ code for work ;)) and I’ve come to an interesting (to me, at least) observation regarding the mutation rate.

Predictably, increasing the mutation rate increases the “speed” of the algorithm (i.e. fitness increases faster), decreasing the mutation rate slows down the process.

What surprised me however was how quickly a higher mutation rate becomes detrimental (at least on longer target phrases than the fairly simple “methinks it is like a weasel”). I haven’t implemented tracking of beneficial and negative mutations like you have but I suspect they simply cancel each other out in this case.

In retrospect I suppose this should have been obvious; it certainly is if you try to imagine the edge case of a 100% mutation chance: this would essentially eliminate the effects of selection and return us to the "monkeys and typewriters" algorithm.

Anyways, I know this is too contrived and simple an algorithm to draw any parallels to actual biology, but it sure is fun to play with !

Wesley R. Elsberry said: The breakover point for the mutation rate is where it starts becoming likely that more than one character will be changed per candidate.

Looks like you answered my implicit question while I was trying to make my previous post readable… thanks ! :)

Should that headline refer to “Weasles” or “Weasels”? Or is that an intelligently-designed typo?

Wesley R. Elsberry said:

The graph shown here plots the probability that a population will have at least one candidate that preserves all the correct characters from the parent string. The graph shows population sizes from 1 to 500, mutation probabilities from 0.0 to 1.0, and is done for the case where the number of correct characters in the parent is 27. Once the population is around fifty, increases in population size make very little difference.

[…]

Excellent - I’d love to see this analysis. But I seem not to be able to view the plots on this page? (I tried with two different browsers on my mac). Am I lacking some plug-in?

72 hours and still no weasels over at uncommon dissent.

72 hours and still no weasels over at uncommon dissent.

Bill’s a busy man. Soooo much simple evidence to ignore, so little time…

Anders said: Excellent - I’d love to see this analysis. But I seem not to be able to view the plots on this page? (I tried with two different browsers on my mac). Am I lacking some plug-in?

No, the graphs are blocks of black with white areas and fuzzy grey bits at the interface. They might need a border to make them clearer.

Heads-up: on the Python program, it works for 2.7, but on the final line, the print statement should be enclosed in parentheses if you’re running Python 3.0.

Ian Musgrave said:

72 hours and still no weasels over at uncommon dissent.

I’m currently reading Galileo’s “Starry Messenger”, and the debate over at Uncommon Descent reminds me of Francesco Sizzi, who declared that Galileo really couldn’t have seen Moons around Jupiter, as metaphysics demands there only be 7 planets.

I looked at the conversation over there. It’s like being in a hermetically sealed room in which all the air has been sucked out and you can’t find the door.

The graphs are in PNG format. I’m not positive what browsers know about PNG, but Firefox 3 on Mac OS X certainly does.

stevaroni said:

72 hours and still no weasels over at uncommon dissent.

Bill’s a busy man. Soooo much simple evidence to ignore, so little time…

Perhaps he’s using the Darwinian programming methodology. (This is the one where you start with a BASIC program of about the right number of lines, and just change statements at random until it produces the right output :)

Wesley R. Elsberry said:

The graphs are in PNG format. I’m not positive what browsers know about PNG, but Firefox 3 on Mac OS X certainly does.

I’m going over your calculations to see if I can work out what the fluctuations will be. I noticed a probability calculation that might be incorrect but doesn’t appear to play a part in your overall calculation.

Shouldn't P(try is all correct) = (1/K)^L = K^(-L), instead of L^(-K)?

Similarly for P(a try in population is all correct), which just multiplies the above by N.
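The corrected exponent is easy to sanity-check numerically (the K and L values here are my assumptions for a 27-character alphabet and 28-character target):

```python
import math

K = 27  # alphabet size (26 letters plus space; assumed)
L = 28  # length of the target string (assumed)

# (1/K)^L equals K^-L; the exponent in the original notation was flipped
p_all_correct = (1.0 / K) ** L
assert math.isclose(p_all_correct, K ** (-L))
assert K ** (-L) != L ** (-K)
```

This is also the probability that makes pure random sampling hopeless: K^(-L) is vanishingly small for any realistic string.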

Maybe writing even simple programs requires a “pathetic level of detail” that the folks at the Evolutionary Informatics lab are unable to match.

DS said:

Now we know why Dawkins called it a weasel program. We’re all waiting to see Dembski try to weasel out of this again.

“Marge, dont discourage the boy. Weaseling out of things is important to learn. Its what separates us from the animals…except the weasel.” Homer Simpson

Answers in Genesis implemented a fancy version of weasel <note spelling> several years ago, complete with options for population size, fixed vs variable mutation rates, modelling amino-acids, etc. The program worked well (as long as you didn’t mind the copious garbage in the accompanying text) and I still have a copy.

AFAICT AiG did this to 'demonstrate' that in reality you'd never ever reach any optimal solution, because after a while the probability of the next generation containing a fitter individual was the same as the probability that all members of the next generation were less fit than the parent, at which point the simulation would simply oscillate back and forth and never reach the set target. They called this 'error catastrophe'. (They cheated, of course, by selecting a set of options that didn't reflect real life; however, I don't think the person who developed the program got the memo, since everything AiG claimed could rapidly be refuted simply by choosing more appropriate options).

So… why didn’t Dembski just use AiG’s implementation, since it manifestly does not include locking?

Roy

Bruce said:

Should that headline refer to “Weasles” or “Weasels”? Or is that an intelligently-designed typo?

That’s a random mutation. Selection hasn’t acted on it yet.

MememicBottleneck said:

That’s a random mutation. Selection hasn’t acted on it yet.

With just one more mutation, Measles will have evolved from Weasels. Yet another example, of course, that Darwinian evolution can only produce degeneration – it cannot innovate.

I’d add a smiley but I hate those things.

Thanks, Mike. I’ve edited the page.

I checked most things I put up via Monte Carlo, but I apparently thought those were OK as they were.

Wesley R. Elsberry said:

Thanks, Mike. I’ve edited the page.

I checked most things I put up via Monte Carlo, but I apparently thought those were OK as they were.

One would think with all the discretionary time one has in retirement that it would be easy to find big enough blocks of time to work on it. But my schedule always seems to be full.

But playing around with this and some theory is now on my agenda. It has given me some other generalized ideas to pursue also. Good for the brain.

Mike,

None of the probability stuff was complex, but after a while of raising complements to powers it all gets prone to error. Obviously I was prone to error to begin with, but at least for the stuff related to the “locking/latching” issue I was really cautious and made sure that the calculation and the simulation agreed over a range of parameters. At least I think I did… let me know if you see anything else dodgy.

A noble click embiggens the smallest thumbnail?

I’m teaching myself python.

Has anyone tried to write a weasel program in python?

Sure, but then the python ate the weasel, which pretty much ended the experiment.

Henry

learnignpython said:

I’m teaching myself python.

Has anyone tried to write a weasel program in python?

And here. Dave

Dave Thomas said:

learnignpython said:

I’m teaching myself python.

Has anyone tried to write a weasel program in python?

And here. Dave

Oops; my apologies, Dave. Your program was especially pretty.

See Wesley Elsberries superb Java Weasel as an example!

Aaa!

1. s/Elsberries/Elsberry\’s/g

2. s/superb Java Weasel/superb JavaScript Weasel/g

Java is not related to JavaScript.

/Aaa!

Anyone got it in Ruby?

Your argument that a locking version is much harder to program is nonsense. Just check the character against the target string before you allow mutation. If it already matches then do not mutate. A single-line test.
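Brian's one-line test might look like this in Python (my rendering of his suggestion, not his code; target and alphabet are assumed):

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
TARGET = "METHINKS IT IS LIKE A WEASEL"

def mutate_locked(s, rate):
    # Positions already matching the target are exempt from mutation;
    # all other positions mutate with probability `rate`
    return "".join(c if c == t else
                   (random.choice(ALPHABET) if random.random() < rate else c)
                   for c, t in zip(s, TARGET))
```

Note the design cost: the mutation operator itself now consults the target string, rather than leaving that knowledge solely to the fitness function.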

Though all of the programs seem to implement the same fitness function (number of correct letters), I'd like to stress that Dawkins's program would work with other functions as well. In his algorithm, he just states: "The computer examines the mutant nonsense phrases [..] and chooses the one which, however slightly, most resembles the target phrase." This resemblance could mean:

  1. as above, the number of correct letters
  2. the length of the longest correct substring
  3. the number of correct vowels times the number of correct consonants
  4. how many letters are correct at the beginning of the string
  5. etc.

The fitness function could work as an oracle: you send it the strings of a generation and it delivers back (one of) the best string(s) (here the problem of stopping the program arises). But this approach

  • emphasizes the exchange of information between the search algorithm and the search space - something in which Dembski should be interested
  • helps to get rid of the perception that the computer searches something it already knows
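Some of the resemblance measures listed above can be sketched as interchangeable fitness functions (illustrative only; none of the posted programs use these):

```python
def fitness_matches(s, target):
    # 1. number of correct letters
    return sum(a == b for a, b in zip(s, target))

def fitness_longest_run(s, target):
    # 2. length of the longest run of consecutive correct letters
    best = run = 0
    for a, b in zip(s, target):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best

def fitness_prefix(s, target):
    # 4. number of correct letters at the beginning of the string
    n = 0
    for a, b in zip(s, target):
        if a != b:
            break
        n += 1
    return n
```

Any of these can be passed to the selection step in place of the letter-count function; the algorithm itself does not care which notion of "resembles" is used.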

Brian Macker said:

Your argument that a locking version is much harder to program is nonsense. Just check the character against the target string before you allow mutation. If it already matches then do not mutate. A single-line test.

Yes, that’s easy: but you introduce your knowledge of the target phrase for a second time - not only for calculating the fitness…

Henry J said:

“Errors in genetic transcription are ensured by the Second Law. If you lock certain letters, you are saying the SLoT is suspended for them, but not for others.”

What the heck do they think locking has to do with the SLoT? As far as I know, SLoT doesn’t prevent something from staying the same, outside of temperature shifts if it wasn’t in equilibrium. But a change of a letter isn’t a change in temperature.

Henry

Late to the party again. “Latching” could also represent a survival attribute which has become crucial… like the ability to produce amylase in saliva. :-p

About this Entry

This page contains a single entry by Ian Musgrave published on March 19, 2009 5:40 AM.
