Gene duplication versus ID

Gene duplication is often cited as one mechanism by which information and complexity in the genome can increase, leading to biochemical novelty and even irreducibly complex systems. Despite this, ID proponents object to gene duplication as a relevant mechanism. I will explore some of their objections and show how science has addressed, and is still addressing, them. I intend to show that the objections raised by ID proponents are mostly without merit.


Behe mentions in this 2003 interview that

My current work is an attempt to model the evolution of new protein functions through gene duplication. Gene duplication is purported to be a major pathway for the Darwinian evolution of biochemical novelty. However, as in other areas, Darwinists have not closely examined whether gene duplication can realistically do all that they ascribe to it. I hope to help them out in this area by asking those questions.

It seems that Behe may not be familiar with the research on gene duplication when he states that

The hitch, as always, is that Darwinists virtually never explain in any detail how natural selection would actually get from protein A to protein B after the gene for protein A duplicated. After all, gene duplication just leaves you with a second copy of the same gene — nothing different. The problem is, as everyone agrees, that the duplicated gene is much more likely to suffer a deleterious mutation than a beneficial one. Nonetheless, Darwinists hope that the occasional beneficial mutation just might come along. However, they never look very deeply into the matter. It turns out that to acquire some new functions, such as the capacity to bind a new molecule, multiple mutations would be expected to be needed, not just a single mutation. The requirement for multiple mutations would quickly render gene duplication an untenable explanation, since a duplicated gene would be riddled with deleterious mutations before acquiring several positive ones.

The scientific evidence

Force, A., M. Lynch, F.B. Pickett, A. Amores, Y.-L. Yan, and J. Postlethwait. The preservation of duplicate genes by complementary degenerative mutations. Genetics 151:1531-1545. 1999.

Gene duplication is commonly given as the explanation for increases in complexity via the acquisition of new functions. This paper addresses the standard scenario, in which duplication is followed either by an adaptive mutation that leads to the preservation of both genes or by degeneration of one of the copies. Since detrimental mutations are more likely than beneficial ones, the classical model predicts that one of the duplicated genes will usually become a pseudogene. Actual data seem to indicate that the number of functional copies is larger than expected under the classical model, and the authors present an interesting alternative. The alternative explains duplicate gene preservation by the fixation of degenerative mutations rather than of much rarer beneficial mutations. The authors also present data from the zebrafish consistent with this new model.

ABSTRACT The origin of organismal complexity is generally thought to be tightly coupled to the evolution of new gene functions arising subsequent to gene duplication. Under the classical model for the evolution of duplicate genes, one member of the duplicated pair usually degenerates within a few million years by accumulating deleterious mutations, while the other duplicate retains the original function. This model further predicts that on rare occasions, one duplicate may acquire a new adaptive function, resulting in the preservation of both members of the pair, one with the new function and the other retaining the old. However, empirical data suggest that a much greater proportion of gene duplicates is preserved than predicted by the classical model. Here we present a new conceptual framework for understanding the evolution of duplicate genes that may help explain this conundrum. Focusing on the regulatory complexity of eukaryotic genes, we show how complementary degenerative mutations in different regulatory elements of duplicated genes can facilitate the preservation of both duplicates, thereby increasing long-term opportunities for the evolution of new gene functions. The duplication-degeneration-complementation (DDC) model predicts that (1) degenerative mutations in regulatory elements can increase rather than reduce the probability of duplicate gene preservation and (2) the usual mechanism of duplicate gene preservation is the partitioning of ancestral functions rather than the evolution of new functions. We present several examples (including analysis of a new engrailed gene in zebrafish) that appear to be consistent with the DDC model, and we suggest several analytical and experimental approaches for determining whether the complementary loss of gene subfunctions or the acquisition of novel functions are likely to be the primary mechanisms for the preservation of gene duplicates.

The authors distinguish three possible fates for a duplicate gene: nonfunctionalization, neofunctionalization, and subfunctionalization. Subfunctionalization happens when both copies are preserved by complementary degenerative mutations.

Figure 1. Three potential fates of duplicate gene pairs with multiple regulatory regions. The small boxes denote regulatory elements with unique functions, and the large boxes denote transcribed regions. Solid boxes denote intact regions of a gene, while open boxes denote null mutations, and triangles denote the evolution of a new function. Because the model focuses on mutations fixed in populations, the diagram shows the state of a single gamete. In the first two steps, one of the copies acquires null mutations in each of two regulatory regions. On the left, the next fixed mutation results in the absence of a functional protein product from the upper copy. Because this gene is now a nonfunctional pseudogene, the remaining regulatory regions associated with this copy eventually accumulate degenerative mutations. On the right, the lower copy acquires a null mutation in a regulatory region that is intact in the upper copy. Because both copies are now essential for complete gene expression, this third mutational event permanently preserves both members of the gene pair from future nonfunctionalization. The fourth regulatory region, however, may still eventually acquire a null mutation in one copy or the other. In the center, a regulatory region acquires a new function that preserves that copy. If the beneficial mutation occurs at the expense of an otherwise essential function, then the duplicate copy is preserved because it retains the original function.
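The process described in the caption can be illustrated with a small Monte Carlo sketch. This is a deliberately simplified toy under stated assumptions and is not the authors' own analysis: neofunctionalization is left out, every mutation is a null mutation, and a mutation is "fixed" only if it does not remove the last intact copy of a function. The function names and the two rate parameters are hypothetical choices made for the illustration.

```python
import random
from collections import Counter

def simulate_fate(num_regions=4, coding_rate=1.0, regulatory_rate=1.0, rng=None):
    """Follow one duplicate pair until it is either lost or preserved.

    Toy version of the duplication-degeneration-complementation idea.
    Assumptions (not from the paper): only null mutations occur, and a
    mutation that would destroy the last copy of a function is purged.
    """
    rng = rng or random.Random()
    # intact[c][r] is True while regulatory region r of copy c still works.
    intact = [[True] * num_regions for _ in range(2)]
    # A target is (copy, region); region None stands for the coding region.
    targets = [(c, r) for c in range(2) for r in [None] + list(range(num_regions))]
    weights = [coding_rate if r is None else regulatory_rate for _, r in targets]
    while True:
        copy, region = rng.choices(targets, weights=weights)[0]
        other = 1 - copy
        if region is None:
            # A coding null is tolerated only if the other copy covers everything.
            if all(intact[other]):
                return "nonfunctionalization"
            continue
        if not intact[copy][region] or not intact[other][region]:
            continue  # already lost, or would remove the last intact element
        intact[copy][region] = False
        if not any(intact[copy]):
            return "nonfunctionalization"  # this copy has no regulatory function left
        if not all(intact[0]) and not all(intact[1]):
            return "subfunctionalization"  # each copy now needs the other

def fate_frequencies(trials=2000, seed=0, **kwargs):
    """Count how often each fate occurs over many independent duplicate pairs."""
    rng = random.Random(seed)
    return Counter(simulate_fate(rng=rng, **kwargs) for _ in range(trials))

if __name__ == "__main__":
    for regions in (1, 2, 4, 8):
        print(regions, dict(fate_frequencies(num_regions=regions)))
```

In this toy, a gene with a single regulatory region can never be preserved by subfunctionalization, while adding independently mutable regulatory regions opens up that route, which is the qualitative point of the figure.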

See also

Lynch, M., and A. Force. The probability of duplicate-gene preservation by subfunctionalization. Genetics 154: 459-473. 2000

Lynch, M., and A. Force. Gene duplication and the origin of interspecific genomic incompatibility. American Naturalist 156: 590-605. 2000.

Michael Lynch, Martin O’Hely, Bruce Walsh, and Allan Force. The probability of preservation of a newly arisen gene duplicate. Genetics 2001 159: 1789-1804.

Force, A. G., Cresko, W. A., and F. B. Pickett. Informational accretion, gene duplication, and the mechanisms of genetic module parcellation. In Modularity in Development and Evolution (in press), G. Schlosser and G. Wagner. 2002

Examples of gene duplication and innovative functions

antifreeze protein

Figure 4. Likely mechanism by which an ancestral trypsinogen gene was transformed into an AFGP gene. The 5’ end (E1, I1, and small segment of E2) and the 3’ end (I5 3’ splice site and E6) of trypsinogen gene were recruited and linked, and the remainder of the gene deleted (dashed lines and boxes). The Thr-Ala-Ala coding element was duplicated, presumably via slippage at the repetitive (gt)n sequence during replication. The recruited E1 provided the 5’ UTR and signal peptide sequences for the new AFGP gene. The deletion, linking, and amplification events led to a 1-nt frameshift resulting in a termination codon (tga) at the start of the recruited trypsinogen E6 and converting it into the 3’ flanking sequence of the AFGP gene. The spacer sequence (bars filled with zigzagged lines) and additional I1 sequence might be existing sequence in the trypsinogen progenitor gene or acquired through recombinatory events. The Thr-Ala-Ala coding duplicants plus a spacer became amplified de novo to form the new AFGP polyprotein coding region. The regions of identity are illustrated as in Fig. 1. Splice sites in trypsinogen gene are given in italics.

“Origin of antifreeze protein genes: A cool tale in molecular evolution”, John M. Logsdon Jr. and W. Ford Doolittle Proc. Natl. Acad. Sci. USA Vol. 94, pp. 3485-3487, April 1997

“Evolution of anti-freeze glycoprotein from a trypsinogen gene in Antarctic notothenioid fish”, Chen L, DeVries AL, Cheng CC, Proceedings of the National Academy of Sciences 94:3811-16, April 1997

Abstract: Freezing avoidance conferred by different types of antifreeze proteins in various polar and subpolar fishes represents a remarkable example of cold adaptation, but how these unique proteins arose is unknown. We have found that the antifreeze glycoproteins (AFGPs) of the predominant Antarctic fish taxon, the notothenioids, evolved from a pancreatic trypsinogen. We have determined the likely evolutionary process by which this occurred through characterization and analyses of notothenioid AFGP and trypsinogen genes. The primordial AFGP gene apparently arose through recruitment of the 5’ and 3’ ends of an ancestral trypsinogen gene, which provided the secretory signal and the 3’ untranslated region, respectively, plus de novo amplification of a 9-nt Thr-Ala-Ala coding element from the trypsinogen progenitor to create a new protein coding region for the repetitive tripeptide backbone of the antifreeze protein. The small sequence divergence (4-7%) between notothenioid AFGP and trypsinogen genes indicates that the transformation of the proteinase gene into the novel ice-binding protein gene occurred quite recently, about 5-14 million years ago (mya), which is highly consistent with the estimated times of the freezing of the Antarctic Ocean at 10-14 mya, and of the main phyletic divergence of the AFGP-bearing notothenioid families at 7-15 mya. The notothenioid trypsinogen to AFGP conversion is the first clear example of how an old protein gene spawned a new gene for an entirely new protein with a new function. It also represents a rare instance in which protein evolution, organismal adaptation, and environmental conditions can be linked directly.

“Functional Antifreeze Glycoprotein Genes in Temperate-Water New Zealand Nototheniid Fish Infer an Antarctic Evolutionary Origin”, Chi-Hing C. Cheng, Liangbiao Chen, Thomas J. Near, and Yumi Jin, Mol. Biol. Evol. 20(11):1897-1908. 2003

Abstract: The fish fauna of the Antarctic Ocean is dominated by five endemic families of the Perciform suborder Notothenioidei, thought to have arisen in situ within the Antarctic through adaptive radiation of an ancestral stock that evolved antifreeze glycoproteins (AFGPs) enabling survival as the ocean chilled to subzero temperatures. The endemism results from geographic confinement imposed by a massive oceanographic barrier, the Antarctic Circumpolar Current, which also thermally isolated Antarctica over geologic time, leading to its current frigid condition. Despite this voluminous barrier to fish dispersal, a number of species from the Antarctic family Nototheniidae now inhabit the nonfreezing cool temperate coasts of the southern continents. The origin of these temperate-water nototheniids is not completely understood. Since the AFGP gene apparently evolved only once, before the Antarctic notothenioid radiation, the presence of AFGP genes in extant temperate-water nototheniids can be used to infer an Antarctic evolutionary origin. Genomic Southern analysis, PCR amplification of AFGP genes, and sequencing showed that Notothenia angustata and Notothenia microlepidota endemic to southern New Zealand have two to three AFGP genes, structurally the same as those of the Antarctic nototheniids. At least one of these genes is still functional, as AFGP cDNAs were obtained and low levels of mature AFGPs were detected in the blood. A phylogenetic tree based on complete ND2 coding sequences showed monophyly of these two New Zealand nototheniids and their inclusion in the monophyletic Nototheniidae consisted of mostly AFGP-bearing taxa. These analyses support an Antarctic ancestry for the New Zealand nototheniids. A divergence time of approximately 11 Myr was estimated for the two New Zealand nototheniids, approximating the upper Miocene northern advance of the Antarctic Convergence over New Zealand, which might have served as the vicariant event that lead to the northward dispersal of their most recent common ancestor. Similar secondary northward dispersal likely applies to the South American nototheniid Paranotothenia magellanica, which has four AFGP genes in its DNA, but not to the sympatric nototheniid Patagonotothen tessellata, which does not appear to have any AFGP sequences in its genome at all.

Caenorhabditis elegans

“The Structure and Early Evolution of Recently Arisen Gene Duplicates in the Caenorhabditis elegans Genome”, Vaishali Katju and Michael Lynch, Genetics, Vol. 165, 1793-1803, December 2003

Abstract: The significance of gene duplication in provisioning raw materials for the evolution of genomic diversity is widely recognized, but the early evolutionary dynamics of duplicate genes remain obscure. To elucidate the structural characteristics of newly arisen gene duplicates at infancy and their subsequent evolutionary properties, we analyzed gene pairs with ≤10% divergence at synonymous sites within the genome of Caenorhabditis elegans. Structural heterogeneity between duplicate copies is present very early in their evolutionary history and is maintained over longer evolutionary timescales, suggesting that duplications across gene boundaries in conjunction with shuffling events have at least as much potential to contribute to long-term evolution as do fully redundant (complete) duplicates. The median duplication span of 1.4 kb falls short of the average gene length in C. elegans (2.5 kb), suggesting that partial gene duplications are frequent. Most gene duplicates reside close to the parent copy at inception, often as tandem inverted loci, and appear to disperse in the genome as they age, as a result of reduced survivorship of duplicates located in proximity to the ancestral copy. We propose that illegitimate recombination events leading to inverted duplications play a disproportionately large role in gene duplication within this genome in comparison with other mechanisms.

Tubulin genes

“Evolution, Organization, and Expression of α-Tubulin Genes in the Antarctic Fish Notothenia coriiceps: Adaptive Expansion of a Gene Family by Recent Gene Duplication, Inversion, and Divergence”, Sandra K. Parker and H. William Detrich III, J Biol Chem, Vol. 273, Issue 51, 34358-34369, December 18, 1998

To assess the organization and expression of tubulin genes in ectothermic vertebrates, we have chosen the Antarctic yellowbelly rockcod, Notothenia coriiceps, as a model system. The genome of N. coriiceps contains ~15 distinct DNA fragments complementary to α-tubulin cDNA probes, which suggests that the α-tubulins of this cold-adapted fish are encoded by a substantial multigene family. From an N. coriiceps testicular DNA library, we isolated a 13.8-kilobase pair genomic clone that contains a tightly linked cluster of three α-tubulin genes, designated NcGTba, NcGTbb, and NcGTbc. Two of these genes, NcGTba and NcGTbb, are linked in head-to-head (5’ to 5’) orientation with ~500 bp separating their start codons, whereas NcGTba and NcGTbc are linked tail-to-tail (3’ to 3’) with ~2.5 kilobase pairs between their stop codons. The exons, introns, and untranslated regions of the three α-tubulin genes are strikingly similar in sequence, and the intergenic region between the a and b genes is significantly palindromic. Thus, this cluster probably evolved by duplication, inversion, and divergence of a common ancestral α-tubulin gene. Expression of the NcGTbc gene is cosmopolitan, with its mRNA most abundant in hematopoietic, neural, and testicular tissues, whereas NcGTba and NcGTbb transcripts accumulate primarily in brain. The differential expression of the three genes is consistent with distinct suites of putative promoter and enhancer elements. We propose that cold adaptation of the microtubule system of Antarctic fishes is based in part on expansion of the α- and β-tubulin gene families to ensure efficient synthesis of tubulin polypeptides.

“Tandem sequence duplications functionally complement deletions in the D1 protein of Photosystem II”, Kless H, Vermaas W, J Biol Chem 270(28): 16536-165451, July 1995

Obligate photoheterotrophic mutants of the cyanobacterium Synechocystis sp. PCC 6803 that carry deletions of conserved residues in the plastoquinone-binding niche of the D1 protein were used to select for spontaneous mutations that restore photoautotrophic growth. Spontaneous pseudorevertants emerged from two deletion mutants, YNIV and NN, when the cultures were maintained long after the carbon source (glucose) had been depleted from the medium and cells had reached stationary phase. Most pseudorevertants were found to contain tandem duplications of 6-45-base pair DNA sequences located close to the domain carrying the deletion; none of them restored the wild-type sequence. Three pseudorevertants isolated from the YNIV mutant contained a duplication (7-15 codons) of the DNA sequence immediately downstream of the deletion; the protein region encoded by this DNA may include part of the putative de helix, an important constituent of the plastoquinone-binding niche. Three pseudorevertants isolated from the NN mutant contained duplications corresponding to 2-8 amino acid residues adjacent to the site of the deletion. In all six pseudorevertants carrying duplications, the length of the D1 protein in the modified regions was restored to at least the length present in wild type, suggesting that a minimal length of these protein domains may be required for functional integrity. In another photoautotrophic strain isolated from NN, no secondary mutations could be identified in the gene coding for the D1 protein; such mutations apparently reside on another protein subunit of the photosystem II complex. Photosystem II function in the pseudorevertants was altered as compared with wild type in terms of growth and oxygen evolution rates, photosystem II concentration, the semiquinone equilibrium at the acceptor side, and thermostability. A mechanism leading to tandem sequence duplication may involve DNA damage followed by DNA synthesis, strand displacement, and ligation.

“Transposable elements are found in a large number of human protein-coding genes”, Nekrutenko A, Li W-H, Trends in Genetics 17(11):619-621 Nov ‘01

To study the genome-wide impact of transposable elements (TEs) on the evolution of protein-coding regions, we examined 13 799 human genes and found 533 (approximately 4%) cases of TEs within protein-coding regions. The majority of these TEs (approximately 89.5%) reside within ‘introns’ and were recruited into coding regions as novel exons. We found that TE integration often has an effect on gene function. In particular, there were two mouse genes whose coding regions consist largely of TEs, suggesting that TE insertion might create new genes. Thus, there is increasing evidence for an important role of TEs in gene evolution. Because many TEs are taxon-specific, their integration into coding regions could accelerate species divergence.

“Positive Darwinian selection after gene duplication in primate ribonuclease genes”, Zhang J, Rosenberg HF, Nei M, PNAS 95: 3708-3713, Mar ‘98

Evolutionary mechanisms of origins of new gene function have been a subject of long-standing debate. Here we report a convincing case in which positive Darwinian selection operated at the molecular level during the evolution of novel function by gene duplication. The genes for eosinophil cationic protein (ECP) and eosinophil-derived neurotoxin (EDN) in primates belong to the ribonuclease gene family, and the ECP gene, whose product has an anti-pathogen function not displayed by EDN, was generated by duplication of the EDN gene about 31 million years ago. Using inferred nucleotide sequences of ancestral organisms, we showed that the rate of nonsynonymous nucleotide substitution was significantly higher than that of synonymous substitution for the ECP gene. This strongly suggests that positive Darwinian selection operated in the early stage of evolution of the ECP gene. It was also found that the number of arginine residues increased substantially in a short period of evolutionary time after gene duplication, and these amino acid changes probably produced the novel anti-pathogen function of ECP.

“Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey”, Zhang J, Zhang Y-P, Rosenberg HF, Nature Genetics 30:411-415, April ‘02

Although the complete genome sequences of over 50 representative species have revealed the many duplicated genes in all three domains of life, the roles of gene duplication in organismal adaptation and biodiversity are poorly understood. In addition, the evolutionary forces behind the functional divergence of duplicated genes are often unknown, leading to disagreement on the relative importance of positive Darwinian selection versus relaxation of functional constraints in this process. The methodology of earlier studies relied largely on DNA sequence analysis but lacked functional assays of duplicated genes, frequently generating contentious results. Here we use both computational and experimental approaches to address these questions in a study of the pancreatic ribonuclease gene (RNASE1) and its duplicate gene (RNASE1B) in a leaf-eating colobine monkey, douc langur. We show that RNASE1B has evolved rapidly under positive selection for enhanced ribonucleolytic activity in an altered microenvironment, a response to increased demands for the enzyme for digesting bacterial RNA. At the same time, the ability to degrade double-stranded RNA, a non-digestive activity characteristic of primate RNASE1, has been lost in RNASE1B, indicating functional specialization and relaxation of purifying selection. Our findings demonstrate the contribution of gene duplication to organismal adaptation and show the power of combining sequence analysis and functional assays in delineating the molecular basis of adaptive evolution.

“Origin of new genes and source for N-terminal domain of the chimerical gene, jingwei, in Drosophila”, Long M, Wang W, Zhang J, Gene 238: 135-141, Sep 99

This paper deals with a general question posed by the origin of new processed chimerical genes: when a new retrosequence inserts into a new genome position, how does it become activated and acquire novel protein function by recruiting new functional domains and regulatory elements? Jingwei (jgw), a newly evolved functional gene with a chimerical structure in Drosophila, provides an opportunity to examine such questions. The source of its exon encoding C-terminal peptide has been identified as an Adh retrosequence, which extends the concept of exon shuffling from recombination to retroposition as a general molecular mechanism for the origin of a new gene. However, the origin of 5’ exons remains unclear. We examined two hypotheses concerning the origin of these non-Adh-derived jgw exons: (i) these exons might originate from a unique genomic sequence that fortuitously evolved a standard intron-exon structure and regulatory sequence for jgw; (ii) these exons might be a duplicate of an unrelated previously existing gene. Genomic Southern analysis, in conjunction with construction and screening of a genomic bookshelf (sub-library), was conducted in a group of Drosophila species. The results demonstrated that there are duplicate genes containing the same structure as the recruited portion of jgw. We name this duplicate gene in Drosophila teissieri and Drosophila yakuba and its orthologous gene in Drosophila melanogaster as yellow-emperor (ymp). Thus, the 5’ exons/introns originated from a previously existing gene that provided new modules with specific sub-function to create jgw.

Links adapted from Here

Scale free networks

In addition, protein networks and RNA networks can be characterized as ‘scale free’. Remarkably, a simple model involving gene duplication can explain the structure of these networks.

“Scale free” was a term first coined by Albert-László Barabási.

Powerlaw website


Barabási is a professor of physics and director of the Study of Self-Organized Networks at Notre Dame.

See also their Cellular networks publication page.

Barabasi, A. and Albert, R. Emergence of scaling in random networks. Science 286, 509-512. 1999.

Barabasi, A. and Bonabeau, E. Scale-Free Networks. Scientific American 288, 60-69. 2003.

Various papers explore how simple models based on gene duplication can lead to networks with statistics similar to those found in nature.

Bhan A, Galas DJ, Dewey TG. A duplication growth model of gene expression networks. Bioinformatics. 2002 Nov;18(11):1486-93.

The overall structure of these biological networks is distinctly different from that of other recently studied networks such as the Internet or social networks. These biological networks show hierarchical, hub-like structures that have some properties similar to a class of graphs known as small world graphs. Small world networks exhibit local cliquishness while exhibiting strong global connectivity. In addition to the small world properties, the biological networks show a power law or scale free distribution of connectivities. An inverse power law, N(k) ~ k^(-3/2), for the number of vertices (genes) with k connections was observed for three different data sets from yeast. We propose network growth models based on gene duplication events. Simulations of these models yield networks with the same combination of global graphical properties that we inferred from the expression data.

V. van Noort, B. Snel, and M. A. Huynen. The yeast coexpression network has a small-world, scale-free architecture and can be explained by a simple model. EMBO Rep., March 1, 2004; 5(3): 280-284.

Evolutionary model of transcription regulation. The evolutionary model consists of a few simple mechanisms. (A) A genome is initiated with 25 genes with random TFBSs, represented by the small coloured shapes. (B) Possible events are as follows: (1) Gene A is duplicated, gene A’ has the same TFBS as its duplicate gene A; the duplicates are coexpressed. (2) Gene deletion. (3) Gene A acquires a new TFBS from gene B. The probability of obtaining a specific TFBS is proportional to its frequency in the genome. The probability of a novel TFBS is (150 - total number of different TFBSs present)/(150+total number of TFBSs). (4) One of the TFBSs of gene A is deleted. (C) A network is constructed by connecting genes that share TFBSs.
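The four events in the caption are simple enough to sketch in code. What follows is a minimal illustration under stated assumptions, not the authors' implementation: the caption gives the 25 starting genes, the constant 150, and the formula for a novel TFBS, but the number of TFBSs per starting gene, the relative event rates, and all function names are hypothetical choices made here.

```python
import random
from itertools import combinations

MAX_TFBS_TYPES = 150  # constant taken from the formula in the caption

def initial_genome(num_genes=25, sites_per_gene=2, rng=None):
    """25 genes with random TFBSs; sites_per_gene is an assumption."""
    rng = rng or random.Random()
    return [set(rng.sample(range(MAX_TFBS_TYPES), sites_per_gene))
            for _ in range(num_genes)]

def duplicate_gene(genome, index):
    """Event 1: the copy starts with the same TFBSs as its parent."""
    genome.append(set(genome[index]))

def delete_gene(genome, index):
    """Event 2: gene deletion (the last gene is kept so the genome survives)."""
    if len(genome) > 1:
        del genome[index]

def acquire_tfbs(genome, index, rng):
    """Event 3: gain a TFBS, novel or copied in proportion to its frequency."""
    all_sites = [site for gene in genome for site in gene]
    present = set(all_sites)
    p_novel = (MAX_TFBS_TYPES - len(present)) / (MAX_TFBS_TYPES + len(all_sites))
    if rng.random() < p_novel:
        site = rng.choice(sorted(set(range(MAX_TFBS_TYPES)) - present))
    elif all_sites:
        site = rng.choice(all_sites)  # frequent TFBSs are picked more often
    else:
        return
    genome[index].add(site)

def lose_tfbs(genome, index, rng):
    """Event 4: one TFBS of the gene is deleted."""
    if genome[index]:
        genome[index].remove(rng.choice(sorted(genome[index])))

def evolve(genome, steps, rng, weights=(1.0, 1.0, 1.0, 1.0)):
    """Apply random events; the equal weights are an illustrative assumption."""
    events = ("duplicate", "delete", "gain", "lose")
    for _ in range(steps):
        event = rng.choices(events, weights=weights)[0]
        index = rng.randrange(len(genome))
        if event == "duplicate":
            duplicate_gene(genome, index)
        elif event == "delete":
            delete_gene(genome, index)
        elif event == "gain":
            acquire_tfbs(genome, index, rng)
        else:
            lose_tfbs(genome, index, rng)
    return genome

def build_network(genome):
    """Panel C: connect every pair of genes that share at least one TFBS."""
    return {(i, j) for i, j in combinations(range(len(genome)), 2)
            if genome[i] & genome[j]}
```

Calling `build_network(evolve(initial_genome(), 1000, random.Random(0)))` gives an edge set whose degree distribution can then be inspected; whether it resembles the published network will depend on the event rates, which are not specified in the caption.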

Additionally, such scale free networks can help explain the modularity, robustness, evolvability, and degeneracy found in nature.

For instance:

Modularity “for free” in genome architecture? by Ricard V. Sole and Pau Fernandez

Recent models of genome-proteome evolution have shown that some of the key traits displayed by the global structure of cellular networks might be a natural result of a duplication- diversification (DD) process. One of the consequences of such evolution is the emergence of a small world architecture together with a scale-free distribution of interactions. Here we show that the domain of parameter space where such structure emerges is related to a phase transition phenomenon. At this transition point, modular architecture spontaneously emerges as a byproduct of the DD process. Although DD models lack any functionality and are thus free from meeting functional constraints, they show the observed features displayed by the real proteome maps when tuned close to a sharp transition point separating a highly connected graph from a disconnected system. Close to such a boundary, the maps are shown to display scale-free hierarchical organization, behave as small worlds, and exhibit modularity. It is conjectured that natural selection tuned the average connectivity in such a way that the network reaches a sparse graph of connections. One consequence of such a scenario is that the scaling laws and the essential ingredients for building a modular net emerge for free close to such a transition.
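A duplication-diversification process of the kind the abstract refers to can be sketched in a few lines. This is a generic toy version under my own assumptions rather than the authors' exact model: a random node is duplicated, each inherited link is dropped with probability `delta`, and new links to other nodes are added with probability `alpha / N`. The parameter values and function names are illustrative only.

```python
import random
from collections import Counter

def grow_network(target_size=200, delta=0.55, alpha=0.06, rng=None):
    """Grow an undirected graph by repeated duplication and diversification.

    delta: probability that the copy loses a link inherited from its parent.
    alpha: controls the rate at which the copy gains links to unrelated nodes.
    Both values are illustrative assumptions.
    """
    rng = rng or random.Random()
    adjacency = {0: {1}, 1: {0}}  # seed graph: two linked nodes
    while len(adjacency) < target_size:
        size = len(adjacency)
        parent = rng.randrange(size)
        new_node = size
        # Duplication: the copy inherits the parent's partners, minus losses.
        inherited = {p for p in adjacency[parent] if rng.random() > delta}
        # Diversification: occasional new links to arbitrary existing nodes.
        gained = {v for v in adjacency if rng.random() < alpha / size}
        adjacency[new_node] = set()
        for partner in inherited | gained:
            adjacency[new_node].add(partner)
            adjacency[partner].add(new_node)
    return adjacency

def degree_distribution(adjacency):
    """Map each degree k to the number of nodes with exactly k links."""
    return Counter(len(neighbours) for neighbours in adjacency.values())

if __name__ == "__main__":
    network = grow_network(target_size=1000, rng=random.Random(0))
    for degree, count in sorted(degree_distribution(network).items()):
        print(degree, count)
```

Nothing in the toy assigns any function to a node or a link, which mirrors the abstract's point that the model is free of functional constraints; the degree distribution it prints depends strongly on `delta` and `alpha`.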

Yaneer Bar-Yam and Irving R. Epstein. Response of complex networks to stimuli. PNAS March 30, 2004 vol. 101 no. 13 4341-4345

We consider the response of complex systems to stimuli and argue for the importance of both sensitivity, the possibility of large response to small stimuli, and robustness, the possibility of small response to large stimuli. Using a dynamic attractor network model for switching of patterns of behavior, we show that the scale-free topologies often found in nature enable more sensitive response to specific changes than do random networks. This property may be essential in networks where appropriate response to environmental change is critical and may, in such systems, be more important than features, such as connectivity, often used to characterize network topologies. Phenomenologically observed exponents for functional scale-free networks fall in a range corresponding to the onset of particularly high sensitivities, while still retaining robustness.

These data show that claims from ID proponents that (Darwinian) evolutionary mechanisms cannot explain information, complexity, and innovation are without much merit.

In this 2002 paper Lynch explores gene duplication and evolution based on the paper by Jeffrey A. Bailey, Zhiping Gu, Royden A. Clark, Knut Reinert, Rhea V. Samonte, Stuart Schwartz, Mark D. Adams, Eugene W. Myers, Peter W. Li, and Evan E. Eichler, Science 2002 297: 1003-1007.

Co-option and gene duplication appear to be quite important evolutionary mechanisms.

Co-option occurs when natural selection finds new uses for existing traits, including genes, organs, and other body structures. Genes can be co-opted to generate developmental and physiological novelties by changing their patterns of regulation, by changing the functions of the proteins they encode, or both. This often involves gene duplication followed by specialization of the resulting paralogous genes into particular functions. A major role for gene co-option in the evolution of development has long been assumed, and many recent comparative developmental and genomic studies have lent support to this idea. Although there is relatively less known about the molecular basis of co-option events involving developmental pathways, much can be drawn from well-studied examples of the co-option of structural proteins. Here, we summarize several case studies of both structural gene and developmental genetic circuit co-option and discuss how co-option may underlie major episodes of adaptive change in multicellular organisms. We also examine the phenomenon of intraspecific variability in gene expression patterns, which we propose to be one form of material for the co-option process. We integrate this information with recent models of gene family evolution to provide a framework for understanding the origin of co-optive evolution and the mechanisms by which natural selection promotes evolutionary novelty by inventing new uses for the genetic toolkit

Gene co-option in physiological and morphological evolution. True JR, Carroll SB. Annu Rev Cell Dev Biol. 2002;18:53-80.


For example, it is becoming clear that co-option has played a critical role in evolution and the homeotic genes are not exempt in this regard. To demonstrate this point we can point to the expression pattern of the homeotic gene Ubx in various arthropod groups and the most basic morphology of the segments within its expression domains.

Understanding the genetic basis of morphological evolution: the role of homeotic genes in the diversification of the arthropod bauplan. Aleksandar Popadic, Arhat Abzhanov, Douglas Rusch and Thomas C. Kaufman. Int. J. Dev. Biol. 42: 453-461 (1998)


Like Dembski, Johnson is very fond of information theory. He is quite emphatic that natural selection acting on chance variations can not significantly increase the information content of the genome. Johnson offers a crude caricature of the arguments made in Dawkins’ article (41), but offers no explanation of why gene duplication with subsequent divergence can not account for the growth in genetic information. The closest he comes to addressing the subject is the following quote, in which he recounts a discussion with mathematical physicist Paul Davies:

“When I asked Davies about this, his reply gave me the impression that he thinks that natural selection increases genetic information by preserving copies that are made in the reproductive process. I am afraid this misses the point. When two rabbits reproduce there are more rabbits, but there is not any increase in information in the relevant sense. If you need to write out the full text of the encyclopedia and have only page one, you cannot make progress toward your goal by copying page one twenty times.” (59)

In reply I will simply quote John Maynard Smith and Eors Szathmary, from their book The Major Transitions in Evolution: “The mere duplication of a gene adds no new information, but the divergence of the two copies does so.”

Design detectives
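
The point quoted above from Maynard Smith and Szathmary can be illustrated with a rough sketch. It uses compressed size as a crude stand-in for information content, which is an assumption of this illustration and not a measure used in any of the sources cited here. The random "gene", the 10% divergence rate, and all function names are hypothetical choices made only for the example.

```python
import random
import zlib

BASES = "ACGT"


def random_gene(length, rng):
    """Build a hypothetical random DNA sequence (illustration only)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def diverge(gene, rate, rng):
    """Return a copy of gene in which each base is substituted with probability rate."""
    out = []
    for base in gene:
        if rng.random() < rate:
            # Substitute a different base to model a point mutation.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def compressed_size(seq):
    """Compressed length in bytes, used here as a crude proxy for information."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def compare_sizes(length=2000, rate=0.10, seed=1):
    """Compare one gene, an exact duplicate pair, and a diverged duplicate pair."""
    rng = random.Random(seed)
    gene = random_gene(length, rng)
    diverged_copy = diverge(gene, rate, rng)
    return {
        "original": compressed_size(gene),
        "duplicated": compressed_size(gene + gene),
        "diverged": compressed_size(gene + diverged_copy),
    }


if __name__ == "__main__":
    for name, size in compare_sizes().items():
        print(name, size)
```

Under these assumptions, the exact duplicate should compress to barely more than the single gene, because the second copy is redundant, while the diverged pair should compress noticeably worse, because the mutations make the second copy carry content that the first does not. This says nothing about whether the new content is functional; that is where selection comes in.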

Creationist arguments

Some websites seem to be unaware of the scientific data:

Fourth, we see the apparent inability of mutations to truly contribute to the origin of new structures. The theory of gene duplication in its present form is unsuitable to account for the origin of new genetic information that is a must for any theory of evolutionary mechanism.

Rebuttals to Common Criticisms of the Book Darwin’s Black Box Robert DiSilvestro, Ph.D. (also found on various other websites)

Rebuttal to criticism # 2. To develop the specialized functions, the duplicated genes still had to evolve structural changes. What drove the changes? In all likelihood, a number of specializations would have had to develop simultaneously to have any value. This brings everything back to the mouse trap analogy. The only refinement is that some parts of the mouse trap would have some structural similarities.

An additional concern here is the high probability of the evolving genes messing up the original system. This is very likely with an abundance of structurally similar gene products. If one of these gene products becomes nonfunctional, it could get in the way of the function of original gene product. This phenomena is readily observed today. For instance, the chemotherapy drug methotrexate looks like the B-vitamin folacin, but does not work like it. The drug will compete with the real vitamin for binding to functional sites, but will not actually function. This action kills cancer cells.

Another problem with gene duplication is that it doesn’t account for all, or even most, of the complexity in many systems. For example, the complexities of oxygen transport involve many genes which are not structurally similar. This is obvious when one considers anemia, a breakdown in oxygen transport. When I teach nutrition courses, I sometimes ask: how many different mechanisms can cause anemia? There are many causes which involve molecules with little or no overlap in structure.

DiSilvestro’s ‘arguments’ include the point that not all complexity seems to have arisen through gene duplication, but this is irrelevant, since the argument was never that all complexity arises this way. DiSilvestro also raises “an additional concern” about “the high probability of the evolving genes messing up the original system” but provides few references to help understand the relevance of this claim, other than a reference to a chemotherapy drug. It should be obvious that natural selection would quickly deal with such cases: a duplicate whose product interferes with the original gene product would lower the fitness of its carrier and would tend to be eliminated.