Rumors of pseudogenes' demise greatly exaggerated, new study says

I imagine that reading scientific journals is mostly a drag for ID advocates: all those papers highlighting evolutionary mechanisms, identifying transitional fossils, and verifying phylogeny-based predictions must be really irritating. Every so often, however, a paper comes along that sets the ID advocates’ hearts all aflutter, and when one appears, they make sure to milk it for all its P.R. worth. One recent example was a 2003 paper by Hirotsune and colleagues in Nature [1], which reported that altering the pattern of expression of a purported mouse pseudogene (i.e. an apparently inactivated, non-functional gene, part of the so-called “junk DNA”) unexpectedly modifies the activity of its functional counterpart, leading to a series of dramatic developmental defects.

Now, one of the recurrent claims of Intelligent Design is that most if not all of the features of any organism, including its entire genome, should somehow be useful. Thus, ID advocates enthusiastically latched onto Hirotsune’s publication as supporting their claim that “junk DNA” is, in fact, not junk at all (indeed, if you Google for “Hirotsune” and “Intelligent design”, you will find over 100 hits). Alas for them, a brand new paper by Todd Gray and colleagues in the Proceedings of the National Academy of Sciences [2] appears to completely refute the original findings.

As usual, let me give the background first. Pseudogenes can be defined as the remnants of once-coding DNA sequences that have undergone a more or less significant loss of their ability to encode any product. There are two main sources of pseudogene formation (see figure 1). The first is traditional gene duplication, followed by the loss of functionality by one of the duplicated members. The second is a mechanism called retrotransposition - in short, the messenger RNA of a gene, instead of being used for protein synthesis, is reverse-transcribed into a DNA sequence and inserted into the genome, forming so-called “retrogenes”. Unlike pseudogenes arising from gene duplication, which often retain the original regulatory elements required for gene expression (promoters/enhancers), retrogenes lack regulatory sequences, and therefore their only chance of becoming expressed is when they integrate in proximity of some other gene’s promoter (rather unlikely, although not impossible) [3,4]. If they don’t become expressed and acquire some selectable function, they quickly degenerate, turning into retrotransposed or “processed” pseudogenes. A third, rarer instance is that of functional genes losing their function in certain lineages, but not others (“unitary” pseudogenes - a classic example is the independent inactivation of the gene for vitamin C synthesis in primates and guinea pigs).

Figure 1: Origins of pseudogenes. A. Retrotransposed pseudogenes: starting from the original gene (the coding sequences are in black, the non-coding introns in gray, and the promoter element is indicated by the large arrow upstream of the gene), transcription generates a primary mRNA (black and gray broken line), from which the introns are excised by RNA splicing. This mature mRNA, which contains only exons and a poly-adenosine tail, is transcribed back into DNA by enzymes called reverse transcriptases, and the DNA is reinserted into the genome. Hence, the resulting pseudogene will lack intron and promoter sequences, and will bear characteristic repeat sequences at the insertion site, due to the integration mechanism. B. Duplicated pseudogenes: DNA duplication generates a more-or-less faithful copy of the original gene, including introns and, in many cases, promoter and other transcriptional regulatory elements. In most cases, this duplicated gene will undergo crippling, inactivating mutations and turn into a pseudogene (in rarer cases, the duplicated copy will acquire new functions and become a new gene). (Adapted from [3].)
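For readers who think better in code, the two routes described above can be caricatured in a few lines of Python. This is only a toy sketch under heavy simplifying assumptions: the `GeneCopy` class, the function names and the made-up sequences are all hypothetical, the duplication is assumed to carry the promoter along (which, as noted, is not always the case), and real pseudogene annotation relies on sequence alignment rather than on tidy flags like these.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GeneCopy:
    """Toy gene model: ordered (kind, sequence) segments plus two flags."""
    segments: tuple          # each item is ("exon", seq) or ("intron", seq)
    has_promoter: bool
    has_poly_a: bool = False


def duplicate(gene):
    """DNA-level duplication: introns and (in this toy model) the promoter are copied."""
    return replace(gene)


def retrotranspose(gene):
    """mRNA-mediated copy: introns are spliced out, a poly-A tail is present,
    and the upstream promoter is not carried along."""
    exons_only = tuple(seg for seg in gene.segments if seg[0] == "exon")
    return GeneCopy(segments=exons_only, has_promoter=False, has_poly_a=True)


def likely_origin(copy):
    """Crude guess at how a copy arose, using only the hallmarks named in the text."""
    has_introns = any(kind == "intron" for kind, _ in copy.segments)
    if not has_introns and copy.has_poly_a and not copy.has_promoter:
        return "retrotransposed (processed)"
    return "duplicated"


# Hypothetical parent gene: two exons separated by one intron (invented sequences).
parent = GeneCopy(
    segments=(("exon", "ATGGCC"), ("intron", "GTAAGT"), ("exon", "TTCTAA")),
    has_promoter=True,
)
dup = duplicate(parent)
retro = retrotranspose(parent)
print(likely_origin(dup))
print(likely_origin(retro))
```

The point of the caricature is simply that the two kinds of copy carry different structural scars, which is why they can usually be told apart in a genome long after they have stopped doing anything.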

Thus, pseudogenes - and especially retrotransposed pseudogenes - are generally considered to be non-functional relics and, together with other sorts of repetitive and “selfish” DNA elements, as well as other unique DNA sequences, form the so-called “junk DNA”. (For a more general discussion of “junk DNA”, see Ian Musgrave’s discussion here at PT.) Indeed, when the pseudogenes can be followed over evolutionary lineages, they appear to evolve neutrally, accumulating mutations progressively and freely until they become almost unrecognizable, or disappear from the genome altogether. Note that the number of pseudogenes in the human genome (20,000 or so at the latest count, many of them crippled viral elements) is comparable to that of our functional genes - an impressive amount.

Despite its connotations, the phrase “junk DNA” (originated by Susumu Ohno in 1972) is not intended to convey an absolute and irreversible lack of function. Indeed, as is often noted, had that been the case “garbage DNA” would have been a better term. In fact, “junk” is what accumulates in people’s basements and attics, not immediately useful but not nasty or burdensome enough to be quickly discarded – indeed, something that may occasionally be found to be of use (at least, that’s what I tell my wife). Another problem with the term is that it is unfortunately often misused (in the lay press and especially by Creationists, although some scientists are guilty as well) to simply denote DNA that does not directly encode any protein sequence - which is absolutely wrong. It has long been known, in some cases even before the term was coined, that DNA contains important non-coding elements involved in gene transcription (e.g. the promoter and enhancer elements mentioned above), RNA splicing and polyadenylation, chromosome dynamics, etc. In addition, instances exist where the sequence of a particular stretch of DNA is irrelevant, but its presence may be important, as in the case of introns, certain “spacer” regions, and so on. Still, while it is clear that the term “junk DNA” should be used advisedly (if at all), there are good reasons to think that large swaths of the genome of most eukaryotic organisms are indeed non-functional. For one, these stretches of DNA accumulate mutations neutrally, and diverge much faster than known functional elements. For another, vast differences in DNA amount, large duplications/deletions of intergenic regions, and the gain and loss of specific pseudogenes are often observed in closely related organisms (see again Ian’s piece).

Ironically, the term “junk DNA” was originally often used as pretty much a swipe by supporters of the “neutral theory” (which argues that much of what goes on in the genome is actually non-adaptive and not subject to selective forces) against the more strictly neo-Darwinian “pan-adaptationists”, for whom the long hand of natural selection reaches every nook and cranny of an organism’s phenotype and genotype, continually getting rid of even mildly noxious or just useless features, like some obsessive-compulsive cleaner. With time, however, the accumulating evidence for selective neutrality of large parts of the genome convinced even the stricter Darwinians, and only Creationists and ID supporters have remained to argue the ultra-Darwinian pan-adaptationist position (although of course from different premises).

ID advocates in particular have often discussed junk DNA as an important issue for their “theory”. For instance, in his quasi-peer-reviewed paper in the Proceedings of the Biological Society of Washington, Stephen Meyer claimed that

Advocates of the design hypotheses on the other hand, would have predicted that non-coding regions of the genome might well reveal hidden functions, not only because design theorists do not think that new genetic information arises by a trial and error process of mutation and selection, but also because designed systems are often functionally polyvalent. (Meyer SC, Proc Biol Soc Wash 117:213-239. 2004)

Jonathan Wells wrote

From an ID perspective, however, it is extremely unlikely that an organism would expend its resources on preserving and transmitting so much “junk”. (Wells J, PCID, Vol 3.1, 2004) [Note here the similarity to the ultra-Darwinian argument: carrying “junk DNA” looks like a waste of resources, hence maladaptive for the pan-adaptationist, or “bad engineering” for the ID advocate. AB]

which prompted the Discovery Institute’s Casey Luskin to boldly venture (while lawyerly hedging his bets) that ID predicts that

Much so-called “junk DNA” will turn out to perform valuable functions.

ID connoisseurs would perceptively note here that these strong statements about “junk DNA”’s function seem to contrast with the shyness of ID advocates regarding the attributes, goal and identity of the Designer (why couldn’t She be a compulsive junk collector, as far as Her critters’ genomes go, after all?). But ID advocates have of course at least two good reasons to support this view: first, the admission of a careless or incompetent Designer would constitute a bad P.R. move with the movement’s religious supporters, and second, if uniquely identifiable junk DNA elements, like pseudogenes, are indeed non-functional, then their observed transmission along evolutionary lines (see for instance this paper) is powerful evidence of common descent, with which most ID advocates have yet to make peace.

Given this context, Hirotsune’s paper in 2003 must have seemed like a dream come true. The paper’s findings were actually serendipitous, but striking: while generating a transgenic mouse strain (i.e. a strain of mice in which an artificial gene has been added to the genome, in order to study its effects), Hirotsune and colleagues realized that one of their strains had an unusual phenotype, in which various developmental defects in the kidneys and bones appeared at very high frequency, but only when the transgene was inherited from the father (genes that are expressed differentially when they are inherited maternally or paternally are called “imprinted”). Looking at the genomic region in which the transgene was inserted, the authors found a processed pseudogene (Makorin-1p1, or Mkrn1-p1 in genetic notation) with similarity to an expressed, functional gene called Mkrn1.

By further characterizing the phenomenon at the molecular level, the authors claimed that the Mkrn1-p1 pseudogene showed transcription into RNA only when inherited paternally (i.e., it was itself imprinted), that this expression was diminished by insertion of the transgene in its proximity, that the Mkrn1-p1 RNA product could regulate expression of the Mkrn1 functional gene by affecting stability of its mRNA, and that the phenotype due to Mkrn1-p1 suppression could be rescued by enforcing expression of either Mkrn1 or Mkrn1-p1 RNA. In short, they claimed to have demonstrated that RNA from a processed pseudogene can play a regulatory role in the expression of its ancestral, protein-coding gene counterpart.

Now, this is without doubt a very interesting finding, although of limited general applicability (as I mentioned, the majority of retrogenes do not give rise to any RNA), and of course ID advocates jumped on it with a vengeance. In the wake of the Hirotsune paper, Mike Behe even submitted a letter to Nature, which declined to publish it. Among other things, the letter stated:

The modern molecular example of poor design is pseudogenes. Why litter a genome with useless, broken copies of functional genes? It looks just like the aftermath of a blind, wasteful process. No designer would have done it that way.(2) Yet Hirotsune et al (3) show that at least one pseudogene has a function. If at least some pseudogenes have unsuspected functions, however, might not other biological features that strike us as odd also have functions we have not yet discovered? Might even the backwards wiring of the vertebrate eye serve some useful purpose?

Cautionary Note: if you wish to read the entire text, please make sure to shut off your irony meters first: in the letter, Behe sternly warns the scientific community against the perils of purely negative argumentation, and chastises them for naively trusting their “intuition” regarding biological function - rather cheeky, for someone whose main arguments against evolutionary theory are that we don’t have a complete mutation-by-mutation model of the evolution of certain biological structures, and that

The strong appearance of design allows a disarmingly simple argument: if it looks, walks and quacks like a duck, then, absent compelling evidence to the contrary, we have warrant to conclude it’s a duck. (Michael Behe, “Design for Living”, The New York Times, 2/7/2005)

Anyway, back to the main story. Although ID advocates are keen to claim that the “Darwinian orthodoxy” routinely suppresses or ignores inconvenient results, Hirotsune’s paper caused quite a splash, and rapidly accumulated over 100 citations in the scientific literature, most endorsing the new model of “pseudogene trans-regulation” (trans here is lingo for “acting on another molecule or chromosome”) proposed by the authors [e.g. 4,5]. Other scientists followed up the lead by investigating the conservation of the putative regulatory portion of Mkrn1-p1, albeit with mixed results [6, 7]. Still, there were some reasons for skepticism – for instance, the Mkrn1-p1 retrogene, despite its purported crucial role in mice, is absent from the genome of all other mammals tested, including closely related rats. In addition, the Makorin gene family counts at least 3 functional members in the mouse (including, of interest, a bona fide functional retrogene, which is transcribed, conserved in mammals, and encodes the Makorin-3 protein), as well as numerous pseudogenes, which can complicate molecular and sequence analysis. Finally, previous studies on mice with chromosomal alterations including the segment near the Mkrn1-p1 gene suggested that no imprinted gene with deleterious effects was present in the region.

Enter Gray and colleagues, who in their PNAS paper systematically re-analyzed the original story and tested some of its predictions. Their findings consistently contradicted the Nature paper’s conclusions: they found that the Mkrn1-p1 pseudogene is not transcribed at all, and that the RNA attributed to the pseudogene by Hirotsune is actually a variant form of transcripts from the functional gene; that the pseudogene’s DNA is extensively modified by methylation, a known hallmark of transcriptional inactivity; that neither Mkrn1 nor Mkrn1-p1 is imprinted; and finally that inactivation of the functional Mkrn1 gene does not bring about the changes observed in Hirotsune’s transgenic mice. Gray and colleagues therefore conclude that Hirotsune’s data were mostly artifactual, and (quite generously) propose some alternative explanations for how those findings came about.

Where does this leave us with regard to pseudogenes? Actually, pretty much where we were before the Gray paper came out. If you take away the hype and ignore the wishful thinking of ID supporters, the evidence still overwhelmingly supports the notion that many, likely most, pseudogenes are functionless, and it does so regardless of the validity of Hirotsune’s findings. Indeed, if one assumes that evolutionary conservation of DNA sequences is a strong hallmark of potential function, then a recent study by a Swedish group shows that at best a few dozen of the thousands of pseudogenes in the human and mouse genomes are under sufficient selective pressure to be highly conserved between the two lineages, suggesting they may be functional [8]. Still, there is ample room for potentially interesting mechanisms by which pseudogenes can on occasion be recruited into regulatory and structural functions.

There is, of course, also an important lesson about science here: Hirotsune’s provocative, out-of-the-mainstream findings were not rejected on principle, but were given wide exposure, embraced by some as explanatory of certain processes, put to the test by others, and invalidated. Of course, this will apply to Gray’s data as well – it is now up to Hirotsune and his supporters to test the new findings and explain them away, or accept them. Stay tuned.

Acknowledgements: Thanks to Ian, Nick, Douglas, Reed, Dunk, Erik and the rest of the PT crew for useful comments and suggestions.


  1. Hirotsune S, Yoshida N, Chen A, Garrett L, Sugiyama F, Takahashi S, Yagami K, Wynshaw-Boris A, Yoshiki A. An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene. Nature. 2003;423:91-6.

  2. Gray TA, Wilson A, Fortin PJ, Nicholls RD. The putatively functional Mkrn1-p1 pseudogene is neither expressed nor imprinted, nor does it regulate its source gene in trans. Proc Natl Acad Sci USA. 2006 Aug 1 [Epub ahead of print].

  3. D’Errico I, Gadaleta G, Saccone C. Pseudogenes in metazoa: origin and features. Brief Funct Genomic Proteomic. 2004;3:157-67.

  4. Balakirev ES, Ayala FJ. Pseudogenes: are they “junk” or functional DNA? Annu Rev Genet. 2003;37:123-51.

  5. Zhang Z, Gerstein M. Large-scale analysis of pseudogenes in the human genome. Curr Opin Genet Dev. 2004;14:328-35.

  6. Podlaha O, Zhang J. Nonneutral evolution of the transcribed pseudogene Makorin1-p1 in mice. Mol Biol Evol. 2004;21:2202-9.

  7. Kaneko S, Aki I, Tsuda K, Mekada K, Moriwaki K, Takahata N, Satta Y. Origin and evolution of processed pseudogenes that stabilize functional Makorin1 mRNAs in mice, primates and other mammals. Genetics. 2006;172:2421-9.

  8. Svensson O, Arvestad L, Lagergren J. Genome-wide survey for biologically functional pseudogenes. PLoS Comput Biol. 2006;2:e46.