A little cis story

I found a recent paper in Nature fascinating, but it's hard to explain why: you need to understand a fair amount of general molecular biology and development to see what's interesting about it. Those of you who already do may be a little bored with this explanation, because I've got to build it up slowly and hope I don't lose everyone else along the way. Patience! If you're a real smartie-pants, just jump ahead and read the original paper in Nature.

A little general background.


Let's begin with an abstract map of a small piece of a strand of DNA. This is a region of fly DNA that encodes a gene called svb/ovo (I'll explain what that is in a moment). In this map, the transcribed portions of the DNA are shown as gray shaded blocks; what that means is that an enzyme called polymerase will bind to the DNA at the start of those blocks and make a copy in the form of RNA, which will then enter the cytoplasm of the cell and be translated into a protein, which does some work in the activities of that cell. So svb/ovo is a small piece of DNA which, in the normal course of events, will make a protein.

Most of the DNA here is not transcribed. Much of it is junk — changing the sequence of those areas has no effect on the protein, and has no effect on the appearance or function of the organism. Some of it, though, is regulatory DNA, and its sequence does matter. The white boxes labeled DG2, DG3, Z, A, E, and 7 are regions called enhancers — they are not translated into protein, but their sequence affects the expression of svb/ovo. One way to think of them is that they are small parking spots for other proteins that will bind to the DNA sequences in each enhancer. These protein/DNA complexes will then fold around to make a little landing zone for the polymerase, to encourage transcription of the svb/ovo gene. This is why this is called regulatory DNA: it doesn't actually make the svb/ovo protein itself, but it's important in controlling when and where and how much of the svb/ovo protein will be made.

Now for some jargon; sorry, but you have to know what it is to follow along in the literature. Those little white boxes of regulatory DNA are often called cis factors, because they have to be located on the same DNA molecule as the protein-coding gene in order to work. In general, when we're talking about cis factors, we're talking about non-coding regulatory DNA. The complement of that is the actual coding sequence, the little gray boxes in the diagram, and those are loosely lumped under the general name of trans factors, since their products are free to act away from the DNA that encodes them.

There is a bit of a debate going on about the relative importance of cis and trans mutations in evolution. Proponents of the cis perspective like to point out that cis mutations can be wonderfully subtle and specific; you can make a change in an enhancer and only modify the expression of the gene in one tissue, or even a small part of one tissue, while changing a trans factor causes changes in every tissue that uses that gene product. Also, most of the cis proponents are evo-devo people, scientists who study the small variations in timing and magnitude of gene expression that lead to differences in form, so of course the kinds of changes that affect the stuff we study must be the most important.

Proponents of the trans view can point out that small changes in the coding regions of genes can also produce subtle shifts in what the genes do, and that mutations can also produce very large effects. Those cis changes appear to be little tweaks, while trans changes can run the gamut from non-existent/weak to strong, and so have great power. They also like to point out that most of the data in the literature documents trans changes between species, and that a lot of the evo-devo stuff is speculative.

It's a somewhat silly debate, because we all know that both cis and trans effects are going to be found important in evolution, in different ways in different organisms, and that arguing about which is more important is kind of pointless — it will depend on which feature and which species you're looking at. But the debate is also useful as a goad to urge people to look more at the subtleties and ask more questions about those enhancers, as in the paper I'm about to describe.

What is this svb/ovo gene?


This is a drawing of just the back end of a fly larva, and what you should be able to see is that it's very hairy. Dorsally, there's a collection of small hairs called trichomes, and ventrally there are some thicker, stouter hairs called denticles. If you destroy the svb/ovo coding region, these hairs don't form: svb is an important gene for organizing and making hairs on the cuticle of the fly. Its name should make sense, since svb is short for shavenbaby. The gene is responsible for making hairs, but when you break it with a mutation you get embryos and larvae lacking those hairs, a shaven baby.

It also has the synonym of ovo, because it has another important function in the maturation of oocytes, something I'll skip over entirely. All you need to know is that svb/ovo is actually a large complex gene with multiple functions, and all we care about right now is its function of inducing hair development.

Now let's look at embryos of two different species of fruit flies, Drosophila melanogaster at the top, and Drosophila sechellia at the bottom. D. melanogaster is clearly hairier than D. sechellia. You might be wondering whether svb is the gene making the difference here and, if you're following the debate, whether this is a change in the trans coding region or the cis regulatory region.


One way to figure this out is to sequence and compare maps of the svb region in multiple fly species and ask where the actual molecular differences are. This isn't trivial: D. melanogaster and D. sechellia have been diverging for half a million years, and there have been lots of little changes all over the place, many of them expected to be neutral. What was done to narrow the search was to compare the sequences of five different Drosophila species with hairy embryos to the relatively naked D. sechellia, and ask which changes were unique to the less hairy form.
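The logic of that comparison can be sketched in a few lines of Python. This is only a toy illustration, not the authors' actual pipeline: the species labels and the short sequences are made up, the function name unique_to_focal is hypothetical, and it assumes the sequences have already been aligned to the same length, with "-" marking a deleted base.

```python
def unique_to_focal(alignment, focal):
    """Return (position, shared_base, focal_base) for every aligned column
    where all the other species agree and the focal species differs."""
    focal_seq = alignment[focal]
    others = [seq for name, seq in alignment.items() if name != focal]
    if any(len(seq) != len(focal_seq) for seq in others):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for pos, focal_base in enumerate(focal_seq):
        column = {seq[pos] for seq in others}
        # Only count columns where every non-focal species has the same base.
        if len(column) == 1:
            shared = next(iter(column))
            if shared != focal_base:
                hits.append((pos, shared, focal_base))
    return hits


# Invented toy alignment (NOT real svb sequence); "-" marks a deleted base.
toy_alignment = {
    "hairy_species_1": "ACGTACGTAC",
    "hairy_species_2": "ACGTACGTAC",
    "hairy_species_3": "ACGAACGTAC",
    "naked_species":   "ACCTAC-TAC",
}

changes = unique_to_focal(toy_alignment, "naked_species")
print(changes)
```

Position 3 is skipped in this toy example because the hairy species disagree among themselves there, so it says nothing specific about the naked one; that is the point of comparing several hairy species rather than just one.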

A hotspot lit up in the comparison: there is one region, about 500 base pairs long, in the enhancer labeled "E" in the diagram at the top of the page, which contained 13 substitutions and one deletion unique to D. sechellia, in 7 clusters. This is very suggestive, but not definitive; these are consistent differences, but we don't know yet whether these molecular differences cause the differences in hairiness. For that, we need an experiment.
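To picture what "in 7 clusters" means, here is a small Python sketch that groups nearby changes together. It is an assumption-laden illustration: I don't know the exact rule the authors used to define a cluster, so the distance threshold max_gap, the function name group_into_clusters, and the positions are all invented.

```python
def group_into_clusters(positions, max_gap):
    """Group sorted positions so that neighbors no more than
    max_gap base pairs apart land in the same cluster."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # close enough: extend the current cluster
        else:
            clusters.append([pos])     # too far away: start a new cluster
    return clusters


# Invented positions of changes within a hypothetical enhancer.
toy_positions = [12, 14, 15, 80, 83, 210, 400, 402, 405]
clusters = group_into_clusters(toy_positions, max_gap=10)
print(len(clusters), clusters)
```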

The experiment.

This is the cool part. The investigators built constructs containing the E enhancer coupled to the svb gene and a reporter tag, inserted those into fly embryos, and asked how they affected expression. In effect, they could put the D. sechellia enhancer into D. melanogaster, and the D. melanogaster enhancer into D. sechellia, and ask whether each was sufficient to drive the species-specific pattern of svb expression. The answer is yes, mostly: the resulting patterns weren't perfect copies of the native ones, suggesting that there are other elements that contribute to the pattern, but the D. sechellia enhancer produced reduced expression in whatever fly carried it, while the D. melanogaster enhancer produced greater expression.

But wait, there's more! The species differences were caused by differences in 7 clusters within the E enhancer. The authors built constructs in which the mutations in each of the 7 clusters were inserted independently, so they could test the mutational changes one cluster at a time. The answer here was that each of the 7 sets of mutations that led to the D. sechellia pattern had a similar effect, reducing very slightly the level of svb expression. Furthermore, they had a synergistic effect: the reduction in hairs when all 7 were present was not simply the sum of the individual effects of each one alone.
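If "not simply the sum" sounds abstract, here is the arithmetic as a Python sketch. The numbers are invented for illustration and are not the paper's data; the function name additivity_check is hypothetical. Each entry stands for the number of hairs lost when one cluster alone carries the D. sechellia changes.

```python
def additivity_check(single_effects, combined_effect):
    """Compare the observed combined effect with what you would
    expect if the individual effects simply added up."""
    expected = sum(single_effects)
    return {
        "expected_if_additive": expected,
        "observed": combined_effect,
        "deviation": combined_effect - expected,
    }


# Invented numbers: hairs lost per single-cluster construct, and
# hairs lost when all 7 clusters are changed together.
toy_single_effects = [3, 2, 4, 3, 2, 3, 3]
toy_combined_effect = 45

result = additivity_check(toy_single_effects, toy_combined_effect)
print(result)
```

A deviation of zero would mean the changes act independently; any other value means the clusters interact, so the whole differs from the sum of its parts.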

What does it all mean?

One conclusion of this work is that here is one more clear example of a significant morphological difference between species that was generated by molecular modification of cis regulatory elements. Hooray, one more data point in the cis/trans debate!

Another interesting observation is that this is a phenotype that was built up gradually, by a set of small changes to an enhancer element. D. sechellia gradually lost its trichome hairs by the accumulation of single-nucleotide changes in regulatory DNA, each of which contributed to the phenotype — a very Darwinian pattern of change.

By modifying the regulatory elements, evolution can generate distinct, focused variations. Knocking out the entirety of the svb gene is disastrous, not only removing hairs but also seriously affecting fertility. The little tweaks provided by changes to the enhancer region mean that morphology can be fine-tuned by chance and selection, without compromising essential functions like reproduction. In the case of these two species of flies, D. sechellia can have a functional reproductive system and the full machinery to make functional hairs, but at the same time can turn off dorsal trichomes while retaining ventral denticles.

It all fits with the idea that fundamental aspects of basic morphology are going to be defined, not by the raw materials used to build them, but by the regulation of timing and quantity of those gene products — that the rules of development are defined by the regulatory activity of genes, not entirely by the coding sequences themselves.

Frankel N, Erezyilmaz DF, McGregor AP, Wang S, Payre F, Stern DL (2011) Morphological evolution caused by many subtle-effect substitutions in regulatory DNA. Nature 474(7353):598-603.