Y and mtDNA are not Adam and Eve: Part 3 - Resolving a discrepancy


Part 1: Y and mtDNA are not Adam and Eve.
Part 2: What it means to be the Most Recent Common Ancestor (MRCA)

On to part 3. Except, what’s this? Someone has beaten me to it? Gasp!

Okay, go read Dienekes’ Anthropology blog post about the two recent Y papers. I agree with all of the critiques and summaries of both the Poznik et al. (2013) and the Francalacci et al. (2013) papers. Perhaps the best part of this summary:

“And, indeed, the fact that the two are of different ages is not particularly troubling or in need of remedy, since for most reasonable models of human origins we do not expect them to be of the same age.”

But, let’s see if I can provide a little more background (you did go read Dienekes’ post, right?). Good. But, just in case you didn’t, a brief summary of some of the findings of Poznik et al. (2013):


Poznik et al. sequenced a lot of Y chromosomes
Poznik et al. sequenced the Y chromosomes of 69 people. Yes, this is more than enough individuals to address this question of the time to the MRCA. Very few lineages, even two, can allow us to estimate the time to the most recent common ancestor. Each lineage will contain information (mutations) that has accumulated since the lineages diverged. But comparing closely related lineages will lead to a younger TMRCA, while comparing very divergent lineages will lead to an older TMRCA. You can learn a lot about the process of how the Y chromosomes diverged by having the intermediate lineages, but they aren’t necessary for computing the TMRCA.

In the pictures below, the red dots are observed mutations.

If there aren’t very many mutations between two regions, then their TMRCA will not be very long ago:

Comparing two closely related lineages gives a younger TMRCA



If more Y chromosomes are analyzed from very different regions, then more mutations will be observed between any pair of lineages, and the TMRCA will be older.

The more diverged the lineages, the older the TMRCA



But, for just estimating the TMRCA, the total number of lineages will have very little (if any) effect on the age estimated, whereas the number of differences observed on the most diverged Y chromosome will be very important.

For estimating TMRCA, the most diverged lineage (Y5) will have the biggest effect.



How to estimate the time to the most recent common ancestor (TMRCA)
The time estimated depends very little on the number of lineages (see above). Rather, it is extremely dependent on:
1) how different the lineages are from one another (how many mutations are observed); and,
2) how quickly those differences are estimated to have accumulated (the rate of mutation).
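
As a rough illustration of how those two quantities combine, here is a minimal sketch for a single pair of lineages. It assumes a strict molecular clock and that mutations accumulate on both branches since the common ancestor, so the expected number of differences is 2 × rate × sites × time. The function name and all of the numbers are made up for illustration; they are not values from Poznik et al.

```python
def pairwise_tmrca(num_differences, sites_compared, rate_per_site_per_year):
    """Estimate years back to the common ancestor of two lineages.

    Assumes a strict clock: differences accumulate on both branches,
    so expected differences = 2 * rate * sites * time.
    """
    if sites_compared <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("sites and rate must be positive")
    return num_differences / (2 * rate_per_site_per_year * sites_compared)


# Hypothetical numbers, chosen only to show the two dependencies.
close_pair = pairwise_tmrca(100, 10_000_000, 1e-9)       # few differences
distant_pair = pairwise_tmrca(1000, 10_000_000, 1e-9)    # many differences
slower_clock = pairwise_tmrca(1000, 10_000_000, 0.5e-9)  # same data, half the rate

print(close_pair, distant_pair, slower_clock)
```

Ten times as many differences gives ten times the age, and halving the assumed mutation rate doubles it. The number of lineages sampled does not appear in the calculation at all.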

The Y TMRCA is older than most previous estimates
The time to the most recent common ancestor of the Y chromosome, as computed by Poznik et al. is older than most other estimates. But why?
- The diversity of the Y chromosomes included (the more diverse the Y chromosomes, the older the time to their most recent common ancestor).
- The high sequencing coverage, which means that more mutations can be identified.
- The rate of mutation the authors use, which is 0.82 × 10^-9 mutations per base pair per year (95% CI: 0.72-0.92 × 10^-9 mutations per base pair per year). This rate is lower than estimates from a Y-linked pedigree (1.0 × 10^-9 mut/bp/year) and from human-chimpanzee divergence, and a lower rate lengthens the tree compared to previous estimates. The mutation rate was calibrated assuming that humans reached the Americas ~15,000 years ago. The exact timing of the entry of modern humans into the Americas is not yet certain.

The Y TMRCA is not as old as it could be
Mendez et al. (2013) recently described a Y chromosome that is much older (67% older, with a 95% CI of 35-126%) than all other known Y chromosomes. This Y chromosome has not yet been sequenced to the coverage of the Y chromosomes in the Poznik et al. (2013) paper, and was not included in their analysis. If it were included, all other factors remaining the same, the TMRCA for the Y chromosome would be much older than the TMRCA for the mtDNA in the same paper.

The mtDNA estimate is younger than many other estimates
Although there has been a lot of discussion of the Y chromosome being older than previous estimates, I haven’t seen a lot of discussion about the mtDNA, which at 99,000-148,000 years in this analysis is estimated to be a bit younger than in previous work (~200,000 years ago). Part of this younger estimate can be attributed to the calibrated mutation rate used. The authors compute a calibrated mtDNA mutation rate of 2.3 × 10^-8 mutations per base pair per year (95% CI: 2.0-2.5 × 10^-8 mut/bp/year), which is higher than some previous estimates (e.g., 1.7 × 10^-8), meaning the total tree will be somewhat shorter than previous estimates, all else being equal.
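
To make the "all else being equal" part concrete, here is a small sketch of how a date shifts when only the assumed mutation rate changes. It assumes the observed divergence is held fixed, so that time scales as the ratio of the two rates. The function name is mine, and the 200,000-year starting point is just the round figure mentioned above, not a value tied to any one study.

```python
def rescale_tmrca(tmrca_years, old_rate, new_rate):
    """Rescale a TMRCA to a different mutation rate, all else being equal.

    The observed divergence is proportional to rate * time, so for the
    same divergence the time scales as old_rate / new_rate.
    """
    if old_rate <= 0 or new_rate <= 0:
        raise ValueError("rates must be positive")
    return tmrca_years * old_rate / new_rate


# Round figures from the text: ~200,000 years at 1.7e-8, re-dated at 2.3e-8.
rescaled = rescale_tmrca(200_000, 1.7e-8, 2.3e-8)
print(round(rescaled))
```

A faster clock shrinks the date by the factor 1.7/2.3, or roughly a quarter. This is only the rate effect in isolation; differences in samples and methods between studies are not captured here.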

I am excited to see if there exist pockets of mtDNA diversity, such as the highly divergent Y lineage that was recently identified.

So, what is the right mutation rate?
If the mutation rate used across studies varies so much, then it is no surprise that the TMRCA estimates are not consistent across studies. Which one is correct? Well, of course it is <insert your favorite study>. Okay, so the real answer is that it is not so simple. I know, I know, not the answer you were looking for. It’s like when you have a multiple choice question with four answers and you have to choose the one that is most correct. I never did well on those. I’ll dodge this bullet by pointing you to a wonderful discussion about human mutation rates by John Hawks.

It is exciting, though, that with the recent ability to isolate and sequence DNA from ancient samples, we should start getting more precise and accurate estimates of the human mutation rate on the different chromosomes.

One more thing - there is no reason to expect the TMRCA for the Y and mtDNA to be the same.
The process of working backwards to estimate the time to the most recent common ancestor is a paring down of lineages until only one lineage remains. This is called coalescent theory. Because they lack recombination, both the Y and the mtDNA represent a single lineage, a single coalescent process going back in time. Any number of events could have happened that resulted in a set of mtDNA or Y chromosome lineages being retained longer or shorter than expected. The TMRCA is only the time to the *most* recent common ancestor. There were other ancestors, but we can only identify the most recent. And there are a myriad of reasons why these might not necessarily date to the same time for the Y and mtDNA.

But, why don’t we expect the TMRCA to be the same?
To be clear, it is not that we expect them to be different. More that we don’t expect them to be the same.

I’m going to make a gross over-simplification (we can do more math in the comments, if you like). But, bear with me. Let’s say that you had two dice. If you roll each die once, just once, would you be very surprised if the numbers didn’t match up? No, not at all. Likewise, you wouldn’t be shocked if, say, each die showed a six. And, if one die showed a two, while the other showed a six, you probably wouldn’t call it a discrepancy. Why? Because you only rolled them once.

Similarly (although with a bit more math), when tracing back the Y common ancestor and the mtDNA common ancestor, we should not be surprised if their TMRCAs are different, nor if they overlap.

They represent only one roll of the dice.

————————————-

Science. 2013 Aug 2;341(6145):562-5. doi: 10.1126/science.1237619.

Sequencing Y chromosomes resolves discrepancy in time to common ancestor of males versus females.


18 Comments

Another reason why the mitochondrial and Y chromosome dates are not expected to match is the effective population size for each marker. This parameter will affect the time to coalescence. If males leave more offspring per individual on average than females, then the effective population size would be larger and the time to coalescence would probably be greater. And of course the nuclear DNA would have a different effective population size and recombination as well.

Sorry if this was covered previously.

No, this is a great point! It is one that I am still struggling with in the Poznik et al. paper.

Under neutral (no selection), equilibrium (equal reproductive success) expectations, the Y and mtDNA should both have an effective population size that is one-quarter of the autosomal size. Poznik et al. don’t analyze the autosomes or X chromosomes, so we don’t have a reference, but they do provide estimates that suggest the effective population size of the Y is much smaller (33-48%) than the effective population size of the mtDNA. Intuitively, this should have resulted in estimates of the TMRCA that are much smaller for the Y than for the mtDNA. However, if the Y has pockets of longstanding diversity, as the Mendez paper suggests, and as those highly diverged A Y-haplogroups suggest, then perhaps it is less surprising. Also, as I mentioned, both the Y and mtDNA represent only a single coalescent, so it can happen that different (I guess even wildly different) effective population sizes can have similar TMRCAs.

Still, it is counterintuitive to me that the effective population sizes are so different, but the coalescence times so similar.

M. Wilson Sayres said: Still, it is counterintuitive to me that the effective population sizes are so different, but the coalescence times so similar.

The standard deviation in the time to the most recent common ancestor is just over half the mean time, so large fluctuations are to be expected. That’s for a constant-sized population; for a structured population, which the new Y chromosome findings (among other evidence) suggest, the variance should be even larger.
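
For readers who want to check the "just over half the mean" figure, here is a minimal sketch under the standard neutral coalescent for a constant-sized population. Time is measured in units where a single pair of lineages coalesces at rate 1, and the waiting time while k lineages remain is taken to be exponential with rate k(k-1)/2. The function name is mine, and the sample sizes are only examples (69 matches the number of Y chromosomes in Poznik et al.).

```python
import math


def tmrca_mean_and_sd(sample_size):
    """Mean and SD of the TMRCA under the standard neutral coalescent.

    Time is in units where two lineages coalesce at rate 1. While k
    lineages remain, the waiting time is exponential with rate k*(k-1)/2,
    and the successive waiting times are independent.
    """
    mean = 0.0
    variance = 0.0
    for k in range(2, sample_size + 1):
        rate = k * (k - 1) / 2
        mean += 1 / rate          # mean of an exponential waiting time
        variance += 1 / rate**2   # variance of an exponential waiting time
    return mean, math.sqrt(variance)


for n in (2, 10, 69):
    mean, sd = tmrca_mean_and_sd(n)
    print(n, round(mean, 3), round(sd / mean, 3))
```

For a sample of two the standard deviation equals the mean, and for larger samples the ratio settles a little above one half, so a single realized TMRCA can sit far from its expectation.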

Thanks for the added clarification, Steve.

Am I correct that one of these papers sampled only people from Sardinia while the other sampled overwhelmingly Eurasians? If you’re looking for the coalescent of all extant Y chromosomes, shouldn’t you be hitting Africa much more than any other area?

Likewise, there seems to be some attempted linkage between the coalescent and the origin of H. sapiens, which makes no sense. Or am I imagining that?

DS said:

If males leave more offspring per individual on average than females, then the effective population size would be larger and the time to coalescence would probably be greater.

Um, if each newborn has one male and one female parent, then the total contribution of males to the next generation is *exactly* as large as the total contribution of females. But if there are more males than females, the contribution of males per male is correspondingly smaller.

However the variance in the number of offspring they leave could be different, most probably higher in males. And that also means that the Y chromosome effective population size would be smaller than the mitochondrial effective population size.

However a big effect on the time to coalescence of mitochondria versus the time to coalescence of Y chromosomes will just be the randomness of these times caused by random ancestry of gene copies. If we have two copies of a gene that is on mitochondria and ask how far back to their common ancestor copy, the math is the same as tossing a coin with heads probability about 1 in 10,000. It will take about 10,000 tosses. Similarly for two copies of a gene on the Y chromosome.

So even if both effective population sizes are precisely the same, the chance that the number of tosses until the two mitochondrial copies coalesce (let’s say happens to take 13,262 tosses) is *exactly* the same as the number of tosses until two Y copies coalesce is very very small.

Things are more complicated because there are more than 2 copies of each gene in the samples in these studies, and also because males often have a longer generation time than females. Also geographic structure (subdivision of humans into populations with restricted gene flow) causes more complications.
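
The coin-tossing picture above can be put into numbers with a short sketch. It assumes two independent coins with the same heads probability, one standing in for the pair of mitochondrial copies and one for the pair of Y copies, and it uses the geometric distribution for the number of tosses until the first head. The function names are mine.

```python
def expected_tosses(p_heads):
    """Mean number of tosses until the first head (geometric distribution)."""
    return 1 / p_heads


def prob_same_toss(p_heads):
    """Chance that two independent coins first show heads on the same toss.

    This is the sum over t of [p * (1 - p)**(t - 1)]**2, a geometric
    series that simplifies to p / (2 - p).
    """
    return p_heads / (2 - p_heads)


p = 1 / 10_000
print(expected_tosses(p))   # average wait, in tosses (generations)
print(prob_same_toss(p))    # chance the two waits match exactly
```

With a heads probability of 1 in 10,000, the chance that the two waits land on exactly the same toss is about 1 in 20,000, even though the two coins are identical.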

Joe Felsenstein said:

DS said:

If males leave more offspring per individual on average than females, then the effective population size would be larger and the time to coalescence would probably be greater.

Um, if each newborn has one male and one female parent, then the total contribution of males to the next generation is *exactly* as large as the total contribution of females. But if there are more males than females, the contribution of males per male is correspondingly smaller.

Let’s say you have a population of 10,000 adults, with 5,000 males and females each.

Now let’s say there’s a war in which 4,000 of the males are killed, but none of the females. The remaining males take four wives each. Each polyamorous family has 4 kids (that live to adulthood), one by each female.

DS’ condition will now be fulfilled.

That’s a toy example. I have no doubt you are right, things are more genetically complicated. But it is relatively easy to see how, at least the way humans tend to act, your average male surviving to procreate could leave more offspring than your average female.
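
The arithmetic of the toy example can be written out as a short sketch. The numbers are the ones given above; the variable names are mine, and I am assuming that the females who are not wives leave no children.

```python
males_before = 5_000
females = 5_000
males_killed = 4_000
wives_per_male = 4
children_per_wife = 1

surviving_males = males_before - males_killed
mothers = surviving_males * wives_per_male
children = mothers * children_per_wife

# Every child has one father and one mother, so both sexes
# contribute the same total number of children.
per_surviving_male = children / surviving_males
per_female = children / females
per_original_male = children / males_before

print(per_surviving_male, per_female, per_original_male)
```

Each surviving male averages four children and each female averages less than one, yet the total contributed by fathers and by mothers is the same 4,000 children. Averaged over all 5,000 original males, the per-male figure is the same as the per-female figure.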

Yes, in that case there are more females than males, so the average contribution of males is larger, as I did in fact say.

John Harshman said:

Am I correct that one of these papers sampled only people from Sardinia while the other sampled overwhelmingly Eurasians? If you’re looking for the coalescent of all extant Y chromosomes, shouldn’t you be hitting Africa much more than any other area?

This is exactly why including the recently identified African American Y chromosome that is similar to those found in a small region in Cameroon would have resulted in an older TMRCA for the Y.

Both papers do include a few divergent A lineages (and all you need is one to push the TMRCA back), but you are correct that the matter is far from settled. It is likely there are other pockets of Y (and perhaps also mtDNA) diversity waiting to be discovered.

Joe Felsenstein said:

Yes, in that case there are more females than males, so the average contribution of males is larger, as I did in fact say.

Well then, that must be what I meant.

Thanks Joe.

M. Wilson Sayres said: Both papers do include a few divergent A lineages (and all you need is one to push the TMRCA back), but you are correct that the matter is far from settled. It is likely there are other pockets of Y (and perhaps also mtDNA) diversity waiting to be discovered.

Well, now, Africa has been fairly well sampled for mt lineages. I’m just amazed that one would try to find the Y coalescent by sampling just one small island in the Mediterranean, no matter how intensively. Given a complex history of invasions, it’s better than sampling, say, Skye. But not by enough.

Sorry for being dismissive.

Actually, to quibble with you and with myself, if many males are killed off, that does not change the sex ratio (as we measure that at the end of parental care). But it does increase the variance of offspring number, which in turn reduces the effective population size. Many males have zero offspring, and the rest have more, so the variance is thereby increased.
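
One way to see the variance effect is to count, for a single generation, how often two offspring picked at random share a parent, and to take the reciprocal of that chance as a single-generation effective size for a uniparentally inherited marker such as the Y. That is a simplifying assumption on my part, and the function names and the two sets of offspring counts below are hypothetical, chosen so that both have the same number of fathers and the same number of sons.

```python
def coalescence_probability(offspring_counts):
    """Chance that two offspring drawn at random share the same parent."""
    total = sum(offspring_counts)
    if total < 2:
        raise ValueError("need at least two offspring")
    same_parent_pairs = sum(k * (k - 1) for k in offspring_counts)
    return same_parent_pairs / (total * (total - 1))


def effective_size(offspring_counts):
    """Single-generation effective size: 1 / P(two offspring share a parent)."""
    return 1 / coalescence_probability(offspring_counts)


# Hypothetical fathers: 5,000 men and 5,000 sons in both scenarios.
even = [0] * 2_500 + [2] * 2_500     # half leave no sons, half leave two
skewed = [0] * 4_000 + [5] * 1_000   # most leave none, a few leave five

print(effective_size(even))
print(effective_size(skewed))
```

The head counts are identical in the two scenarios, but concentrating the sons among fewer fathers raises the variance in offspring number and cuts the effective size to about a quarter.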

The net effect is the same but the terminology is different.

John Harshman said:

Well, now, Africa has been fairly well sampled for mt lineages. I’m just amazed that one would try to find the Y coalescent by sampling just one small island in the Mediterranean, no matter how intensively. Given a complex history of invasions, it’s better than sampling, say, Skye. But not by enough.

Don’t quote me, but for both papers, I don’t think the original intent was to estimate a new TMRCA - these are just things that make big headlines. One of the less cited findings from Poznik et al. (that I may get to in another follow up) is that they identify a single mutation that resolves a previous polytomy (unresolved branching) of some Y lineages that appear to have rapidly diverged from one another.

Similarly, Francalacci et al. (2013) use the closely related lineages to learn about the relationship of Y chromosomes (as a proxy for the ancestral movements of the populations in which they are found).

I’m still saving a “big” question, but in the meantime I have a “little” one that confuses me. Who exactly expects that the age of the last common ancestor via strictly female, and strictly male, lineages must be the same (plus or minus a lifetime)? Committed Biblical literalists will not accept any evidence, and the anti-evolution activists who exploit them will look only for evidence that they can take out of context to further mislead them (anything that’s “more recent than previously thought” particularly makes them ecstatic). As far as I can tell, everyone else will react with “sure, that makes sense.”

As far as I can tell, the primary objective of this work, seeking the LCA of genes on the Y chromosome and on mitochondrial genomes, is to supplement our understanding of human prehistorical demographics.

It also seems to be designed to give a nod to social genealogy, which traces exact individual ancestry, even though this aspect may actually serve to fuel confusion.

Beyond the fact that there must have been some point at which nothing we would call life existed on earth, and some earliest point at which there was something that any reasonable person would call life on earth, even though we can never know exactly when that was, all tracing of genetic lineages is definition-dependent. In fact, even saying that genes originated with life is definition-dependent.

“Y-chromosome Adam” obviously had a father. He got his Y chromosomes from his father. The ones that went into Adam’s particular sperm that fertilized ova must have differed, at least on the loci studied, either due to germline (in Adam) or somatic mutation, in some way, from his father’s germline Y-chromosome (I’m using single generation language here to make the point obvious, please bear with me, we all realize that our studies don’t have this degree of precision, but this language makes the point more clear). Otherwise, we’d be tracing things back to his father, not to him.

However, the Y chromosome itself has ancestors and so on. There was never a time when the human population of earth was two, nor, likely, when the reproducing human male population was one. If we follow individual polymorphisms they do trace back to founders. The more narrowly we specify the polymorphism we study, the more recent the common ancestry we will discover. If we try to trace back the cytochrome C gene, broadly defined, we’ll say that the LCA of all life with cytochrome C was a very ancient unicellular organism. If we take some specific non-coding neutral polymorphism in the gene that occurs in some island population, we might be able to trace it back to a recent historical ancestor with ease.

Melissa: related to Y-chromosome Bottleneck Guy, I am having a hard time with this sentence from Neil Shubin’s latest (The Universe Within), p 175: “DNA of Native Americans reveals that they are derived from a single male who likely crossed the Bering Strait when an ice bridge formed during the last ice age.” Why could not Y-cBG for the Amerindians be somewhere further back in the Asian population, perhaps much further back? Is it really likely that Y-cBG himself crossed the Strait, and how could we know this?

The “Y-cBG” label doesn’t stay affixed to the same person indefinitely. If all but one of the Y chromosomes that crossed into the Americas have since ceased to have living descendants, then that one (or possibly one of its descendants) becomes the holder of that title. (That’s if I’m properly understanding what “Y-cBG” means.)

(Of course, the holders of other Y chromosomes might still have living descendants in lineages that contain one or more females; so their other chromosomes might still have descendants.)

Yep, what Henry said.

There was a group of people (the number can be estimated from the amount of autosomal diversity) who entered the Americas.

The one Y (if indeed it is only one Y) that is the common ancestor of all the other Y’s was not the only Y chromosome at the time, and the person who housed this Y chromosome also had 22 pairs of autosomes and an X chromosome that mixed with the autosomes and X chromosomes from the other people around at the time.

About this Entry

This page contains a single entry by M. Wilson Sayres published on August 23, 2013 1:03 AM.
