Recently in Education Category

Phylogenomic Fallacies

This is the fourth in a series of articles for the general public focused on understanding how species are related and how genomic data is used in research. Today, we talk about some common fallacies in phylogenomics.

Where do humans fit on the evolutionary tree of life? This is an important topic in evolutionary biology. Many people believe humans are the most important and most highly evolved organisms, but in reality, all modern species are equally evolved. Our natural tendency to assume that humans are evolutionarily superior has led to a few misconceptions about phylogenetic trees.

plants.png

To understand the first misconception, let’s look at a phylogenetic tree of plants (from “The Amborella Genome and the Evolution of Flowering Plants”). Eudicots and monocots are two classes of flowering plants, or angiosperms, and the plants in black are non-flowering plants. The term “basal” refers to the base of a phylogenetic tree, and a basal group is one that branches off closer to that base. The authors chose to label the angiosperms that are not eudicots or monocots as “basal angiosperms.” But this label is arbitrary; all the angiosperms are equidistant from the common ancestor and thus equally evolved. We tend to give more weight to branches that contain the species of interest and call other branches basal, almost assigning them a lesser importance. In this case, the groups of interest contain many of the plants that humans eat; a species is often deemed more important the more closely it relates to humans. But modern species are equally evolved from a root common ancestor regardless of when their branch diverged from that ancestor. To avoid confusion, it might be best to eliminate the term “basal” altogether.

This type of thinking also leads us to place humans at the end of phylogenetic trees. However, this placement is arbitrary, and trees can be drawn in many equivalent ways. For example, compare two trees of primates that differ only in how the branches are rotated. The tree on the left, with humans at the top of the tree, is one you might see more often. But both of these trees are actually identical, and the relationships between species that can be inferred from the tree on the right are the same as the relationships in the tree on the left. Species at the tips of a tree are equidistant from the root common ancestor, so they can be considered evolutionarily equivalent.

primate tree 1.png

primate tree 2.png
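One way to convince yourself that rotating branches changes nothing is to describe a tree in a form that ignores drawing order. The short Python sketch below is only an illustration: the nested-tuple representation, the `canonical` helper, and the four-species example are assumptions made here, not something taken from the figures.

```python
# Hypothetical representation: a tree is either a species name (a tip)
# or a tuple of subtrees that descend from one common ancestor.

def canonical(tree):
    """Return a form of the tree that ignores the order of branches."""
    if isinstance(tree, str):
        return tree
    # A frozenset has no order, so rotating branches changes nothing.
    return frozenset(canonical(child) for child in tree)

# Humans drawn at one end of the tree ...
tree_a = ((("human", "chimpanzee"), "gorilla"), "orangutan")
# ... and the same relationships drawn with every branch rotated.
tree_b = ("orangutan", ("gorilla", ("chimpanzee", "human")))
# A tree that really is different: gorillas grouped with humans first.
tree_c = ((("human", "gorilla"), "chimpanzee"), "orangutan")

print(canonical(tree_a) == canonical(tree_b))  # rotation only
print(canonical(tree_a) == canonical(tree_c))  # different grouping
```

The first comparison treats the two drawings as the same tree, while the second does not, because there the grouping of species (not just the drawing order) has changed.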

Similarly, a common misconception is that humans evolved directly from monkeys. Monkeys, though, are modern species just like we are and have been evolving and changing over time. The common ancestor we share with monkeys may have looked very different from the monkeys of today. This assumption that modern species represent an ancestral state of human evolution is what T. Ryan Gregory calls the platypus fallacy. Gregory's example is that we can’t examine the traits of platypuses and conclude that humans possessed these same traits at some point in their evolution. We can no more infer the traits of human ancestor species from platypuses than platypuses can infer the traits of their ancestors from us.

Human-centered thinking is very prevalent in our society, affecting our laws, religions, and customs. While it probably influences all of us on a personal level, it can lead to false conclusions and misconceptions in science, like thinking that humans are the most highly evolved species. But all modern species are evolutionarily equivalent because they have been evolving for the same amount of time. Eliminating this fallacy will enable us to better understand the evolutionary process.

For more information on basal groups, check out: “Which side of the tree is more basal?”, Krell, Frank et al. Systematic Entomology (2004).

This series is supported by NSF Grant #DBI-1356548 to RA Cartwright.

Analyzing the Genome with Statistics

This is the third in a series of articles for the general public focused on understanding how species are related and how genomic data is used in research. Today, we talk about the challenges of using statistics to analyze phylogenomic data.

Suppose you were a door manufacturer trying to figure out the average height of a population living in a certain country. You might conduct an experiment where you ask a group of people to report their height. You would then assemble those measurements in a data set. But in order to study this data set and draw conclusions you would need to analyze it using statistics. For example, how tall should your door be in order to fit 95% of people in the country? How many people do you need to survey to accurately represent the total population? These questions can be answered with statistical analysis.
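As a rough illustration of the door question, here is a minimal Python sketch. The heights are made-up numbers, and the sketch assumes that heights follow a normal (bell-shaped) distribution, which is a simplification.

```python
from statistics import NormalDist, mean

# Made-up survey answers in centimeters, for illustration only.
heights_cm = [158, 162, 165, 168, 170, 171, 173, 175, 178, 181, 184, 190]

# Assumption: heights are roughly normally distributed.
model = NormalDist.from_samples(heights_cm)

# Height below which 95% of people are expected to fall.
door_height_cm = model.inv_cdf(0.95)

# Sanity check: the share of people expected to fit under that door.
share_fitting = model.cdf(door_height_cm)

print(f"average height: {mean(heights_cm):.1f} cm")
print(f"door height for 95% of people: {door_height_cm:.1f} cm")
```

With only twelve answers the estimate is shaky; surveying more people makes it more stable, which is the sample-size question raised above.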

Because acquiring data from experiments can be costly and time-consuming, we often use small data sets to represent a larger population of interest. In our height experiment, we would not be able to ask every single person in the country his or her height. We would choose a group of people under the assumption that they accurately reflect the population as a whole. However, when we are trying to map out the evolutionary history of organisms using data from sequenced genomes (phylogenomics, which we talked about last time), we need to change our method of analysis.

Let’s look at the treeshrew, for instance. It looks like a rodent but actually shares some internal similarities with primates, like brain anatomy and reproductive traits (studied by Sir Wilfrid Le Gros Clark in the 1920s). To figure out if the treeshrew is more similar to rodents or primates, we could sequence its genome and, using statistics, compare its genes to those of rodents and primates. But typical statistical models are based on subsets of populations, while by definition, genomic sequencing gives us a complete data set: all of the treeshrew’s genes. These typical models may not be suitable for interpreting genomic data.

The treeshrew. Source: Wikipedia

Before reaching a conclusion about the treeshrew, or any set of data, scientists must consider precision and accuracy. Multiple measurements of the same quantity are precise if they are similar to each other. Another way of saying this is that their variance is small. On the other hand, measurements are accurate if they are close to the true value of what they are trying to measure. For genomic data, we need better statistical tools to ensure that the accuracy of our conclusions matches the precision characteristic of these huge data sets.

Larger data sets provide more precise conclusions than smaller ones. For example, when we ask more people to report their height, we are more confident that our sample represents the variability of the actual population. Similarly, we analyze more genes in the treeshrew’s genome to increase our confidence that our conclusion is precise. However, our results might not necessarily be accurate; big data sets may lead us to draw incorrect conclusions with high confidence. The treeshrew’s genome contains some genes that are more similar to rodents’ genes and some that are more similar to primates’ genes (Fan et al., Nie et al., and Xu et al.), and with so much data we could find that the treeshrew is most similar to either group with high confidence. We need analysis tools that will tell us which genes give the correct answer.

Why are conclusions from data sometimes inaccurate? Statistical biases are external factors that produce consistent error in our measurements. Biases have many sources, including faulty experimental design, violation of assumptions made in analyzing the data, and errors in the data collection process. Bias in our height experiment might arise if we unintentionally ask the height of more women than men, causing our estimate of the average height to be lower. But in the case of phylogenomics, we are likely to have biases because of our relative lack of knowledge about the genome: we don’t always know which genes to analyze or the correct way to model the data. For example, some models assume that evolution followed the same pattern throughout all time, but this most likely was not the case.

Furthermore, the process of genome sequencing and analysis itself may create error, especially in the reconstruction of the genome and the alignment of genes for comparison. If we are comparing the genome of the treeshrew to the genomes of primates and rodents, it is difficult for us to know which genes are correlated between species when we are looking at a data set of billions of points. We might use a probability model to determine correlated genes, but all models are at least somewhat incorrect and introduce bias. In smaller data sets, biases are offset by a low precision and relatively small confidence in reaching conclusions. However, in genomic-size data sets, even small biases can be amplified and lead to high confidence in the wrong answer and incorrect phylogenetic trees.
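To see how a small bias can overwhelm a large data set, here is a minimal Python sketch that uses the height survey as a stand-in. All the numbers (a true average of 170 cm, a bias of 1 cm, a spread of 10 cm) are made up for illustration, and the interval is the usual textbook normal approximation, not a phylogenomic method.

```python
import random
from statistics import mean, stdev

TRUE_MEAN = 170.0  # assumed true average height in cm (made up)

def estimate(n, bias=-1.0, spread=10.0, seed=0):
    """Survey n people with a slightly biased method.

    Returns the estimated average and the half-width of an
    approximate 95% confidence interval around it.
    """
    rng = random.Random(seed)
    sample = [rng.gauss(TRUE_MEAN + bias, spread) for _ in range(n)]
    half_width = 1.96 * stdev(sample) / n ** 0.5
    return mean(sample), half_width

for n in (25, 100_000):
    est, half_width = estimate(n)
    contains_truth = abs(est - TRUE_MEAN) <= half_width
    print(f"n={n}: {est:.2f} +/- {half_width:.2f} cm, "
          f"interval contains true value: {contains_truth}")
```

With 25 people the interval is several centimeters wide, so a 1 cm bias is hard to tell apart from noise. With 100,000 people the interval shrinks to a small fraction of a centimeter around the biased value, so the survey reports the wrong answer with high confidence.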

When analyzing phylogenomic datasets, we need to use analyses that are appropriate for large data sets. This will unlock the potential of phylogenomic research to draw unbiased conclusions, like figuring out the correct phylogenetic classification of the treeshrew (still a topic of controversy among evolutionary biologists). However, phylogenomics is such a young field that these tools do not yet exist. When they are developed, we can increase our chances of correctly classifying species’ relationships and discovering the true history of evolution.

For more detail, check out: “Statistics and Truth in Phylogenomics”, Kumar, Sudhir et al. Molecular Biology and Evolution (2011).

References:

Fan, Yu, et al. “Genome of the Chinese tree shrew.” Nature communications 4 (2013): 1426.

Nie, Wenhui, et al. “Flying lemurs – The ‘flying tree shrews’? Molecular cytogenetic evidence for a Scandentia-Dermoptera sister clade.” BMC biology 6.1 (2008): 18.

Xu, Ling, et al. “Evaluating the Phylogenetic Position of Chinese Tree Shrew (Tupaia belangeri chinensis) Based on Complete Mitochondrial Genome: Implication for Using Tree Shrew as an Alternative Experimental Animal to Primates in Biomedical Research.” Journal of Genetics and Genomics 39.3 (2012): 131-137.

Our next installment will cover some misused terminology in phylogenomics. This series is supported by NSF Grant #DBI-1356548 to RA Cartwright.

… because it (gasp!) uses the word, “abortion.” But wait – there is a glimmer of hope: The new superintendent, who was ordered to offer a plan for redacting the textbooks, says that the books comply with the law already and instead plans to hold a public discussion.

Meanwhile, as a service to the affected high-school students, Rachel Maddow has posted the offending page on a blog, ArizonaHonorsBiology.com, which her show apparently owns. If you are curious or have a prurient interest, you may also see the verso of The Page, as well as several other pages on human reproduction.

For the record, the book is Reece, et al., Biology: Concepts and Connections.

Phylogenomics: Deciphering a Billion-Piece Puzzle

This is the second in a series of articles for the general public focused on understanding how species are related and how genomic data is used in research. Today, we talk about phylogenomics, the application of whole genome sequencing to understand evolutionary relationships among species.

DNA Chemical Structure. Source: Madeleine Price Ball

The haploid human genome is 3.2 billion DNA bases long, and each base can be one of four nucleotides: A, T, C, and G. Uncoiled, the DNA in a single human cell would be 2 meters long, and the DNA in a human body would stretch from the sun to Pluto multiple times. With 3.2 billion bases, each person’s genome is unique, and this plays an essential role in shaping our physical and mental individuality. However, despite being unique, human genomes are all very similar to one another, due to our shared ancestral heritage. Likewise, species that share a recent ancestral heritage have similar genomes, while species that are distantly related are likely to show significant differences in their genomes. This is why, as we discussed last week, evolutionary biologists compare traits and genes to determine the relationships of different species.

Unfortunately, some genes give us the wrong answer about how species are related. A section of a gene can be identical in two species due to independent mutations. After all, any given base can only mutate into one of three other bases, so the same mutation can happen twice by chance, or different mutations can produce the same sequence. Consider two species that are distantly related; one contains an AGA fragment, while the corresponding fragment in the other species is TGT, i.e., they differ in 2 out of 3 positions. As these species evolve, by chance the first species may experience a change in the first position such that AGA → TGA, and the second species may experience a change in the third position such that TGT → TGA. Now these two sequences look the same, so you might think the species share a recent common ancestor; however, it is only an accident of biology that they appear closely related. Because some fragments may be identical due to independent mutations and not shared ancestry, estimating species relationships using whole genomes is better than using just a few genes. The more information we have, the more likely we are to figure out species’ relationships correctly.
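The three-letter example can be traced step by step in a few lines of Python. This is only a sketch of the counting involved; the helper names `differences` and `mutate` are made up for this illustration.

```python
def differences(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def mutate(seq, position, base):
    """Return a copy of seq with one position changed to a new base."""
    return seq[:position] + base + seq[position + 1:]

species_1, species_2 = "AGA", "TGT"
before = differences(species_1, species_2)

# Two independent mutations, one in each species.
species_1 = mutate(species_1, 0, "T")  # first position changes to T
species_2 = mutate(species_2, 2, "A")  # third position changes to A
after = differences(species_1, species_2)

print(before, after)  # prints "2 0"
```

The fragments go from differing at two positions to differing at none, even though the species never shared those mutations through a common ancestor.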

The cost to sequence whole genomes has fallen from $100 million to $1000 in just the past twelve years. It now takes days to sequence a genome compared to the 13 years it took for the first human genome. The challenge now is not to obtain the data but to compare all the billions of base pairs in one genome to those in another. Current sequencing methods, while fast, can only read the genome by dividing it into millions of short fragments, which must be reassembled like an enormous puzzle. Researchers then have to figure out which genes correspond to one another in different species’ genomes. These comparisons are challenging because genes in one genome might be in a different order, on different chromosomes, or missing completely in another species’ genome.

Biologists are beginning to use genomic information to understand how species are related and measure how fast or slowly different genes evolve. This in turn allows us to understand how evolution happens. For example, using genomic information we can figure out how genes mutate, characterize and diagnose genetic diseases, and track harmful pathogens. But before that can happen, we need to address the difficulties of analyzing these large genomic datasets. You might think that more data is always better, but having a lot of data can lead us to have very high confidence in the wrong answer. In a pool of thousands of genes, we need to find the ones that tell us the right answer.

Next week, we’ll discuss statistical challenges associated with big data analysis, especially as it relates to phylogenomics. This series is supported by NSF Grant #DBI-1356548 to RA Cartwright.

The Family Tree of Life

In the next few weeks, we’ll be posting a series of articles for the general public focused on understanding how species are related and how genomic data is used in research. We start with a background on phylogenetic trees.

Imagine you could go back in time and meet your great grandmother or even your great-great-great-great-great grandmother, when they were your age. Would they look like you? Or would they look more like your siblings or cousins? Maybe you would all look a little different. Scientists try to figure out how the distant ancestors of apes, other animals, plants, and all organisms living today looked and behaved, much in the same way that people use a family tree to trace their ancestry.

primate-family-tree-780x520_0.gif

The common ancestor of great apes lived about 18 million years ago. Source: Smithsonian National Museum of Natural History http://humanorigins.si.edu/evidence/genetics

In evolutionary biology, scientists use a type of tree called a “phylogenetic tree” to organize the history of how species descended from common ancestors. The more recently two species shared a common ancestor on the phylogenetic tree, the more closely the two are related.

Take the phylogenetic tree of primates, for example. The common ancestor of apes lived about 18 million years ago. But over time, this one group branched off to form many different species, including humans, which have their own separate branch on this tree.

How did so many unique species develop from one ancestor? New branches formed by a process known as divergence. When groups of ancient organisms became geographically isolated from one another, either through migration or geologic events like earthquakes, each group began to develop its own unique set of physical attributes. Sometimes, by chance, a change in a characteristic enabled an individual to survive better in its environment and produce more offspring.

Perhaps individuals in one group with larger arms were better able to break open the hard-shelled fruits that were common in one region, while some individuals in another group had the ability to travel more easily through tall trees that offered protection from predators. Whatever the reason may have been, selection favored genetic differences that improved survival. Over time, this gradual process of isolation and selection produced distinct species, which in turn branched into more species.

The end result of divergence is many species, related in a tree-like fashion, and we display these relationships using phylogenetic trees. Scientists now use increasingly sophisticated methods to determine how species are related and to build phylogenetic trees. In the past, scientists built these trees simply by comparing physical traits, like how many limbs an organism has or whether it has a tail. But with the recent surge in fast and affordable gene sequencing technologies, researchers today can directly compare species’ DNA to determine how they are related.

But analyzing entire genomes, with billions of DNA base pairs, presents its own unique set of challenges, and researchers often struggle to determine if the DNA differences they find between species are truly significant or are simply due to common variability. As computer software and statistical analysis become more adept at handling these challenges, our understanding of species’ relationships could change — providing exciting new insights into our family tree of life.

Check back next week when we discuss the differences between studying small and large datasets, and the challenges associated with big data analysis. This series is supported by NSF Grant #DBI-1356548 to RA Cartwright.

This cartoon

christian-unity-cartoon_600.jpeg

Ken Ham, The Lie: Evolution, illustrations by Steve Cardno (Master Books, 1987). See also here.

was shown as part of an otherwise innocuous PowerPoint presentation to a freshman biology class at Grady High School in Atlanta and caused a bit of a flap, according to The Atlanta Journal-Constitution.

The teacher and the district science coordinator apparently refused interviews, but the student newspaper reports, based on interviews with students, that the teacher did not teach evolution and seemed to favor creationism.

I occasionally get books for review unsolicited, and many of them are not worth noticing. However, Kostas Kampourakis' Understanding Evolution is a wonderful resource for students of all kinds, including biology students.

9781107034914.jpg

NCSE webinar, “Talking to the media about science education,” tomorrow, February 27, at 11:00 PST. You may register here or view the webinar, along with earlier webinars, here.

According to NCSE’s announcement,

The panel will include: Robert Luhn, Director of Communications for NCSE; Liz Craig, a freelance writer and board member with Kansas Citizens for Science, and David Wescott, director of digital strategy at APCO Worldwide. Luhn leads NCSE’s media outreach efforts, and has been a journalist for 40 years for technology, environmental, and medical publications. Craig led KCFS’s media strategy through the 1999 and 2005 battles over creationism before the state board of education and is a freelance writer covering a range of topics. Wescott, formerly a staffer for Sen. Kennedy, develops and implements online outreach strategies on topics including education, science, and the environment for an international clientele. Moderator Josh Rosenau is a programs and policy director at NCSE.

An AFP press release the other day noted that 1 in 4 Americans does not know that the earth revolves around the sun, according to a poll of 2200 people conducted by the National Science Foundation. Additionally, approximately half do not know “that human beings evolved from earlier species of animals” – or, perhaps more precisely, will not admit it. The average score on the 9-question quiz was 6.5. Americans nevertheless remain “enthusiastic” about science. The survey is part of a report that NSF will submit to the President. I could not immediately find any further information.

That is the title of a Slate article by Zack Kopplin. But actually it is much worse (see also NCSE’s take here). Here are the first 3 paragraphs of Kopplin’s article.

NCSE has just announced the second webinar in its ongoing series, to be held on December 18, 2013, at 1:00 p.m. PST. The webinar will focus on “[s]topping bad legislation and encouraging policymakers to support strong science education…,” according to NCSE.

The webinar will be led by Josh Rosenau, Programs and Policy Director for NCSE; Vic Hutchison, professor emeritus at the University of Oklahoma, and founder and past president of Oklahomans for Excellence in Science Education; and Dena Sher, legislative counsel at the ACLU’s national office. You may register for the webinar here.

We reported on NCSE’s earlier webinar here.

Here’s some reading material for you: A new article by Chris Mooney, posted at Mother Jones, argues that we have certain psychological dispositions that make it easier for us to accept religion than evolution. Larry Moran was not impressed with the article. Neither was Jerry Coyne. But I think the article was a bit better than they suggest, and I make my case in this post over at EvolutionBlog. Comments may be left there. Enjoy!

The National Center for Science Education has just announced a webinar on what to do when science comes under attack. Details below the fold.

Genie Scott has announced her retirement, and Ann Reid will take over as new Executive Director of the National Center for Science Education. Congratulations to both Dr. Scott and Dr. Reid! Dr. Reid is a research scientist whose team sequenced the 1918 influenza virus at the Armed Forces Institute of Pathology. One colleague credited her with the additional ability to herd cats. See the NCSE press release here.

A recent Gallup poll concluded that Americans consistently rate math the most valuable subject they took in school, ahead of English, science, and history. Specifically, 34 % of those polled in both 2002 and 2013 rated math the most important subject. English, meaning English, reading, and literature, came in second, with 21 % in 2013 rating English the most important. Between 2002 and 2013, incidentally, science jumped from 4 % to 12 %. Figure 1 shows Gallup’s results for 2002 and 2013 in graphical form.

GallupPoll2013_1.jpg

Figure 1. Percentage of responses to Gallup polls taken in 2002 and 2013. Mathematics held firm at 34 %, whereas science increased from 4 to 12 % at the expense of English, reading, and literature.

By Brianne Fagan.

This column by Brianne Fagan, a senior majoring in chemical engineering at the Colorado School of Mines, is a response to a recent New York Times column on women in science. It was prepared as part of a class on Explorations in Science, Technology, and Society. The class is co-taught by physics professor Lincoln Carr and Toni Lefton of the Liberal Arts and International Studies department. The course is offered through the McBride Honors Program.

During a class presentation about Kate Kirby, one of my peers brought up some statistics about girls in math and science while sharing her own motivations for pursuing environmental engineering. During a related discussion, the two female Physics students both discussed their mostly positive experiences as women in the Physics Department at the Colorado School of Mines. The question always seems to remain, though: Why do so few girls pursue degrees in STEM (science, technology, engineering, and math) fields?

Well, I was perusing the articles on my New York Times app this morning, and what did I find?
