Monday, July 13, 2009

Whoa, I sort of understood something...

After giving us a presentation on Trait Mapping, Todd Vision suggested that we check out a report in Science ("Genomic Footprints of a Cryptic Plastid Endosymbiosis in Diatoms", Ahmed Moustafa, et al., 324, 26 June 2009). Ever since I saw a post about "how to learn about everything" I've become a little more open to just diving in and attempting to read papers. I thought this was probably a good candidate.

The most interesting thing to me was seeing how the authors structured their arguments, using two fully mapped genomes (one from a green alga and the other from a red alga) to build gene trees and then pulling several hypotheses out of the results. To my twisted, computer-influenced mind, it was like they were decompiling the Diatom and attempting to build possible explanations for what they were seeing in the current genetic code. I have been thinking about DNA analysis as working backwards from executable code for a while (probably since I ran across this blunt viewpoint).

I still do not fully understand why the Endosymbiotic Gene Transfer (EGT) events were more likely than a missing common ancestor. They made a great argument for why that was quite unlikely, but I did not follow it (I really am lacking in my science background). Lateral Gene Transfer (LGT) events are new to me, too - I did not realize that happened. I do remember one late evening in Durham hanging out with folks working on AFTOL (Assembling the Fungal Tree of Life) at Duke. They asked what tools were used on BTOL (assembling the Beetle Tree of Life), and commented that we did not have to deal with host/parasite detection (since a beetle living inside another beetle is quite unlikely, but fungi living inside other fungi does happen). It was then that I realized that fungi and plants were a far more complex area (and it did not dawn on me then that this might result in host & parasite swapping genes).

I think Tal Dagan and William Martin's discussion ("Seeing Green and Red in Diatom Genomes", Science, 324, 26 June 2009) helped me realize the importance of Moustafa, et al.'s report. After reading Moustafa, et al., I had not realized that the plastid from the red algal EGT event had completely replaced what they believe was a plastid supplied by an earlier green algal EGT event, which they argue explains the current Diatoms' make-up. But both articles made it clear that having 16% of the Diatoms' genes come from green algae was unusual.

Having become somewhat comfortable with simple phylogeny concepts and terms, delving into these two articles was a bit of an adventure. I remember tuning out in Biology class (I thought the Latin names for organisms showed how unlikely it was that I'd need to know the material - after all, I was Catholic and the church realized that Latin was silly and mass was no longer in Latin...so what was up with Vombatidae?!?! Or Canis lupus?!?!). Phylum, Kingdom, huh? I never expected to end up using the information from Biology class. I was naturally interested in Science from an aerospace perspective, being the son of an engineer working for a commercial aviation company. I loved airplanes and NASA stuff. It seems it is never too late to get back into biology and science.

Tuesday, July 7, 2009

Modernizing the Tree of Life

I am in the process of reading papers & articles on the topic of Phylogenetics to acquire domain knowledge. I started with a topic I am familiar with: the Tree of Life.

Modernizing the Tree of Life
Elizabeth Pennisi, et al.
Science 300 1692 (2003);
DOI: 10.1126/science.300.5626.1692

I am not sure how most biologists & systematists feel today, but I am guessing the idea suggested here was something big in 2003. What idea? That all of life could be "barcoded" using one mitochondrial gene (cytochrome oxidase I). The article, in general, talks a bit about the field and how larger projects are going to radically change the landscape (this article was published the same year that CIPRES started). And this effort, barcoding organisms, could help move us toward a time when all of life could be more easily & quickly identified with a small, Star Trek-like "bio-coder" device (something also suggested in David Hillis' Discovering Life on Earth talk at Villanova University). I doubt the relevance of this concept now, having seen my former supervisor/mentor, David Maddison, practice a talk arguing that a specific group of beetles (descendants of Bembidion, which he studies) could not be easily classified using any single gene - let alone the mitochondrial gene suggested. I do find it interesting that there is an effort to strive for a unique identifier for organisms. Given that some researchers estimate there are 10 million undescribed species out there, it would seem that a hashing approach might be more feasible. Take, say, 5 or 10 genes - create a hashcode from them and then choose your favorite way to resolve collisions (or do nothing - just add them to a list). That is just a wild & naive guess - but it seems like barcoding is a pipe dream.
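To make that hashing guess a bit more concrete, here is a rough Python sketch of what I have in mind. Everything in it is my own invention for illustration: the names gene_hashcode and SpeciesIndex, the 12-character code length, and the toy sequences are assumptions, not anything taken from the article or from a real barcoding tool. Collisions are handled the lazy way described above, by just adding every organism with the same code to a list.

```python
import hashlib
from collections import defaultdict


def gene_hashcode(sequences, digest_chars=12):
    """Build a short code from a dict mapping gene name -> sequence.

    Hypothetical helper: genes are processed in sorted name order so the
    same set of genes always produces the same code.
    """
    h = hashlib.sha256()
    for gene in sorted(sequences):
        h.update(gene.encode("ascii"))
        h.update(b"\0")  # separator so adjacent fields cannot run together
        h.update(sequences[gene].upper().encode("ascii"))
        h.update(b"\0")
    return h.hexdigest()[:digest_chars]


class SpeciesIndex:
    """Toy lookup table: hashcode -> list of organism names."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def add(self, name, sequences):
        code = gene_hashcode(sequences)
        # "Do nothing" collision handling: append to the bucket's list.
        self._buckets[code].append(name)
        return code

    def lookup(self, sequences):
        # .get avoids creating an empty bucket for unknown codes.
        return list(self._buckets.get(gene_hashcode(sequences), []))


if __name__ == "__main__":
    index = SpeciesIndex()
    # Made-up sequences, far shorter than any real gene.
    index.add("specimen 1", {"COI": "ATGCATGC", "18S": "GGCATTAG"})
    print(index.lookup({"18S": "ggcattag", "COI": "atgcatgc"}))
    print(index.lookup({"COI": "ATGCATGA", "18S": "GGCATTAG"}))
```

One thing the sketch makes obvious: an exact hash like this treats a single changed base as a completely different code, so any variation between individuals of the same species would defeat it. A real version would need some way of tolerating that, which is probably where my naive guess starts to fall apart.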

The article still helped give me a sense of the mood of the community and an idea of the debates that exist. It also made an important point - researchers agree more than they disagree. So even if it seems there are numerous hotly debated arguments raging, there exists a large set of facts that they agree on.

For me, the sidebar on the construction of the Tree of Life for plants was probably the highlight. I had heard some mention of Deep Green - but was not really clear on the project or the purpose of it. So now I can queue up another article to read (Science, 13 August 1999) and a project to research. I thought the view of the plant science community was also interesting to see:
“...plant people have a good handle on all sorts of data, almost to the point of being truly comprehensive,” says John Gittleman, an evolutionary biologist at the University of Virginia, Charlottesville.
Having good data is one thing, but sharing it with the rest of the community is really the goal. You see that with efforts like NESCent's Hackathon on Database Interoperability. It's good to know the data is there; it sounds like Deep Green, Deep Gene, Deep Time, and Deepest Green have the community in a good position. I can see why the push for a cyberinfrastructure would follow those advancements.

Another good piece of insight was that supertrees are not all that useful. I have a bit of a background with phylogenies (having worked on the Tree of Life Web Project). But I thought supertrees would be helpful to researchers:
“Supertrees don’t have much value,” says Ward Wheeler.
He said that they're just 'summaries of summaries.' Supertrees are easier to build, but if they have no value then they're pointless to construct.