Turned up to eleven: Fair and Balanced

Thursday, September 26, 2002

Chimps and Humans
"Robert Musil" asks the burning question on everyone's mind:

Where prior studies suggest that 98.5% of the human genetic code can also be found in the chimp, a new study published in Proceedings of the National Academy of Sciences says the true overlap may be only 95%.
My confusion arises because I have never been able to determine from the popular media coverage (such as this linked article) if the "overlap" studies include comparisons of inactive genetic material. If the studies do include such comparisons, then why is it not possible that much of the active human code only overlaps with the inactive chimp code - and vice versa? Wouldn't that mean that a 95% (or 98.5%) "overlap" could be all but meaningless?

This opens the door to a couple of interesting issues. How do we determine how closely related two different organisms or species are, and how do we determine how much of the genome is "used"? Neither question is simple to answer, but I will give it a shot. First, to the matter that Mr. Musil brings up. There are historical reasons for the 98.5% figure that has commonly been used, but it has borne up under fairly intense scrutiny up till now.

The initial studies used a technical analysis called a "Cot curve" (the name comes from C0 x t, the starting DNA concentration multiplied by the elapsed time). This is also used to define the "complexity" of a genome, in a relative way. The method uses the biochemical tendency of DNA to reassociate after "melting" (heating to separate the two strands from one another), measuring the time it takes for double-stranded DNA to re-form. Sequences that are present in many copies find a matching partner, and so reassociate, faster than sequences present only once, so measuring the rate of reassociation gives a method for determining the amount of repetitive "junk DNA" in the chromosome. This is where the early understanding that most of the human genome was "junk" came from. As an aside, it may well turn out that that DNA is actually important, for structural and regulatory reasons.

The classic rehybridization experiment tags the genomic DNA from one of the two organisms (say, with a radioactive label), and then measures how much of the radiolabel sticks to an immobilized population of unlabelled DNA from the other organism. There are lots of ways to think about doing this, and I am only giving you one. The point is that you can determine how strongly two pieces of DNA stick together, and this is a chemical way to determine how closely related they are.
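For the curious, here is a rough sketch of the reassociation kinetics behind a Cot curve. It uses the textbook idealization that reassociation is a second-order reaction, so the fraction of DNA still single-stranded is 1 / (1 + k x C0t). The rate constants below are made-up numbers chosen only to show the shape of the effect; they are not measurements from any genome, and the function names are my own.

```python
# Idealized second-order reassociation kinetics (a sketch, not real data).
# cot is C0 * t: starting DNA concentration times elapsed time.
# k is a reassociation rate constant; repeated sequences behave as if k is larger.

def fraction_single_stranded(cot, k):
    """Fraction of DNA still single-stranded at a given Cot value."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reassociated."""
    return 1.0 / k


# Hypothetical rate constants, for illustration only.
K_REPEAT = 100.0   # highly repeated sequence: reassociates quickly
K_UNIQUE = 0.001   # single-copy sequence: reassociates slowly

for cot in (0.01, 1.0, 1000.0):
    repeat = fraction_single_stranded(cot, K_REPEAT)
    unique = fraction_single_stranded(cot, K_UNIQUE)
    print(f"Cot={cot:>8}: repeat {repeat:.3f} single-stranded, unique {unique:.3f}")
```

The repeated fraction is half reassociated at a Cot value many orders of magnitude lower than the single-copy fraction, which is how a mixture of the two shows up as separate steps on the curve.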

In any event, the answer to Musil's query is "No, the comparison is not meaningless". That is, both the repeat DNA and the "useful" DNA seem to be very closely related. (There have been reports of even more closely related genome segments.) The initial evidence for this very close relationship was based on DNA hybridization, as outlined above. Later studies based on sequence analysis, however, bore out the hybridization data. So what happened with this newest experiment?

(Incidentally, it is now thought that chimps are not our only closest great ape relatives: we are just as closely related to bonobos.)

Basically, the experimenter (on the faculty at Caltech, where I work) suggests that previous experimenters have underestimated the importance and prevalence of certain types of genetic change. We tend to think of mutation as a change in the sequence at a single point (imaginatively called a point mutation), but there are other, perhaps more important, ways to change the genetic code. Here are some examples:

Initial sequence: AGCTAGCTAGCT

Point mutant: AGCTAGGTAGCT

Deletion mutant: AGCTAG_TAGCT

Insertion mutant: AGCTAGCCTAGCT

The important thing to note in the latter two cases is that the insertion or deletion can be as long or short as you wish. It could be one base, 1000, up to an empirical limit on the order of a few hundred thousand bases (further investigations may turn up larger insertions). The insertion or deletion may be a result of recombination during DNA replication (someday, if you are very bad, I will explain this in excruciating detail), or a result of insertion of foreign DNA such as a virus or transposon. Regardless, the end result is a serious disparity between the genomes being compared. So, Dr. Britten suggests, based on his analysis of a few million base pairs of the genomes of chimps and humans (about 0.1%), that we may be off in our estimate, and that the true figure is probably closer to 95%.

As far as Mr. Musil's concern goes, this study does not distinguish between "useful" DNA and "junk". The best guess, however, is that there isn't going to be a difference, or that, if anything, the functional genes are more similar, because they are constrained by their function. That is, the gene that encodes hemoglobin is more restricted than an equal-sized region that does not encode a gene, because whatever changes happen to that hemoglobin gene, its essential function must be preserved. So it is unlikely that the "junk" DNA is identical while the functional genes all vary; if anything, it is the other way around. In this sense, the very close similarity between our "junk" and chimp "junk" is further evidence of our close evolutionary pairing.
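To see why counting insertions and deletions moves the number, here is a toy calculation using the example sequences from this post. It is only a sketch: it assumes the two sequences have already been lined up, with "_" marking a missing base as in the deletion example, and it is not the method used in the actual study. The function name and the alignment of the insertion example are mine.

```python
# Toy percent-identity calculation on pre-aligned sequences (illustration only).
GAP = "_"  # marks a base present in one sequence but missing in the other


def identity(a, b, count_gaps):
    """Return the fraction of aligned positions where a and b match.

    If count_gaps is False, positions with a gap in either sequence are skipped.
    If count_gaps is True, each gap position counts as a difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if GAP in (x, y):
            if count_gaps:
                compared += 1  # counted as a mismatch
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared


# (reference, variant) pairs; the insertion case needs a gap in the reference.
examples = {
    "point mutant": ("AGCTAGCTAGCT", "AGCTAGGTAGCT"),
    "deletion mutant": ("AGCTAGCTAGCT", "AGCTAG_TAGCT"),
    "insertion mutant": ("AGCTAGC_TAGCT", "AGCTAGCCTAGCT"),
}

for name, (ref, variant) in examples.items():
    skipped = identity(ref, variant, count_gaps=False)
    counted = identity(ref, variant, count_gaps=True)
    print(f"{name}: {skipped:.1%} ignoring gaps, {counted:.1%} counting gaps")
```

If gaps are skipped, the deletion and insertion mutants look identical to the original sequence, and only the point mutant registers as different. Once every gapped base is counted as a difference, all three drop below 100%, and a long insertion or deletion would pull the figure down far more than a single point mutation does.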