The Online Encyclopedia and Dictionary

 
     
 

Molecular systematics

Molecular systematics is a product of the traditional field of systematics and the growing field of bioinformatics. It is the process of using data on the molecular constitution of biological organisms' DNA, RNA, or both to resolve questions in systematics, i.e. questions about their correct scientific classification or taxonomy from the point of view of evolutionary biology. It is also known as computational systematics. It may also be referred to as molecular genetics or molecular evolution, but both of these terms have wider meanings. Molecular systematics is particularly important in, and compatible with, the cladistic approach to taxonomy; however, the general cladistic approach is older than molecular systematics and does not necessarily depend upon it. From the mid-1990s onward, molecular systematic analysis has prompted revisions to the accepted classifications of many groups of organisms.

Theoretical background

Molecular systematics has been made possible by the availability of techniques for gene sequencing, which allow the determination of the exact sequence of nucleotides or bases in either DNA or RNA. At present it is still a long and expensive process to sequence the entire genome of an organism, and this has been done for only a few species. However, it is quite feasible to determine the sequence of a defined area of a particular chromosome. Typical molecular systematic analyses require the sequencing of around 1000 base pairs. At any position within such a sequence, the base found may vary between organisms. The particular sequence found in a given organism is referred to as its haplotype. In principle, since there are four base types, a sequence of 1000 base pairs could take any of 4^1000 (four to the power 1000) distinct haplotypes. However, for organisms within a particular species, or in a group of related species, it turns out as a matter of empirical fact that

  • only a minority of sites show any variation at all
  • most of the variations that are found are correlated, so that the number of distinct haplotypes that are found is relatively small.
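To give a sense of scale for the theoretical number of haplotypes mentioned above, the following short Python sketch computes the size of the sequence space for 1000 base pairs. It is an illustration only; the variable names are assumptions made for this example.

```python
# Number of possible sequences of a given length over the four bases.
BASES = "ACGT"
sequence_length = 1000  # typical length quoted in the text

possible_haplotypes = len(BASES) ** sequence_length

# Python integers have arbitrary precision, so the exact value is available;
# counting its decimal digits gives a feel for the magnitude.
digit_count = len(str(possible_haplotypes))
print(digit_count)
```

The number has several hundred decimal digits, which is why the small number of haplotypes actually observed within a species or group of related species is informative.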

In a molecular systematic analysis, the haplotypes from a substantial sample of individuals of the target species or other taxon are determined for a defined area of genetic material. Haplotypes of individuals from a comparably sized sample of closely related, but supposedly different, taxa are also determined. Finally, haplotypes from a smaller number of individuals from a definitely different taxon are determined: these are referred to as an out group. The base sequences for the haplotypes are then compared. In the simplest case, the difference between two haplotypes is assessed by counting the number of locations where they have different bases: this is referred to as the number of substitutions. Other kinds of differences between haplotypes can also occur, for example the insertion of a section of nucleic acid in one haplotype that is not present in another. Usually the number of substitutions is re-expressed as a percentage divergence, by dividing the number of substitutions by the number of base pairs analysed and multiplying by 100: the hope is that this measure will be independent of the location and length of the section of DNA that is sequenced.
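The simplest comparison described above, counting substitutions and converting the count to a percentage divergence, can be sketched in Python as follows. This is a minimal sketch that assumes the two haplotypes are already aligned and of equal length, so insertions and deletions are ignored; the function names and the example sequences are invented for illustration.

```python
def count_substitutions(hap_a: str, hap_b: str) -> int:
    """Count positions at which two aligned, equal-length haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be aligned to the same length")
    return sum(1 for a, b in zip(hap_a, hap_b) if a != b)


def percent_divergence(hap_a: str, hap_b: str) -> float:
    """Substitutions as a percentage of the number of base pairs compared."""
    if not hap_a:
        raise ValueError("haplotypes must not be empty")
    return 100.0 * count_substitutions(hap_a, hap_b) / len(hap_a)


# Made-up 20-base sequences, for illustration only (not real data).
hap_1 = "ACGTACGTACGTACGTACGT"
hap_2 = "ACGTACGAACGTACGTACCT"

print(count_substitutions(hap_1, hap_2))  # two positions differ
print(percent_divergence(hap_1, hap_2))   # 2 of 20 positions, i.e. 10 percent
```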

An alternative approach is to determine the divergences between the genotypes of individuals by DNA-DNA hybridisation instead of by determining and comparing gene sequences. The advantage of hybridisation over gene sequencing is that it is based on the entire genotype, rather than a particular section of DNA. Its disadvantage is that precise haplotypes are not determined.

Once the divergences between all pairs of samples have been determined, the resulting triangular matrix of differences is submitted to some form of statistical cluster analysis, and the resulting dendrogram is examined to see whether or not the samples cluster in the way that would be expected from current ideas about the taxonomy of the group. Any group of haplotypes that are all more similar to one another than any of them is to any other haplotype may be said to constitute a clade. Statistical significance tests are available to examine whether it is possible to reject the hypothesis that a particular set of haplotypes lies in a single clade.
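As a rough illustration of the pairwise matrix and of the informal definition of a clade given above, the Python sketch below builds the matrix of substitution counts and checks whether a group of haplotypes satisfies that definition. It is a simplified distance-based check, not a cluster analysis or a significance test, and it is not the procedure used in any particular study; the function names and the toy sequences are assumptions made for this example.

```python
from itertools import combinations


def substitutions(hap_a: str, hap_b: str) -> int:
    """Number of positions at which two aligned haplotypes differ."""
    return sum(1 for a, b in zip(hap_a, hap_b) if a != b)


def distance_matrix(haplotypes: dict) -> dict:
    """Pairwise substitution counts, keyed by the unordered pair of names."""
    return {
        frozenset((m, n)): substitutions(haplotypes[m], haplotypes[n])
        for m, n in combinations(haplotypes, 2)
    }


def forms_clade(group: set, all_names: set, dist: dict) -> bool:
    """True if every distance within the group is smaller than every
    distance from a group member to a haplotype outside the group."""
    outside = all_names - group
    if not outside:
        raise ValueError("need at least one haplotype outside the group")
    widest_within = max(
        (dist[frozenset(pair)] for pair in combinations(sorted(group), 2)),
        default=0,
    )
    closest_outside = min(dist[frozenset((g, o))] for g in group for o in outside)
    return widest_within < closest_outside


# Toy 10-base haplotypes, invented for illustration (not real data).
haplotypes = {
    "dog_1": "AAAAAAAAAA",
    "dog_2": "AAAAAAAAAT",
    "wolf_1": "AAAAAAAATT",
    "out_1": "CCCCCAAAAA",
}
names = set(haplotypes)
dist = distance_matrix(haplotypes)

dogs_only = forms_clade({"dog_1", "dog_2"}, names, dist)
dogs_and_wolf = forms_clade({"dog_1", "dog_2", "wolf_1"}, names, dist)
print(dogs_only, dogs_and_wolf)
```

In this toy data the two dog haplotypes alone do not form a clade, because one of them is as close to the wolf haplotype as to the other dog, whereas the dogs and the wolf together are clearly separated from the out group.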

Example: the phylogeny of the domestic dog

For example, Vilà et al (1997) determined haplotypes from a sequence of 261 base pairs in the mitochondrial DNA of 140 domestic dogs, 162 wolves, 5 coyotes, and 10 jackals (of three different species). The dogs were drawn from 67 different pure breeds and 5 cross breeds, and the wolves were drawn from 27 distinct geographically defined populations. The coyotes and jackals were used as the out group.

Vilà et al found 27 distinct haplotypes among the wolves, and 26 among the dogs. The wolf haplotypes differed from each other by no more than 10 bases, and the dog haplotypes differed from each other by no more than 12. The maximum difference between a dog haplotype and a wolf haplotype was 12 substitutions, whereas the minimum difference between a dog haplotype and any coyote or jackal haplotype was 20 substitutions. Vilà et al therefore concluded that their data supported the current classification of the domestic dog as a subspecies of the wolf rather than a domesticated form of some other species of canid.

Vilà et al then proceeded to use cluster analysis to construct dendrograms that grouped the different wolf and dog haplotypes by similarity. There are many different forms of cluster analysis, so they used several of them and showed that they all gave the same results, which were that:

  1. The correlation between traditional dog breeds and haplotypes was poor: many breeds contained several haplotypes and many haplotypes were found in several breeds.
  2. The dog haplotypes fell into four distinct clades, one of which included 19 of the 26 dog haplotypes and no wolf haplotype; the highest estimate of the mean divergence within this large clade was 1% (2.6 substitutions).
  3. This major dog clade, and two of the other dog clades, fell into a single larger clade which also included some wolf haplotypes.
  4. The haplotypes in the fourth dog clade were more similar to a number of wolf haplotypes than to any other dog haplotype, and the wolf haplotypes in this clade were more similar to these dog haplotypes than to the other wolf haplotypes. The hypothesis that all the dog haplotypes fell into a single clade that did not include any wolf haplotypes was rejected in a significance test.

From their cluster analysis results, Vilà et al concluded that:

  • From 1: Traditional dog breeds are genetically diverse (i.e. they have been derived from a range of individuals of different descent)
  • From 2: Dogs and wolves have been largely isolated from each other for long enough for genetic coalescence to have occurred in most of the dog population.
  • From 3 and 4: Dogs do not derive from a single parentage. Hybridisation between dogs and wolves must have continued after the initial domestication of dogs, introducing new wolf genes into the domestic stock.

From the quantitative data on haplotype similarity, Vilà et al also proposed a new date for the first domestication of the dog. The first archaeological evidence of morphologically modern dog remains found in association with human remains is from 14,000 years ago. On the other hand, palaeontological evidence shows that wolves and coyotes separated about 1 million years ago. Since wolves and coyotes show a minimum 20-base divergence, we can estimate that divergence grows at a rate of about 1 substitution per 50,000 years (1 million years divided by 20 substitutions). If all the dogs whose haplotypes are found in the large clade derive from a single parental line, we would expect the 2.6-base divergence within that clade to have taken about 130,000 years to emerge (2.6 substitutions at 50,000 years each). Vilà et al therefore propose that the initial domestication of dogs occurred around 130,000 years ago, with some other event about 15,000 years ago leading to morphological change within the domestic dog population.
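The dating argument above is a simple molecular clock calculation, and can be reproduced with the figures quoted in the text. The Python sketch below is only a restatement of that arithmetic under the assumption of a constant substitution rate; the function and constant names are invented for this example.

```python
def years_per_substitution(split_age_years: float, substitutions: float) -> float:
    """Calibrate the clock from a divergence of known age."""
    return split_age_years / substitutions


def divergence_age(substitutions: float, years_per_sub: float) -> float:
    """Time needed to accumulate a given divergence at a constant rate."""
    return substitutions * years_per_sub


# Figures quoted in the text above.
WOLF_COYOTE_SPLIT_YEARS = 1_000_000   # palaeontological estimate
WOLF_COYOTE_MIN_SUBSTITUTIONS = 20    # minimum wolf-coyote divergence
DOG_CLADE_MEAN_SUBSTITUTIONS = 2.6    # mean divergence in the large dog clade

rate = years_per_substitution(WOLF_COYOTE_SPLIT_YEARS, WOLF_COYOTE_MIN_SUBSTITUTIONS)
age = divergence_age(DOG_CLADE_MEAN_SUBSTITUTIONS, rate)

print(f"{rate:,.0f} years per substitution")
print(f"about {age:,.0f} years for the large dog clade")
```

The estimate is only as reliable as the calibration point and the assumption that substitutions accumulate at the same rate in all the lineages compared.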

Characteristics and assumptions of molecular systematics

This example illustrates several characteristics of molecular systematics and its underlying assumptions.

  1. Molecular systematics is an essentially cladistic approach: it assumes that classification must correspond to phylogenetic descent, and that all valid taxa must be at least paraphyletic and preferably monophyletic.
  2. Whereas traditional taxonomy depends on the phenotype and behaviour of the organisms, molecular systematics goes direct to the genotype.
  3. Molecular systematics often uses the molecular clock assumption that quantitative similarity of genotype is a sufficient measure of the recency of genetic divergence. Particularly in relation to speciation, this assumption could be wrong if either
    1. some relatively small genotypic modification acted to prevent interbreeding between two groups of organisms, or
    2. in different subgroups of the organisms being considered, genetic modification proceeded at different rates.
  4. It is often convenient to use mitochondrial DNA for molecular systematic analysis. However, in mammals mitochondria are inherited only from the mother, so this is not fully satisfactory: inheritance in the paternal line might not be detected. In the example above, Vilà et al cite more limited studies with chromosomal DNA that support their conclusions.

These characteristics and assumptions are not wholly uncontroversial among biological systematists. As a cladistic method, molecular systematics is open to the same criticisms as cladistics in general. It can also be argued that it is a mistake to replace a classification based on visible and ecologically relevant characteristics with one based on genetic details that may not even be expressed in the phenotype. However, the molecular approach to systematics, and its underlying assumptions, are gaining increasing acceptance. As gene sequencing becomes easier and cheaper, molecular systematics is being applied to more and more groups, and in some cases is leading to radical revisions of accepted taxonomies.

Reference

Vilà, C., Savolainen, P., Maldonado, J. E., Amorim, I. R., Rice, J. E., Honeycutt, R. L., Crandall, K. A., Lundeberg, J., & Wayne, R. K. (1997). Multiple and ancient origins of the domestic dog. Science, 276, 1687-1689.

The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.