7 Molecular Genetics and Relatedness
Learning Objectives
After exploring this chapter, you should be able to
- Calculate the percent similarity between two individuals based on a DNA sequence
- Explain how DNA similarity informs us about evolutionary history
- Read a phylogenetic tree and explain how it relates to genetic similarity
Scientists today use molecular genetics as a powerful tool to determine how living things are related. Every organism’s DNA is a record of its ancestry: individuals that are more closely related have more similar DNA sequences. By analyzing DNA, researchers can infer relationships within species (like between individual people) and among different species (across the tree of life). In the past, biologists relied on observable traits (morphology) to classify organisms, but DNA evidence has both confirmed and sometimes corrected those classifications. The basic idea is simple: the more similar the DNA between two organisms, the more recent their common ancestry and the closer their evolutionary relationship.
DNA and Relatedness Within a Species (Humans)
Because DNA is inherited from parent to offspring, it can reveal how closely individuals are related. Within our species, DNA is extremely similar from person to person. Out of the 3.2 billion “letters” (nucleotide bases) in the human genome, only a tiny fraction differ between any two people. On average, any two unrelated humans are 99.9% identical in DNA sequence. It is the 0.1% of DNA where we differ that makes each person unique. Even these small differences are very useful for identification and kinship analysis. For example, forensic scientists compare DNA from crime scene samples to suspects’ DNA to find matches, and paternity tests compare a child’s DNA to the mother’s and potential father’s DNA to confirm parentage. A child inherits half of their DNA from each parent, so a parent and child share about half of their DNA through direct inheritance (and are ~99.95% similar overall when all human DNA is considered), a higher overlap than between two random, unrelated people.
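As a rough check on these figures, the arithmetic can be written out. This is only a back-of-the-envelope sketch: it assumes, as a simplification, that the half of the DNA a child inherits from a given parent is identical to that parent’s DNA, and that the other half differs at the typical 0.1% rate seen between unrelated people.
[latex]0.001 \times 3.2 \times 10^{9} \approx 3.2 \times 10^{6} \text{ differing bases between two unrelated people}[/latex]
[latex]\text{Parent-child similarity} \approx 0.5 \times 100\% + 0.5 \times 99.9\% = 99.95\%[/latex]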
DNA sequence comparisons allow scientists to quantify relatedness. For instance, imagine we compare a short stretch of DNA from two individuals. If the sequences are 10 DNA bases long and differ at only one position, they would be 90% similar (9 out of 10 bases match). Generally, close relatives (like siblings or parent-child) have very few sequence differences between them, whereas unrelated individuals have a bit more. In humans, even unrelated people show only very slight sequence differences, consistent with our recent shared ancestry as one species. Scientists can calculate the percent similarity between individuals’ DNA to infer relationships or identify individuals. Tools like DNA profiling take advantage of unique genetic markers in that 0.1% difference for identification. In genealogy and ancestry research, companies compare thousands of DNA sites across genomes: people who share more of these DNA markers likely have a recent common ancestor in their family tree.
DNA and Evolutionary Relationships Among Species
DNA also provides key evidence for evolutionary relatedness among species. When we compare the DNA of different species, we find that closely related species (like two species of birds, or humans and apes) have very similar DNA sequences, whereas distantly related species (like a human and a fish) have more DNA differences. These differences accumulate over evolutionary time. Scientists use DNA sequence data to reconstruct evolutionary histories, which are often represented as phylogenetic trees.
Phylogenetic trees (also called evolutionary trees) are diagrams that look like tree branches, showing lines of descent from common ancestors. The base (root) of the tree represents an ancestral lineage, and each branch point (node) represents a split where one lineage diverged into two lineages, giving rise to two species. The tips of the branches are present-day species. In a phylogenetic tree, species that share a recent common ancestor are positioned as close neighbors on the tree.
Historically, these evolutionary trees were built using morphological traits, but today scientists primarily use DNA to construct phylogenetic trees. The principle is that genetic similarity reflects evolutionary kinship: for example, if Species A’s DNA is 98% identical to Species B’s DNA, but only 90% identical to Species C’s DNA, then A is more closely related to B than to C. Molecular data have made it possible to quantify these relationships with high precision, sometimes overturning earlier assumptions of evolutionary relatedness based on morphological similarity alone.
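The comparison between Species A, B, and C can be sketched in a few lines of Python. The three 20-base sequences below are invented for illustration (they are not real species data), and the helper name `percent_identity` is our own; the sketch also assumes the sequences are already aligned and equal in length.

```python
def percent_identity(seq_1, seq_2):
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(seq_1) != len(seq_2):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_1, seq_2))
    return 100 * matches / len(seq_1)


# Hypothetical aligned sequences, made up for this example
sequences = {
    "A": "ATGCCGTAACGTTAGCCTAG",
    "B": "ATGCCGTAACGTTAGCCTAA",
    "C": "ATGACGTTACGATAGCTTAG",
}

# Compare Species A with each of the other two species
identity_to_a = {
    name: percent_identity(sequences["A"], sequences[name]) for name in ("B", "C")
}
closest = max(identity_to_a, key=identity_to_a.get)
print(identity_to_a)
print(f"Species A is most similar to Species {closest}")
```

With these made-up sequences, A matches B at more positions than it matches C, so the sketch reports B as the closer relative of A.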
History of Genetic Similarities: Wilson, Sarich, Sibley, and Ahlquist
To investigate how birds are related to one another, a biologist of the 1950s would have carefully studied their anatomical similarities and differences. But today, a scientist working on the same problem could also use the very instructions from which that anatomy was built: its genetic code. DNA sequences form the hereditary links between generations, so it is no surprise that scientists investigating evolutionary relationships have sought to get closer and closer to the DNA that underlies those relationships. However, reading the genomes of entire organisms did not follow immediately from the discovery of DNA’s structure in the 1950s. In small steps, scientists came closer to their target.
Scientists first began to zoom in on gene sequences by studying the products of DNA: proteins. After all, if two species are closely related, they should have similar gene sequences, which should then make similar proteins. So before the 1970s, proteins were used as stand-ins for genes in studying evolution.
Testing similarity using antibodies
One way that researchers assessed protein similarities was by harnessing the immune system’s ability to recognize foreign proteins. For example, the immune system of a rabbit will recognize a human protein as foreign and will mount an attack against it by making antibodies specific to that protein. If those same rabbit antibodies are exposed to a similar protein — from a chimpanzee, perhaps — they will attack it as well. The more similar the proteins from the two species (human and chimpanzee) are, the stronger this second attack will be. Although variations of this technique were being employed as early as 1904, more sensitive protocols were developed in the 1960s. These more sensitive techniques revealed the remarkable similarity between the proteins of humans and those of other great apes. Expanding upon the work of others and making the assumption that fewer protein differences corresponded to shorter times of separation, Vincent Sarich and Allan Wilson estimated that humans, chimpanzees, and gorillas shared a common ancestor only 5 million years ago — a much shorter length of time than was commonly accepted at the time.

Testing similarity using DNA
Scientists studying the chemistry of DNA moved even closer to actual sequences. Charles Sibley and Jon Ahlquist pioneered the use of DNA kinetics to investigate evolutionary relationships using a technique called DNA-DNA hybridization. Each DNA molecule is made of two strands of nucleotides. If the strands are heated, they will separate, and as they cool, the attraction of the nucleotides will make them bond back together again. To compare different species, scientists cut the DNA of the species into small segments, separate the strands, and mix the DNA together. When the two species’ DNA bonds together, the match between the two strands will not be perfect since there are genetic differences between the species, and the more imperfect the match, the weaker the bond between the two strands. These weak bonds can be broken with just a little heat, while closer matches require more heat to separate the strands again.
DNA hybridization can measure how similar the DNA of different species is — more similar DNA hybrids “melt” at higher temperatures. When this technique was applied to primate relationships, it suggested that humans and chimpanzees carried DNA more similar to one another’s than to orangutans’ or gorillas’ DNA.

Modern DNA Sequencing
Advances in sequencing technology now allow scientists to rapidly read entire genomes, not just small fragments of DNA. Instead of focusing on individual genes, researchers can compare full genomes across species to measure similarity and identify where differences occur. These comparisons confirm that humans and chimpanzees share about 98.8% of their DNA, while also pinpointing the specific regions where they differ. Genomic sequencing has also revealed finer details of evolutionary history, such as showing that chimpanzees and bonobos are more closely related to each other than either is to humans. Although genome data are powerful tools for understanding evolutionary relationships, they are best interpreted alongside anatomical, behavioral, and developmental evidence to build a fuller picture of relatedness.

Attribution: This section and figures were adapted from Understanding Evolution by the UC Museum of Paleontology and shared under CC BY-NC-SA 4.0
In constructing phylogenetic trees, researchers often compare DNA sequences for specific genes or even entire genomes across species. They align these sequences to count differences and similarities, then use computer algorithms to infer the most likely branching patterns of evolution that explain the DNA data. The resulting tree diagrams show hypotheses of how species are related and in what order they diverged from their common ancestors. Consistently, DNA-based trees have reinforced the concept of the unity of life: all living organisms share some DNA sequence similarities, reflecting that all life on Earth is connected through ancient evolutionary ancestry. For instance, humans share not only ~99% of our DNA with chimpanzees, but also ~85% with mice, ~60% with fruit flies, and even ~50% with bananas, if we compare common genes. While comparing entire genomes can give a more accurate sense of how closely two species are related, comparing individual genes can help us understand not only the evolutionary relationships of species but also the genes themselves.
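To make the “count differences, then infer branching” idea concrete, here is a minimal Python sketch of UPGMA, one of the simplest distance-based clustering methods. It is offered only as an illustration and is not necessarily what any particular study uses; researchers often rely on more sophisticated approaches such as maximum likelihood or Bayesian inference. The species labels W, X, Y, Z and their sequences are invented, the function names are our own, and the sequences are assumed to be already aligned.

```python
from itertools import combinations


def count_differences(seq_1, seq_2):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(seq_1, seq_2))


def upgma(sequences):
    """Build a nested-tuple tree by repeatedly merging the two closest clusters."""
    # Each key is a tree (a name or a nested tuple); each value lists its species.
    clusters = {name: [name] for name in sequences}

    def average_distance(members_1, members_2):
        # Mean number of differences over every cross-cluster pair of species
        pairs = [(x, y) for x in members_1 for y in members_2]
        total = sum(count_differences(sequences[x], sequences[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        tree_1, tree_2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged_members = clusters.pop(tree_1) + clusters.pop(tree_2)
        clusters[(tree_1, tree_2)] = merged_members
    return next(iter(clusters))


# Hypothetical aligned sequences, made up for this example
example_sequences = {
    "W": "ACGTACGTACGTACGTACGT",
    "X": "ACGTACGTACGTACGTACGA",
    "Y": "ACGTTCGTACCTACGTACGA",
    "Z": "TCGAACCTAGGTTCGAACGT",
}
tree = upgma(example_sequences)
print(tree)
```

Each pair of parentheses in the printed result stands for a branch point: the two species grouped most deeply (here W and X, which differ at only one position) are inferred to share the most recent common ancestor. UPGMA assumes that differences accumulate at a roughly constant rate in every lineage, which is one reason real analyses often prefer other methods.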
Example: Calculating DNA Sequence Similarity
At the most basic level, scientists calculate percent sequence similarity to determine how alike two individuals are. This example walks through the process step by step using two hypothetical DNA sequences (20 base pairs each) from Individual A and Individual B.
- Align the sequences and identify mismatches. Line up the two sequences so that each nucleotide position can be compared directly. For our example sequences, an alignment looks like this:
A: ACCTGTCGATCGTTACCGAT
B: ACCTGACGATCATTACCTAT
        ^     ^     ^
Here, the letters A and B label the sequences from Individual A and Individual B. The carets (^) underneath mark the positions where the two sequences have different bases. In this example, there are three mismatches, at positions 6, 12, and 18.
- Count matches and mismatches. Out of 20 positions, 3 bases do not match between the two sequences (mismatches), which means the remaining 17 positions have identical bases (matches). We can clearly label these: 3 mismatches vs. 17 matches in a total of 20 base pairs.
- Calculate the percent similarity. Percent similarity is determined by the number of matching bases divided by the total number of bases, multiplied by 100%.
The calculation is:
[latex]\text{Percent similarity} = \frac{\text{number of matches}}{\text{total bases}} \times 100\%[/latex]
[latex]\frac{17}{20} \times 100\% = 85\%[/latex]
So, Individuals A and B have an 85% DNA sequence similarity in this 20-base segment.
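The same steps can be sketched in Python. The function name `percent_similarity` is our own, and the sketch assumes the two sequences are already aligned and have the same length (no gaps).

```python
def percent_similarity(seq_a: str, seq_b: str) -> tuple[float, list[int]]:
    """Return the percent similarity and the 1-based positions of mismatches."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Step 1: walk along the alignment and record where the bases differ
    mismatches = [
        position
        for position, (base_a, base_b) in enumerate(zip(seq_a, seq_b), start=1)
        if base_a != base_b
    ]
    # Step 2: every position that is not a mismatch is a match
    matches = len(seq_a) - len(mismatches)
    # Step 3: matches divided by total bases, times 100
    return 100 * matches / len(seq_a), mismatches


individual_a = "ACCTGTCGATCGTTACCGAT"
individual_b = "ACCTGACGATCATTACCTAT"
similarity, mismatch_positions = percent_similarity(individual_a, individual_b)
print(f"{similarity:.0f}% similar; mismatches at positions {mismatch_positions}")
# prints: 85% similar; mismatches at positions [6, 12, 18]
```

Real sequence comparisons usually also have to handle insertions and deletions, which requires an alignment step before any counting can be done.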
References and Online Resources
- Understanding Evolution (UC Berkeley Museum of Paleontology) evolution.berkeley.edu.
- Sarich, V. M., & Wilson, A. C. (1967). Immunological time scale for hominid evolution. Science, 158(3805), 1200–1203. https://doi.org/10.1126/science.158.3805.1200
- Sibley, C. G., & Ahlquist, J. E. (1984). The phylogeny of the hominoid primates, as indicated by DNA–DNA hybridization. Journal of Molecular Evolution, 20(1), 2–15. https://doi.org/10.1007/BF02101938
- OpenStax. (2018). Concepts of Biology (Section 12.2: Determining Evolutionary Relationships). OpenStax CNX. Shared under a Creative Commons Attribution license. Retrieved from openstax.org
- Yousaf, A., Liu, J., Ye, S., & Chen, H. (2021). Current progress in evolutionary comparative genomics of great apes. Frontiers in Genetics, 12, 657468. https://doi.org/10.3389/fgene.2021.657468