What is an ancestral marker?
mtDNA is a circular chain consisting of 16,569 pairs of nucleotides. Let’s unwind the DNA double helix and take a closer look at its genetic code.
DNA consists of two chains of nucleotides, designated A, C, T, and G. “A” is always linked to “T”, and “C” is always linked to “G” on the opposite chain. In this diagram, we will take a closer look at a short segment of mtDNA, namely locations 1 to 45. The unique combination of nucleotides in the chain is called a “genetic code” and holds genetic information.
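The pairing rule can be sketched in a few lines of Python. This is only an illustration: the function name `complement_strand` and the six-letter fragment are made up for the example and are not taken from a real mtDNA sequence.

```python
# Pairing rule from the text: "A" pairs with "T", and "C" pairs with "G".
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand):
    """Return the nucleotides on the opposite chain, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


# Hypothetical fragment, for illustration only.
fragment = "ATTCGG"
print(fragment)                     # ATTCGG
print(complement_strand(fragment))  # TAAGCC
```

Because each nucleotide has exactly one partner, knowing one chain is enough to write down the other.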
Ancestral markers are “mutations”, little changes or “hiccups” that occur in the genetic code of the mtDNA. There are many types of mutations, but the type of mutation most commonly found in mtDNA is called a “SNP” (single nucleotide polymorphism). A SNP mutation occurs when a single nucleotide is replaced with a different nucleotide. For example, in this diagram, the “T” at location 40 is replaced by a “G”.
This mutation is documented as follows:
When you test your mtDNA, your results report will document the mutations that you carry in your mtDNA. Let’s take a look at a sample report:
In this report, “Location” refers to the locations on the mtDNA where a mutation has been detected. “Mutation Type = Substitution” means that a nucleotide has been substituted by a different nucleotide. “Nucleotide Change” indicates what was substituted. Let’s take a look at the first mutation in the list. Location “16126” means that a mutation has been detected at location 16126. “T>c” means that a “T” has been replaced by a “c” at this location.
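If you ever need to handle such entries in a program, a small parser might look like the sketch below. It assumes that the location and the nucleotide change are written together as one line of text, such as `16126 T>c`; the names `Substitution` and `parse_substitution` are hypothetical and do not come from any particular testing company's report format.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    location: int   # position on the mtDNA
    reference: str  # nucleotide that was replaced
    observed: str   # nucleotide found in the person tested


# Matches entries such as "16126 T>c" (upper or lower case letters).
ENTRY = re.compile(r"^\s*(\d+)\s+([ACGT])\s*>\s*([ACGT])\s*$", re.IGNORECASE)


def parse_substitution(entry):
    """Split one report entry into its location and the two nucleotides."""
    match = ENTRY.match(entry)
    if match is None:
        raise ValueError(f"not a substitution entry: {entry!r}")
    location, reference, observed = match.groups()
    return Substitution(int(location), reference.upper(), observed.upper())


print(parse_substitution("16126 T>c"))
```

This sketch only covers substitutions, which is the mutation type shown in the sample report.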
A second way to look at your results is to take a look at the actual sequence. In this example, the sequence shows the results for the HVR1 region, from locations 16001 to 16520. Remember, only one of the two chains in the pair is shown when reporting the sequence!
Fifty nucleotides are listed per line from left to right. In this example, the first line shows the results for locations 16001 to 16050, the second line shows the results for 16051 to 16100, and so on. Mutations in the sequence are highlighted in pink. In this sequence, the nucleotide at location 16126 is highlighted in pink, indicating that a mutation is detected here, and the nucleotide has been replaced by a “c”.
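The layout described above makes it easy to work out where any location sits in the report. Here is a minimal sketch, assuming the report starts at location 16001 and prints fifty nucleotides per line as in this example; the function name `locate` is made up for illustration.

```python
def locate(location, first_location=16001, per_line=50):
    """Return (line number, column) of a location, both counted from 1."""
    offset = location - first_location
    if offset < 0:
        raise ValueError("location lies before the start of the report")
    return offset // per_line + 1, offset % per_line + 1


print(locate(16126))  # (3, 26)
```

In other words, location 16126 is the 26th nucleotide on the third line, which covers locations 16101 to 16150.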
The unique set of mutations that you carry in your mtDNA holds information about your maternal ancestry. Next, we will talk about the technology used to detect mutations in your mtDNA.
A basic understanding of DNA testing techniques will help you to understand the science behind DNA ancestry testing.
DNA Testing 101
The two most common methods used to detect mutations in mtDNA are 1) DNA Sequencing, and 2) SNP Testing. Let’s talk about each one and how they work.
1. DNA sequencing.
DNA sequencing is a special process which is used to read the chain of nucleotides in a specific segment of your DNA, much like reading a book.
This technology allows the lab to read the entire genetic code of a whole section of your mtDNA. The following report is an example of the results of a sequencing test in the HVR1 region of an individual’s mtDNA.
As you can see, all of the nucleotides in the HVR1 region (locations 16001 to 16520) have been decoded. All mutations detected in the sequence are indicated in pink.
The benefit of DNA Sequencing technology is that it can accurately read entire lengths of your DNA. The limitation of DNA Sequencing technology is that only approximately 400 to 500 nucleotides can be read at a time (in one test).
This technology is used for testing the HVR1 and HVR2 (D-Loop) regions of your mtDNA. The HVR1 region is approximately 500 nucleotides in length (spans locations 16000 to 16569). The HVR2 region is approximately 400 nucleotides in length (spans locations 1 to 400).
DNA Sequencing technology is the best test method to detect mutations in the HVR1 and HVR2 regions (D-Loop):
However, DNA Sequencing is not the best method for examining mutations in the Coding Region of your mtDNA because the Coding Region is extremely long, over 15,000 nucleotides in length.
Since each sequencing reaction can only test approximately 500 nucleotides at a time, a lot of reactions would be required in order to sequence the entire coding region, making it impractical and costly. Also, as discussed, the frequency of mutations in the coding region is very low, making it impractical and unnecessary to sequence every single nucleotide in the large coding region. Thus, DNA Sequencing is not the best way to look for mutations in the coding region.
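To get a feel for the numbers, here is a rough back-of-the-envelope calculation using the round figures quoted above (about 500 nucleotides per reaction, and 15,000 nucleotides as a lower bound for the Coding Region). It assumes the reads sit end to end with no overlap, which is a simplification, so treat the result as a minimum. The function name `reactions_needed` is made up for this sketch.

```python
import math


def reactions_needed(region_length, read_length=500):
    """Minimum number of reads needed to cover a region, ignoring overlap."""
    return math.ceil(region_length / read_length)


print(reactions_needed(500))    # a region of about 500 nucleotides: 1
print(reactions_needed(15000))  # the Coding Region, at least: 30
```

So a region the size of HVR1 fits in roughly one reaction, while the Coding Region would need at least thirty.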
2. SNP Test Panels
The second method to detect mutations in the mtDNA is called “SNP” testing. Unlike DNA Sequencing, this technique does not read the entire length of DNA. Instead, it targets specific nucleotides. Only the nucleotides that provide useful information are tested, and all other nucleotides are ignored. This is the best method for efficiently testing large regions of DNA.
For example, if the presence of a mutation at location 16223 is an important indicator that someone does not belong to Haplogroup H, then the laboratory will pinpoint and specifically test location 16223 to see if a mutation exists in this location.
This is like “sharp shooting”. Instead of testing an entire region of the DNA, we are specifically targeting exact locations and nucleotides that are important for answering a specific question.
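The targeted approach can be pictured with a short sketch. It assumes a panel is simply a list of locations with the nucleotide expected at each one; the “T” at 16126 comes from the sample report earlier in this tutorial, while the value given for 16223 and the sample readings are assumptions made for illustration. The names `PANEL` and `run_panel` are hypothetical.

```python
# Expected nucleotide at each targeted location.
# 16126 "T" is from the sample report; the 16223 value is assumed for illustration.
PANEL = {16126: "T", 16223: "C"}


def run_panel(observed, panel=PANEL):
    """For each targeted location, report whether a mutation was found."""
    return {
        location: observed[location].upper() != expected
        for location, expected in panel.items()
    }


# Hypothetical readings: only the targeted locations are examined at all.
sample = {16126: "c", 16223: "C"}
print(run_panel(sample))  # {16126: True, 16223: False}
```

Notice that nothing outside the panel is ever read, which is what keeps this method efficient for large regions.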
SNP Test Panels are special handpicked panels of SNP markers which will answer a specific set of questions. For example, a mtDNA Haplogroup Backbone SNP Test Panel will examine all markers which will tell us which mtDNA Haplogroup an individual belongs to. We will talk more about specific SNP Test Panels later in this tutorial.
Next, we will discuss how mutations in our mtDNA allow us to trace our maternal ancestry.
We all have a unique pattern of SNP mutations in our mtDNA. Our SNP mutations can be used to trace our maternal ancestry in two ways: 1) Direct Comparisons, and 2) Haplogroup Determination. Let’s talk about each one in more detail.
1. Direct Comparisons:
By testing your mtDNA, you will discover the unique set of mutations that was passed down to you from your maternal ancestors along your direct maternal line. Your mtDNA “profile” is the unique set of mutations that you inherited from your own mother, and it is unique to your maternal ancestry. For example, all individuals living anywhere in the world today who are direct descendants of the same branch of the “haplogroup tree” as you will have the same mtDNA profile as you (i.e., you are linked through a common maternal ancestor). Likewise, if someone has a completely different mtDNA profile from yours, that means that he/she definitely did not descend from the same maternal line or haplogroup as you (i.e., you are not directly linked to the same haplogroup on your maternal line). Once you test your mtDNA markers, you can:
The more regions of your mtDNA that you test, the more precise the results of your comparison will be. Your mtDNA contains several regions, namely, the HVR1, HVR2 and coding regions. Later, we will discuss the different types of mtDNA tests, and talk about which regions are examined by each test type.
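A direct comparison boils down to checking which mutations two people share. The sketch below assumes each profile is stored as a set of entries like `16126 T>C`; both profiles are hypothetical and the function name `compare_profiles` is made up for the example.

```python
def compare_profiles(mine, theirs):
    """Summarize how two mtDNA profiles (sets of mutations) overlap."""
    return {
        "shared": sorted(mine & theirs),
        "only_mine": sorted(mine - theirs),
        "only_theirs": sorted(theirs - mine),
        "identical": mine == theirs,
    }


# Hypothetical profiles, for illustration only.
my_profile = {"16126 T>C", "16294 C>T"}
other_profile = {"16126 T>C"}

print(compare_profiles(my_profile, other_profile))
```

Keep in mind that two profiles can look identical simply because only a small region was tested, which is why testing more regions gives a more precise comparison.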
2. Haplogroup Determination:
The unique set of mutations that you carry in your mtDNA allows you to track your “deep ancestry”, i.e., your ancestry from tens of thousands of years ago, and discover your mtDNA haplogroup.
SNP Mutations are small “mistakes” that occur naturally in your DNA. SNP mutations are rare, occurring at a rate of approximately one mutation every few hundred generations. However, once a mutation occurs, it acts as a “time-and-date-stamp”, because it is passed on to all future generations. Each mutation event can be linked to a time and place in history, and by testing the mutations in your mtDNA, you can retrace the history of your ancient ancestors.
Let’s take a look at how mutations can allow us to trace the path of our ancestors using the following hypothetical example:
As you can see from this diagram, whenever a new “marker” occurs, it is passed down to all future generations. By studying all of the markers that an individual carries, we can tell the story behind each marker, i.e., when that marker first occurred and where it occurred. By knowing when and where each marker in your mtDNA occurred, we can trace the journey of your ancestors back in time.
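The idea that a marker is carried by everyone downstream of the person in whom it first appeared can also be shown with a toy family tree. Everything in this sketch is invented for illustration: the people, the marker names M1 to M3, and the function name `markers_carried`.

```python
# Each person maps to (mother, marker that first appeared in that person).
# A marker of None means no new marker arose in that person.
LINEAGE = {
    "founder": (None, None),
    "daughter_1": ("founder", "M1"),
    "daughter_2": ("founder", "M2"),
    "granddaughter": ("daughter_1", "M3"),
}


def markers_carried(person, lineage=LINEAGE):
    """Collect every marker on the maternal line, oldest first."""
    markers = []
    while person is not None:
        mother, new_marker = lineage[person]
        if new_marker is not None:
            markers.append(new_marker)
        person = mother
    return markers[::-1]


print(markers_carried("granddaughter"))  # ['M1', 'M3']
print(markers_carried("daughter_2"))     # ['M2']
```

Reading the list from left to right retraces the maternal line from the oldest marker to the most recent one.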
We will be going over some case studies using real mutations to see how DNA mutations are used to trace ancestry. However, in order to understand the mtDNA mutations in the examples, you will need to have a basic understanding of the “Cambridge Reference Sequence”. Next, we will go over the Cambridge Reference Sequence and the fundamental role that it plays in mtDNA markers.
Let's discuss the Cambridge Reference Sequence (aka CRS). The CRS is a fundamental part of mtDNA data analysis. A basic understanding of the CRS and how it is used in determining mutations will allow you to understand the role that mutations play in tracing your ancestry.
What is the CRS?
The CRS is the first human mtDNA that was ever fully sequenced and published. The work was performed by scientists at Cambridge University, and this groundbreaking study was officially published in 1981.
This publication represents the first time that the mtDNA was sequenced. The donor whose DNA was used for this ground-breaking project was of European descent and belonged to European mtDNA Haplogroup H.
Since this was the first mtDNA sequence ever published, it was thereafter treated as the “reference sequence” against which all further mtDNA sequences from labs around the world are compared. This original sequence eventually came to be known as the “Cambridge Reference Sequence”, and all mtDNA that is sequenced, even today, is compared to the CRS.
Mutations are determined based on comparison with CRS
When we state that we have mutations in our mtDNA, we are actually showing the regions of our DNA which differ from the CRS. Let’s take a look at an actual mutation report:
In this report, the HVR1 region was tested, and 6 mutations were detected, indicating that this individual’s HVR1 region differs from the CRS at 6 different locations. Let’s take a look at the first mutation in the list: 16126 T>c. This means that the individual’s mtDNA is different from CRS at location 16126. It shows that CRS has a “T” at this location, but the person tested has a “C”.
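Since a mutation is simply a place where your sequence differs from the CRS, producing a mutation list is a matter of lining the two sequences up and noting the mismatches. The sketch below assumes both sequences cover the same stretch and handles substitutions only. The ten-nucleotide stretch is invented for illustration (only the “T” at 16126 is taken from the report), and the function name `differences` is hypothetical.

```python
def differences(reference, sample, first_location=1):
    """List every location where the sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch handles substitutions only, so lengths must match")
    pairs = zip(reference.upper(), sample.upper())
    return [
        f"{first_location + index} {ref}>{obs}"
        for index, (ref, obs) in enumerate(pairs)
        if ref != obs
    ]


# Made-up stretch covering locations 16121 to 16130.
crs_stretch = "ACGGATCCAG"
my_stretch = "ACGGACCCAG"

print(differences(crs_stretch, my_stretch, first_location=16121))  # ['16126 T>C']
```

If the two stretches are identical, the list comes back empty, meaning no mutations were detected relative to the CRS.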
Let’s look at the same results based on the sequencing report:
All of the letters in “black” are the same as CRS. All of the nucleotides in “Red” are different from CRS and are considered “mutations”.
We are all compared to the CRS, not the earliest human mtDNA!
The key point to remember is that when the results of mtDNA testing are used for genealogical purposes, the results are compared to the CRS and mutations are reported as “differences” between the results and the CRS.
This, however, can lead to confusion for beginner genetic genealogists because instinctively, people usually think that when scientists look for mutations, they should be comparing our mtDNA to that of the earliest human DNA to see how our DNA has changed over time. However, that is not how the research community has decided to approach the mtDNA. The consensus within the scientific community is that mtDNA is always compared to CRS. Since this is the case, it is important for you to become familiar with how this “reverse” method is used to analyze our mutations and determine haplogroups.
The role of CRS in haplogroup determination:
Let’s take a look at the human mtDNA haplogroup tree. This is a phylogenetic tree which shows how all people living today descended from a common ancestor (Mitochondrial Eve) who lived in Africa over 150,000 years ago. Every person living today can trace his/her ancestry to a branch of this tree, called a “haplogroup”. The European individual whose mtDNA sequence is famously called the CRS is located at a distant branch of the tree as shown in the diagram below:
Where is the CRS located on the mtDNA haplogroup tree?
Now, let’s take a look at how your mtDNA haplogroup is determined.
To determine your mtDNA haplogroup, always start with the CRS and move away.
Example #1: If you HAVE mutations at locations 263 and 7028, and DO NOT have mutations at locations 14766 or 16067 or 16298, then you belong to Haplogroup HV:
Example #2: If you HAVE mutations 263, 7028, 14766, 73, 11251, 16126, and 16069, and DO NOT have a mutation at 16294, then you belong to Haplogroup J.
Example #3: If you HAVE mutations 263, 7028, 14766, 73, 11719, 12705, 16223, 10873, 2352, and 150, then you belong to Haplogroup L3e:
Summary of procedure for determining mtDNA Haplogroups: To determine your haplogroup, always start from the CRS and move backwards in the tree to see which mutations you have and which ones you do not have. Your haplogroup is determined by the difference between your markers versus CRS.
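The three examples above can be written down as simple “must have” and “must not have” lists. The sketch below encodes only those three rules, exactly as stated in the examples, and looks at locations only (not at which nucleotide was substituted). It is an illustration of the procedure, not a complete haplogroup tree, and the names `RULES` and `matching_haplogroups` are made up.

```python
# haplogroup: (locations that MUST be mutated, locations that must NOT be mutated)
RULES = {
    "HV": ({263, 7028}, {14766, 16067, 16298}),
    "J": ({263, 7028, 14766, 73, 11251, 16126, 16069}, {16294}),
    "L3e": ({263, 7028, 14766, 73, 11719, 12705, 16223, 10873, 2352, 150}, set()),
}


def matching_haplogroups(mutations):
    """Return the haplogroups whose rule is satisfied by a set of mutated locations."""
    return [
        name
        for name, (required, excluded) in RULES.items()
        if required <= mutations and not (excluded & mutations)
    ]


print(matching_haplogroups({263, 7028}))  # ['HV']
print(matching_haplogroups({263, 7028, 14766, 73, 11251, 16126, 16069}))  # ['J']
```

A person with no mutations at all matches none of these three rules, because every rule here requires at least locations 263 and 7028 to differ from the CRS.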
What if I don’t have any mutations? If you do not have any mutations, that means that your mtDNA sequence (at least the part that was tested) is exactly the same as CRS. CRS belongs to a branch of Haplogroup H, so if you belong to Haplogroup H, chances are that you will not have too many mutations in comparison with CRS.
This concludes the basic overview of the CRS.
Next, in Part 3 of this tutorial, we will review the different mtDNA test types available and what each test type will tell you.