Learn about Y-chromosome Haplogroup L

What is subclade testing?

Subclade testing can provide increased resolution of your placement on the Y-DNA phylogenetic tree.  Before your subclade can be determined, you must first know what haplogroup you fall into.  Haplogroups are defined by a unique mutation event such as a single nucleotide polymorphism, or SNP.  These SNPs mark the branch of a haplogroup, and indicate that all descendants of that haplogroup at one time shared a common ancestor.  The Y-DNA SNP mutation has been passed from father to son over thousands of years.  Over time, additional SNPs may occur within a haplogroup, leading to a new lineage.  These new lineages are considered subclades of the haplogroup.  Each time a new mutation occurs, there is a new branch in the haplogroup, and therefore a new subclade.  By testing for the presence of SNPs that have been identified as indicative of a subclade within a haplogroup, you can determine which specific subclade you belong to within your previously determined haplogroup.

Origin of Y-DNA Haplogroup L

Approximately 40,000 years ago a large population of hunters began a slow migration that led to their dispersal eastward from south-central Asia along the Eurasian steppe.  Migration of this Eurasian clan was eventually blocked by three mountain ranges (the Hindu Kush, the Tian Shan and the Himalayas) that met in the center of a region known as the Pamir Knot (see Figure 1), located in present-day Tajikistan.  Here the tribes of hunters split into two main groups; one group moved north into central Asia while the other moved south into Pakistan and the Indian subcontinent.  The group that moved south eventually settled the Indian subcontinent, and it is thought that one or more of the defining mutations for Haplogroup L (M11, M20, M22, M61, M185, M295) originated within this population about 30,000 years ago.  However, Haplogroup L has also been tentatively associated with the expansion of farming; if this hypothesis is valid, it would imply a non-Indian origin of Haplogroup L (Qamar et al. 2002).  Figure 2 illustrates the location of Haplogroup L in the Y-chromosome phylogenetic tree.


Figure 1.  Illustration of the migration followed from Y-chromosomal Adam to the origin of Haplogroup L in India.  The “X” on the map indicates the proposed origin of the mutation(s) that define Haplogroup L.


Figure 2.  Y-DNA phylogenetic tree starting with Y-chromosomal Adam and showing all 20 haplogroups labelled A to T.  Haplogroup L is indicated with the blue circle.


Geographical Distribution of Y-DNA Haplogroup L

Most Haplogroup L lineages are found within the Indian subcontinent, and together with haplogroups H and R2 they account for more than one-third of Indian Y-chromosomes (Kivisild et al. 2003a).  Haplogroup L also occurs in the Middle East, and at low frequencies in Central Asia, along the Mediterranean coast of Europe, and in western Europe (Figure 3).  Table 1 summarizes the distribution and frequency of the L haplogroup.


Figure 3.  Geographical distribution of Haplogroup L.  The black pies indicate the frequency of the haplogroup in the area in which it is located on the map.  It is clear that Haplogroup L is most widespread and frequent in India.


Table 1. Estimated frequencies of Haplogroup L in various countries and social groups.



Haplogroup L in India

Due to its geographical positioning at the crossroads of Africa, the Pacific, and West and East Eurasia, India has served as a major corridor for the dispersal of humans out of Africa.  There were five possible waves of gene flow to India that have contributed to present-day genetic diversity: 1) Ancient Paleolithic migration during initial colonization of Eurasia from 25,000 to 60,000 years ago; 2) Early Neolithic migration approximately 10,000 years ago, probably of proto-Dravidian speakers from the eastern horn of the Fertile Crescent; 3) Arrival of Indo-European speakers approximately 3500 years ago; 4) Further dispersal of Austro-Asiatic and Tibeto-Burman speakers with ties to east-southeast Asia; and 5) Recent gene flow from conquerors from central Asia and European colonizers.  All of these events have led to the immense cultural, linguistic and genetic diversity in present-day India, but have also complicated the ability of scientists to unravel the genetic history of these populations.  Typically, genetic variation is correlated with geography.  However, genetic variation may be associated with other factors if they had a greater influence than geography on the dispersal of human populations.  Factors that may have been important in the distribution of groups include language and religion.  A deeper analysis of the L haplogroup through subclade testing will be very useful for understanding not only the basis of genetic diversity of Indian populations, but also the origins of the complex social structure within India.

The populations of India can be classified as either tribal or nontribal.  The tribal groups constitute approximately 8% of the Indian population, and are thought to represent the original inhabitants of India, arriving before Indo-European speakers.  The tribal communities speak more than 750 different dialects that can be classified into one of three language families: Austro-Asiatic, Dravidian, and Tibeto-Burman.  Nontribals typically speak languages belonging to the Dravidian or Indo-European families.

Maternal lineages of present-day Indians seem to have derived from eastern Asian populations, whereas paternal lineages show more variation among populations and may stem from eastern European or western Asian populations.  Previous studies have concluded that Indian caste and tribal Y-chromosomes derive from the same Pleistocene genetic heritage, with only limited and recent gene flow from external sources (Kivisild et al. 2003).  However, this theory implies that paternal lineages of caste groups originated within India, and this agrees neither with non-genetic data nor with previous studies that have detected genetic differentiation between caste and tribal groups (Basu et al. 2003).  Cordaux et al. (2004) tested whether Indian caste paternal lineages are derived from local ancestors (the tribal groups) or from other Eurasian sources, and they found several lines of evidence to suggest that caste populations were derived from a recent and rapid spread of Asian Y chromosomes over the Indian subcontinent, possibly from the Indo-European speaking pastoralists that migrated from central Asia 3500 years ago.  First, they found that Haplogroup L is one of the four most frequently detected Y lineages in caste groups, and occurs at a significantly greater frequency in caste groups relative to tribal groups.  Second, Indian caste groups were found to be more similar to central Asians than to Indian tribal or other Eurasian groups.  Since caste groups developed only within the last 3500 years, this is considered too short a period to produce such large differences in the Y-chromosome if caste groups were derived from tribal groups.  Interestingly, in the north there are four caste groups, but in the south a fifth class was introduced to integrate the local people into the caste system (they were formerly referred to as “untouchables”).  In the south, lower caste groups are more similar to Asians whereas higher caste groups are more similar to Eurasians.

Because analysis of mtDNA suggests a common maternal origin of tribal and caste groups, Cordaux et al. (2004) developed the hypothesis that it was predominantly male Indo-Europeans that migrated to India and then mated with local tribal females.  Alternatively, a lack of Indo-European mtDNA may reflect hypergyny (high ranking caste males mating with lower ranking tribal females) or female infanticide.  Both of these activities would have served to reduce the mtDNA contribution of the high-ranking Indo-European females.

However, Sahoo et al. (2006) disagree with these conclusions.  They found that Indo-European and Dravidian speaking populations both have a high frequency of L1, and they did not find consistent differences between caste and tribal pools.  They conclude that a more parsimonious explanation is a pre-Indo-European, pre-Neolithic presence of the L haplogroup in India.  The absence of L lineages within Indo-European speakers from Bihar, Orissa and West Bengal supports their conclusion that Y haplogroups in India are associated primarily with geography and not with linguistic or cultural determinants.

Haplogroup L in the Middle East

The L haplogroup is also detected in Turkey, with the highest occurrence in the eastern region (Cinnioglu et al. 2004).  The Turkish L lineages lack the M27 mutation that characterizes Indian and Pakistani L lineages.  The Armenian haplotype (based on six STRs) seems to match the most common Turkish counterparts, although this has not been verified with M27 data.  Differences in the modal haplotype of L between the Caucasus (Turkey) and India suggest independent expansions from two distinct founder populations.

Haplogroup L is common and widespread in Pakistan and is found in approximately 14% of the population (Qamar et al. 2002).  While the L lineage is one of the top five most common haplogroups in Pakistan, its distribution differs from the other four.  Qamar et al. (2002) estimated the time to most recent common ancestor to be 4,000 to 14,000 years ago.  The spread of this lineage might therefore have been associated with the local expansion of farmers, since the estimated age corresponds to the early Neolithic period (the start of cultivation, animal domestication, and cultural evolution).

There is also evidence of a Parsi-specific lineage of Haplogroup L (occurrence of L in Parsis is 18%).  The name Parsi means “from Iran”, and as the name suggests, the Parsis are followers of the Iranian prophet Zoroaster who migrated to India after the collapse of the Sassanian empire in the 7th century AD.  The Parsis settled in Gujarat, India and later migrated to Mumbai, India and Karachi, Pakistan.  The estimated time to most recent common ancestor for the Parsi-specific lineage of Haplogroup L is 600 to 4500 years ago.  This estimate is consistent with the migration of a small number of lineages from Iran and gene flow from the surrounding area.


The subclades of Y-DNA Haplogroup L

Currently, Haplogroup L shows a total of seven separate lineages: L*, L1, L2*, L2a, L2b, L3*, and L3a.  Since few studies have included subclade testing for Haplogroup L, there is currently limited information about the distinct lineages within this haplogroup.  The upside is that any new information on these subclades may play an important role in advancing understanding of the genetic and social history of the populations that settled India.  The information that is known about the specific subclades is summarized below in Table 2.

In 2006, a new SNP, PK3, was found to define subclade L3a, and this has resulted in some interesting data.  Within Pakistan, this subclade is found solely within the Kalash population located in remote valleys of the Hindu Kush Mountains in north Pakistan.  This population has been found to cluster with the Yadhavas, a Dravidian speaking group from south India.  Does this indicate shared European ancestry?  Possibly.  Examination of variation within the Y-chromosome provided an estimated time to most recent common ancestor of 1400 to 8100 years before present, coinciding with the putative invasion of the Indo-Pak subcontinent by Indo-European speaking groups from Central Asia.  Additional tests of the L3a subclade in India will help to resolve this story, and further in-depth analysis of all the Haplogroup L subclades will also help to improve understanding of the early dispersal patterns of humans into the Middle East and India.

Table 2.  Summary of the information currently known for the subclades of Haplogroup L.

SNP | Haplogroup L subclade | Comments
M20 | L* | Mostly found in Pakistan (13.1%; Sengupta et al. 2006).  L* is also found in Lebanon, where it is highest in the Druze, followed by Christian and then Muslim populations (Zalloua et al. 2008).
M76 | L1 | L1 is the most common subclade in India (6.3%; Sengupta et al. 2006).  This subclade likely arose within the boundaries of present-day India.  Current data suggest that L1 underwent early diversification in South India, then expanded toward the peripheral regions of the country.  The most recent estimate is that the expansion time of this subclade spans at least the early Holocene, and is therefore well before the Neolithic and the spread of farming.  Similarities in phylogeography and microsatellite variation with Haplogroups R1a1 and R2 suggest a common demographic history.  Seems to be more predominant in Dravidian speakers, although this is not a consistent observation.
M317 | L2* | Found to occur in southern Europe and Anatolia.
M349 | L2a | This subclade is sometimes called "Mediterranean" as it occurs in southern Europe from Portugal to Turkey.
M274 | L2b | Details of this subclade will be added once they become available.
M357 | L3* | Occurs with intermediate frequency in Pakistan, in the Burusho and Pashtun populations.
PK3 | L3a | In Pakistan, found solely in the Kalash population (23%; Mohyuddin et al. 2006).



How the Subclades of Y-DNA Haplogroup L are determined

1. Obtain results from a Y-DNA STR test to predict your haplogroup.
2. Confirm your haplogroup with a Y-DNA Haplogroup Backbone SNP test.  You should be positive for M11, the SNP that is used to confirm Haplogroup L in the Y-DNA Haplogroup Backbone SNP Test panel.
3. Once your haplogroup has been confirmed as L, you can then obtain the Y-DNA Haplogroup L Subclade Test.  Table 3 provides a list of the seven SNP markers used in this panel, including the location of the SNP, the specific mutation, and the subclade that is defined by each SNP.
4. Identify the location of your SNPs on the phylogenetic tree to determine your subclade.  Figure 4 illustrates a step-by-step procedure to determine your subclade, and can be downloaded and printed for your use.


Table 3.  The SNP markers that are included in the Y-DNA Haplogroup L Subclade Test panel with information on the location of the SNP, the specific mutation, and the subclade that each SNP identifies.

Location of SNP | Mutation | Haplogroup L subclade
M20 | A>G | L
M76 | T>G | L1
M274 | C>T | L2b
M317 | -GA | L2
M349 | G>T | L2a
M357 | C>A | L3
PK3 | T>C | L3a
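
The mutation column of Table 3 can be read mechanically.  The sketch below is only an illustration, not the laboratory's method: it assumes that "A>G" means an ancestral A replaced by a derived G, and that "-GA" denotes a deletion of the bases GA.  The names PANEL, parse_mutation and is_derived are hypothetical.

```python
# Marker -> mutation, copied from Table 3.
PANEL = {
    "M20": "A>G", "M76": "T>G", "M274": "C>T", "M317": "-GA",
    "M349": "G>T", "M357": "C>A", "PK3": "T>C",
}

def parse_mutation(notation):
    """Split a Table 3 mutation string into ancestral and derived states.

    Assumption: a leading "-" marks a deletion of the bases that follow.
    """
    if notation.startswith("-"):
        return {"kind": "deletion", "ancestral": notation[1:], "derived": ""}
    ancestral, derived = notation.split(">")
    return {"kind": "substitution", "ancestral": ancestral, "derived": derived}

def is_derived(marker, observed):
    """Return True if the observed state is the derived (mutated) one."""
    mutation = parse_mutation(PANEL[marker])
    if observed == mutation["derived"]:
        return True
    if observed == mutation["ancestral"]:
        return False
    raise ValueError(f"{observed!r} is neither state listed for {marker}")

print(is_derived("M20", "G"))  # derived G at M20
print(is_derived("M76", "T"))  # ancestral T at M76
```

A person who carries the derived state at a marker is said to "have" that SNP in the sense used throughout this tutorial.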




Figure 4.  This flowchart will walk you through the process of using your SNP data to determine the subclade that you belong to.  To begin the process, start at the decision with the red circle.  Do you have the M20 SNP?  If you have already completed the Y-DNA Haplogroup Backbone test and have been confirmed as Haplogroup L, you will have this mutation.  If you do not have the M20 SNP, you are not part of Haplogroup L.  The next SNP to check is M76.  If you have this mutation, you are classified into subclade L1.  If you lack M76, determine whether you have SNP M317.  If you have M317, we know that you are in subclade L2, but there are three possibilities within this subclade.  If you possess SNP M274, you are in subclade L2b, whereas if you have M349, you are in subclade L2a.  If you don’t possess any additional SNP mutations at this point, you are considered part of subclade L2*.  Let’s follow the path if you did not possess the M317 SNP.  There are two more SNPs that are tested in this panel.  If you carry M357, you are part of subclade L3.  Possession of SNP PK3 assigns you to subclade L3a, whereas if you do not have this mutation you are part of subclade L3*.  If you do not have M357, you are classified as subclade L*.
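
The decision path described in the Figure 4 caption can also be written as a short function.  This is a minimal sketch that follows the caption step by step; the function name classify_l_subclade and the choice to represent test results as a set of positive SNP names are assumptions made for illustration.

```python
def classify_l_subclade(positive_snps):
    """Walk the Figure 4 decisions for a collection of positive SNP names.

    Returns the subclade label, or None if M20 is absent
    (that is, the person is not part of Haplogroup L).
    """
    snps = set(positive_snps)
    if "M20" not in snps:
        return None
    if "M76" in snps:
        return "L1"
    if "M317" in snps:
        # Three possibilities within L2.
        if "M274" in snps:
            return "L2b"
        if "M349" in snps:
            return "L2a"
        return "L2*"
    if "M357" in snps:
        # Two possibilities within L3.
        return "L3a" if "PK3" in snps else "L3*"
    return "L*"

print(classify_l_subclade({"M20", "M317", "M349"}))  # L2a
print(classify_l_subclade({"M20", "M357"}))          # L3*
print(classify_l_subclade({"M20"}))                  # L*
```

The checks are made in the same order as in the caption, so a result set that is positive for markers on two different branches would be resolved by whichever branch is tested first; the caption does not say how such a conflict should be handled.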


Geographical Distribution of the Subclades of Y-DNA Haplogroup L

Very little is currently known about the frequency and distribution of the Haplogroup L subclades.  Part of the reason for this lack of information is that most population assessments were conducted before the SNP markers that define the discrete subclades within this haplogroup were discovered.  Now that it is possible to detect a more refined placement within the haplogroup, there will no doubt be new research shortly that we can compile and add to Table 2 (the table has been included again below for ease of reference).

Table 2.  Summary of the information currently known for the subclades of Haplogroup L.

SNP | Haplogroup L subclade | Comments
M20 | L* | Mostly found in Pakistan (13.1%; Sengupta et al. 2006).  L* is also found in Lebanon, where it is highest in the Druze, followed by Christian and then Muslim populations (Zalloua et al. 2008).
M76 | L1 | L1 is the most common subclade in India (6.3%; Sengupta et al. 2006).  This subclade likely arose within the boundaries of present-day India.  Current data suggest that L1 underwent early diversification in South India, then expanded toward the peripheral regions of the country.  The most recent estimate is that the expansion time of this subclade spans at least the early Holocene, and is therefore well before the Neolithic and the spread of farming.  Similarities in phylogeography and microsatellite variation with Haplogroups R1a1 and R2 suggest a common demographic history.  Seems to be more predominant in Dravidian speakers, although this is not a consistent observation.
M317 | L2* | Found to occur in southern Europe and Anatolia.
M349 | L2a | This subclade is sometimes called "Mediterranean" as it occurs in southern Europe from Portugal to Turkey.
M274 | L2b | Details of this subclade will be added once they become available.
M357 | L3* | Occurs with intermediate frequency in Pakistan, in the Burusho and Pashtun populations.
PK3 | L3a | In Pakistan, found solely in the Kalash population (23%; Mohyuddin et al. 2006).




Phylogenetic Tree for the Subclades of Y-DNA Haplogroup L

A total of seven distinct subclades have currently been detected within Haplogroup L.  Figure 5 illustrates the phylogenetic tree of Haplogroup L with the subclades.  For an illustration of how Haplogroup L fits into the entire Y-chromosome phylogenetic tree, you can refer back to Figure 2.


Figure 5.  Phylogenetic tree indicating the subclades of Haplogroup L.  The SNP markers that define each subclade are indicated on the tree.  As the legend states, the M11 marker (indicated in green) is used by the Y-DNA Haplogroup Backbone Test to confirm that you are a member of Haplogroup L, whereas the markers in yellow are those that are included in the Y-DNA Haplogroup L Subclade Test panel.
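
For readers who prefer to see the tree as data, the sketch below encodes the branches using the parent and SNP assignments given in Table 3 and in the Figure 4 caption.  The names TREE and expected_snps are hypothetical, and M11 (the marker used by the backbone test) is left out because it is not part of the subclade panel.

```python
# Each subclade maps to (parent subclade, defining SNP from Table 3).
TREE = {
    "L":   (None, "M20"),
    "L1":  ("L",  "M76"),
    "L2":  ("L",  "M317"),
    "L2a": ("L2", "M349"),
    "L2b": ("L2", "M274"),
    "L3":  ("L",  "M357"),
    "L3a": ("L3", "PK3"),
}

def expected_snps(subclade):
    """List the panel SNPs expected in a member of the given subclade,
    ordered from the root of Haplogroup L down to that subclade."""
    # A starred label such as L2* means "in L2 but in none of its known
    # sub-branches", so it carries the same panel SNPs as L2 itself.
    name = subclade.rstrip("*")
    path = []
    while name is not None:
        parent, snp = TREE[name]
        path.append(snp)
        name = parent
    return path[::-1]

print(expected_snps("L2a"))  # ['M20', 'M317', 'M349']
print(expected_snps("L3*"))  # ['M20', 'M357']
```

Storing only the parent of each subclade keeps the structure small, and a newly described subclade could be added as a single entry without touching the others.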


Resources

Bamshad et al. (2001) Genetic evidence on the origins of Indian caste populations. Genome Research 11:994-1004.

Basu et al. (2003) Ethnic India: A genomic view with special reference to peopling and structure. Genome Research 13:2277-2290.

Behar et al. (2004) Contrasting patterns of Y chromosome variation in Ashkenazi Jewish and host non-Jewish European populations. Human Genetics 114:354-365.

Cinnioglu et al. (2004) Excavating Y-chromosome haplotype in Anatolia. Human Genetics 114:127-148.

Cordaux et al. (2004) Independent origins of Indian caste and tribal paternal lineages. Current Biology 14:231-235.

Deng et al. (2004) Evolution and migration history of the Chinese population inferred from Chinese Y-chromosome evidence. Journal of Human Genetics 49:339-348.

Firasat et al. (2007) Y-chromosomal evidence for a limited Greek contribution to the Pathan population of Pakistan. European Journal of Human Genetics 15:121-126.

Karafet et al. (2008) New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Research DOI: 10.1101/gr.7172008

Kivisild et al. (2003a) The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. American Journal of Human Genetics 72:313-332.

Kivisild et al. (2003b) The genetics of language and farming spread in India. In Examining the Farming / Language Dispersal Hypothesis.

Mohyuddin et al. (2006) Detection of novel Y SNPs provides further insights into Y chromosomal variation in Pakistan. Journal of Human Genetics 51:375-378.

Nebel et al. (2001) The Y-chromosome pool of Jews as part of the genetic landscape of the Middle East. The American Journal of Human Genetics 69:1095-1112.

Qamar et al. (2002) Y-chromosomal DNA variation in Pakistan. American Journal of Human Genetics 70:1107-1124.

Ramana et al. (2001) Y-chromosome SNP haplotypes suggest evidence of gene flow among caste, tribe, and the migrant Siddi populations of Andhra Pradesh, South India.  European Journal of Human Genetics 9:695-700.

Regueiro et al. (2006) Iran: Tricontinental nexus for Y-chromosome driven migration. Human Heredity 61:132-143.

Sahoo et al. (2006) A prehistory of Indian Y chromosomes: Evaluating demic diffusion scenarios. Proceedings of the National Academy of Sciences 103:843-848.

Sengupta et al. (2006) Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of central Asian pastoralists. American Journal of Human Genetics 78:202-221.

Shet et al. (2004) Reconstruction of patrilineages and matrilineages of Samaritans and other Israel populations from Y-chromosome and mitochondrial DNA sequence variation. Human Mutation 24:248-260.

Shlush et al. (2008) The Druze: A population genetic refugium of the Near East. PLOS One 3:e2105.

Spencer-Wells et al. (2001) The Eurasian Heartland: a continental perspective on Y-chromosome diversity. Proceedings of the National Academy of Sciences 98:10244-10249.

Thamseem et al. (2006) Genetic affinities among the lower castes and tribal groups of India: inference from Y chromosome and mitochondrial DNA. BMC Genetics 7:42.

Thangaraj et al. (2003) Genetic affinities of the Andaman Islanders, and vanishing human population. Current Biology 13:86-93.

Underhill et al. (2001) The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Annals of Human Genetics 65:43-62.

Weale et al. (2001) Armenian Y-chromosomal haplotypes reveal strong regional structure within a single ethno-national group. Human Genetics 109:659-674.

Zalloua et al. (2008) Y-chromosomal diversity in Lebanon is structured by recent historical events. American Journal of Human Genetics 82:873-882.


 

Need to cite this tutorial in your essay, paper or website? Use the following format:

Learn about Y-chromosome Haplogroup L. Genebase Tutorials. Retrieved September 18, 2014, from http://www.genebase.com/learning/article/13