Learn about Y-DNA Haplogroup R

Publications
by Genebase Users

Y-DNA Haplogroup R, “Spanning the Seven Seas”

All males can trace their Y-DNA lineage back to a theoretical Y-DNA prototype, which originated in Africa and is thought to have migrated out of Africa over 60,000 years ago (60kya).  Although the Y-DNA is usually inherited from father to son without any changes, occasionally differences arise via mutations.  Since these mutations or variations accumulate over generations, a greater number of differences between two DNA samples indicates more elapsed time and greater genetic distance.  Armed with information about the rates, types and number of variations, we can create lineage maps or phylogenetic trees and make calculated estimates to trace our roots back to our forebears (e.g. to find the time to most recent common ancestor, TMRCA, aka coalescence).
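As a rough illustration of how counted differences translate into elapsed time, the sketch below applies a simple approximation: two lineages accumulate mutations independently, so the expected number of differences after t generations is about 2 × rate × markers × t. The mutation rate, the generation length and the example counts are assumptions chosen for illustration only, not values from Genebase, and real TMRCA estimates use more elaborate models.

```python
def estimate_tmrca_generations(mismatches, markers_compared, mutation_rate):
    """Rough TMRCA (in generations) between two Y-DNA haplotypes.

    Both lineages mutate independently, so the expected number of
    differences after t generations is 2 * mutation_rate * markers * t.
    Solving for t gives the estimate returned here.
    """
    if markers_compared <= 0 or mutation_rate <= 0:
        raise ValueError("markers_compared and mutation_rate must be positive")
    return mismatches / (2 * mutation_rate * markers_compared)


# Hypothetical inputs: 3 differences across 20 STR markers.
ASSUMED_RATE = 0.002               # assumed mutations per marker per generation
ASSUMED_YEARS_PER_GENERATION = 25  # assumed generation length

generations = estimate_tmrca_generations(3, 20, ASSUMED_RATE)
years = generations * ASSUMED_YEARS_PER_GENERATION
print(f"about {generations:.1f} generations, or roughly {years:.0f} years")
```

The estimate scales directly with the assumed mutation rate, so halving the rate doubles the estimated time.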


Figure 1.  Figure of Trees.  The first figure is an African acacia tree silhouette and symbolically shows the root of human males in Africa with their subsequent variation and flourishing in the rest of the world.  Next to this is a figure of the major Y-DNA haplogroups, with their phylogenetic relationships.



Ancestral Markers

The Y-DNA contains two main types of ancestral markers:

  1. SNPs (single nucleotide polymorphisms). SNPs are a change in a single nucleotide in a chromosome and occur infrequently; once they occur they are stable and typically define a whole chromosome and become its signature.
  2. STRs (short tandem repeats, aka microsatellites) (see Figure 2). STRs change by the number of repeats and change at a much faster rate than SNPs.

By testing the combination of SNPs and STRs in our Y-DNA, we can gain information on our paternal ancestry, ranging from ancient history (thousands and tens of thousands of years ago) with the much slower mutating SNPs, to recent history (100-1000 years ago) with faster mutating STRs.  More simply, SNPs allow us to track ancient or deep ancestry, while STRs allow us to track recent ancestry in the range of immediate family history over several generations and the relatively modern use of surnames. (see Figure 3). 
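To make the idea of an STR allele concrete, the following sketch counts how many times a short motif is repeated back to back in a DNA string. The motif `GATA`, the flanking sequences and the two example alleles are invented for illustration and do not correspond to a specific marker in the Genebase test.

```python
def longest_tandem_run(sequence, motif):
    """Return the largest number of consecutive copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Count copies of the motif that follow one another without a gap.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Hypothetical alleles: the same locus with 10 and 11 repeats of GATA.
allele_a = "CCTG" + "GATA" * 10 + "TTCA"
allele_b = "CCTG" + "GATA" * 11 + "TTCA"

print(longest_tandem_run(allele_a, "GATA"))
print(longest_tandem_run(allele_b, "GATA"))
```

An STR result is reported as this repeat count, so the two alleles here would differ by one step at the marker.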


Figure 2.  The Y-DNA is shown as a cytogenetic banded chromosome or ideogram.  The shorter p-arm and a portion of the longer q-arm of the Y-DNA harbor most of the variation as indicated below.  The centromere constriction divides the two arms. 




Figure 3.  A schematic timeline is shown with estimations of ancestry determined through STR and SNP variation shown above.   BC/AD marks the division at the ‘common era’ beginning ~2,000 years ago (2kya).



Y-DNA Haplogroups

A haplogroup is a group or population derived from a common ancestor.  Y-DNA Haplogroups are defined by slowly evolving SNPs, and each SNP identifies a particular paternal haplogroup or branch of the Y-DNA phylogenetic tree.  (Note: mtDNA SNPs are used to determine haplogroups for maternal lineages). 

By contrast, the faster changing STRs are employed to determine haplotypes for the Y-DNA, where haplotypes are defined as a collection of variations in STR markers observed on the Y-DNA and can be thought of as a signature, one which tracks more recent genetic history.  Frequent haplotypes, commonly known as "modal haplotypes" can often be associated with defined populations and geographical regions, and can be informative or predictive of haplogroups that also show geographical preferences.  For example, from your haplotype determined through the Genebase Y-DNA STR Marker Test, you may already have a prediction of deeper genetic origins and a prediction of your Y-DNA haplogroup.            
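A modal haplotype can be thought of as the most common repeat count at each marker across a group of samples. The sketch below computes one and measures how far an individual haplotype lies from it. The marker names follow the usual DYS naming style, but the repeat values and the four samples are hypothetical, and the step-count distance is one simple choice among several.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most frequent repeat count at each STR marker."""
    modal = {}
    for marker in haplotypes[0]:
        counts = Counter(h[marker] for h in haplotypes)
        modal[marker] = counts.most_common(1)[0][0]
    return modal


def genetic_distance(a, b):
    """Sum of repeat-count differences over the markers in haplotype a."""
    return sum(abs(a[marker] - b[marker]) for marker in a)


# Hypothetical STR results for four men from one population.
samples = [
    {"DYS19": 14, "DYS390": 24, "DYS391": 11},
    {"DYS19": 14, "DYS390": 24, "DYS391": 10},
    {"DYS19": 14, "DYS390": 23, "DYS391": 11},
    {"DYS19": 15, "DYS390": 24, "DYS391": 11},
]

modal = modal_haplotype(samples)
print(modal)
print(genetic_distance(samples[1], modal))
```

A haplotype that sits at or near a population's modal haplotype is what allows a haplogroup to be predicted before any SNP is tested.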

There are 20 major Y-DNA haplogroups (designated by the letters A through T) stemming in a branching fashion from the Y-DNA prototype, aka “Y-DNA Adam” (haplogroup A), which may be seen as the root or trunk of the tree (see Figure 4).  Each branch and haplogroup after “Y-DNA Adam” is defined by a novel SNP or genetic change.  The Genebase Y-DNA backbone SNP Test Panel is used to determine Y-DNA haplogroups and additional panels are available to further resolve Y-DNA lineage into sub-haplogroups or subclades.


Figure 4.  This phylogenetic tree is presented in a typical inverted fashion, with the A-T haplogroups defined by SNP markers as they branch from the root (enter).  In order to identify your personal haplogroup, simply follow the branches from the ‘enter’ point with the SNPs identified, here exemplified in green for Haplogroup R.        



The Haplogroup R story …so far

Early origins
Y-DNA Haplogroup R is perhaps the most prominent Y-DNA lineage on Earth today.  It is the pre-eminent Y haplogroup in Europe, the U.S. and India. While the exact location for the origin of Haplogroup R is under debate, its general placement in West-Central Asia was followed by expansion in all directions, stretching to the ‘Seven Seas’.  The date for its origin is in the Paleolithic Era, 35-40kya, and populations harboring this Y-DNA variant survived the Ice Ages to repopulate much of Europe and Asia.  Migration episodes included excursions to Africa, South East Asia and in recent eras, the invasion of and immigration to the Americas by West European nations such as Spain, England and Ireland where the levels of Haplogroup R are near 90%.  The Atlantic Modal Haplotype, a common haplotype on both sides of the Atlantic Ocean, is also linked to Haplogroup R.   


Overall worldwide frequency and distribution of Y-DNA Haplogroup R:


Figure 5.  Overall worldwide frequency and distribution of Y-DNA Haplogroup R. Its presence in North America reflects the well known slave trade from the 17th to 19th centuries and it is not present in the indigenous populations in the Americas.  It is also generally absent from Africa, except for marginal levels in the Northeast and an interesting pocket of abundance in North Cameroon populations.  


Y-DNA haplogroup R may be the most numerous Y-DNA in the world today.  It is the most prominent haplogroup in Europe at ~50% and the United States at ~42%.  The origin of Haplogroup R is located in Central to West Asia, although a precise region has not yet been determined.  The origin of Haplogroup R dates to 30-35kya in the Paleolithic Era and Pleistocene Epoch.  Its entry into Europe at this point coincides with the spread of the Aurignacian culture across Eurasia.  Haplogroup R is further linked with the spread of proto-Indo-European languages that took hold as the early languages in large portions of Europe and Asia.   

Glacial advances forced retreats of humans, including populations with Haplogroup R, into refugia: one in Iberia (the R1b subclade) and one in Ukraine (forerunner R1a subclade).  The expansion of both populations occurred after the last glacial maximum (LGM) ~18kya.  The R1b branch is the most abundant European haplogroup and is notably prominent in West Europe (levels around 90% in the British Isles), while R1a is more prevalent in East Europe and neighboring West Asia.  The complementing pattern of R1b in the west and R1a in the east covers all of Europe and much of Asia.   The overall trend for Haplogroup R is highest frequencies in Northwest Europe, decreasing in a southeastern direction.  Both main branches are found in Asia and can be found in Egypt, providing another example of genealogical backflow from Asia into Africa.   

R1b (~30kya) was founded earlier than R1a (10-15kya).  R1a has been associated with the earliest Indo-European language populations and an early Asian culture, the Kurgan culture.    Haplogroup R1a has also spread to the east and reached Southeast Asia and Australia, albeit at modest levels (≤10%).  A competing hypothesis places the origin of R1a in India, and at this point there is no conclusive evidence to pinpoint an exact origin.  

The R1b refugium is hypothesized to have been located in the Iberian Peninsula, possibly in the Franco-Cantabrian region.  R1b1b is the most frequent subclade in this branch of Haplogroup R and has spread from Western Europe through imperial colonization to the New World, e.g. the Americas.  Most R1b1 populations have since been sub-classified as R1b1b2 (M269). 


Figure 6. The emergence of modern humans.  A schematic timeline is shown with the approximate appearance of Homo sapiens, with particular attention to the estimated origins of different ancestral Y-DNA haplogroups.  Below the timeline is shown key geological and anthropological events.   kya = thousand years ago, hg = haplogroup, LGM = last glacial maximum.

There is another independent Haplogroup R sub-lineage designated R2.  R2 is likely to have originated (~25kya) in South Asia, around India/Pakistan.  The R2 subclade is highest in East India (50-60%) and Sri Lanka (75%).  Some R2 has been observed in the Caucasus and Central Asia (e.g. Nepal), but it has not spread beyond these regions, except with a Gypsy population (Sinte Romani), which likely originated in India.  There is a relatively high level among Kurds in Georgia (44% of Kurmanji), which is likely an outcome of a bottleneck and genetic drift.


BOX 1.  Famous R people
There are two examples of founders of large contemporary R subclade populations – a dynasty phenomenon like that of Genghis Khan in Asia who has been reported to have over 15 million descendants living today and is the most prolific patriarch yet described.

One example is Niall Noigíallach, a 5th-century High King (warlord) of Ireland, who was also known as ‘Niall of the Nine Hostages’.  He was very probably a member of the R1b1b2 subclade (SNP marker M269) and possibly of the downstream R1b1b2a2e/M222 subclade.  Millions of present-day descendants of Niall, the Uí Néill, can be attributed to this patriarch.  This Gaelic population also bears a high frequency of the Irish Modal Haplotype (IMH).  The common ancestor was dated to ~500-1000AD, which is consistent with the appearance of surnames attributed to the Uí Néill clans.
The second instance occurred in 12th-century Scotland, with Somerled of Argyll (ruled 1158-1164AD). Somerled, who was known as the King of the Hebrides, united a group of Norse/Vikings and the Scots (the new Norse-Gaels) from competing Viking clans, keeping his kingdom separate from other prevailing powers in Norway and Scotland.  It has been suggested that Somerled was the patriarch of half a million descendants, who carry the R1a subclade.  This provides a link for the shared genetic ancestry between Scandinavia and Scotland, particularly the Scandinavian influence on the Isle of Man and the Scottish Isles: Orkney, Shetland and the Hebrides.
 



The frequency and distribution of Haplogroup R in Europe:


Figure 7. A map illustrating the frequency and distribution of Haplogroup R in Europe.  The frequency of haplogroup R is shown as the blue portion of the pie charts distributed over different locations.  The highest concentration of this haplogroup is in the Basque region that spans the border between Spain and France and in the Scottish Islands.    


The frequency and distribution of Haplogroup R in West Asia and the Middle East:


Figure 8. A map illustrating the frequency and distribution of Haplogroup R in West Asia and the Middle East.  The portion of the pie charts colored in blue represents the fraction of Haplogroup R among Y-DNA in different regions.  The frequency of Haplogroup R declines in the Levant Corridor and the Arabian Peninsula.   


The frequency and distribution of Haplogroup R in Central and East Asia:


Figure 9. A map illustrating the frequency and distribution of Haplogroup R in Central and East Asia.  The portion of the pie charts colored in blue represents the fraction of Haplogroup R among Y-DNA in different regions.   The frequency of Haplogroup R remains high in South Central Asia (e.g. India) but the levels are considerably lower elsewhere in Asia.   


The frequency and distribution of Haplogroup R in Africa:

Figure 10. A map illustrating the frequency and distribution of Haplogroup R in Africa.  The portion of the pie charts colored in blue represents the fraction of Haplogroup R among Y-DNA in different regions.   The level of Haplogroup R is generally low in Africa (10% or much less) and found mostly in the Northeast.  The interesting exception is in North Cameroon with levels ~75%, while surrounding regions lack this haplogroup.     


The movements of human populations bearing Haplogroup R:


Figure 11.  A map depicting the movements of human populations bearing Haplogroup R.   The origin of haplogroup R has not been pinpointed, although most evidence leads to a general placement in Central Asia and around the Eurasian Steppes (marked in beige).   Both the R1a and R1b lineages proceeded west into Europe, although the origin of R1b precedes that of R1a (~25kya vs. ~15kya).  The R1b subclade established high levels in the Atlantic region after dispersal from an Iberian refuge (light blue oval) during the last glacial maximum of the Ice Ages.  The R1a refugium is located in the Ukraine (light blue oval).  Both R1a and R1b are found in Scandinavia and two different migration routes have been proposed for both R lineages: west over Jutland (Denmark) and east through the Baltic states.  R1b and R1a appear to have entered Anatolia from opposite sides – R1b over the Bosporus Isthmus and R1a from the Iranian Plateau.  R1a1 is prominent in East Europe and Asia, including modest levels in East Asia.  It also is found in the MidEast and Northeast Africa.  R1b is also found in Africa and likely entered via Anatolia and the Levant corridor through the Middle East.  The R2 subclade is a distinct R lineage that is confined to India, Pakistan and Central Asia.  It likely arose in India.  The routes and locations are based on evidence from several studies of Y-chromosome R haplogroups and haplotypes and this map represents one possible scenario for the ancestral origin and propagation of Haplogroup R.  



How the Subclades of Y-DNA Haplogroup R are determined

The further refinement of Y-DNA ancestry can be obtained by using the Y-DNA Haplogroup R Subclade Testing Panel from Genebase.  This panel is based upon a collection of SNPs (see Table 1) that identify the sub-branches of Y-DNA Haplogroup R.

Table 1.  SNPs in the Genebase Haplogroup R Subclade Panel


The following diagram depicts the current phylogenetic tree for Y-DNA Haplogroup R.  This haplogroup has many subclades, most stemming from the R1b branch of the tree.  


Figure 12.  The current phylogenetic tree for Y-DNA haplogroup R and its subclades.  The location of subclade-defining SNPs is shown above the subclade names (boxed). 


The procedure for identifying your Y-DNA Haplogroup R Subclade is as follows:

Your Y-DNA Subclade will be automatically determined for you after your Subclade test is completed.  However, if you are interested in finding out how your subclade was determined, just follow these steps:

Step 1.  Examine your test results from the Genebase Y-DNA Haplogroup R Subclade Testing Panel.  Keep track of all your positive or derived SNP states and consult the Haplogroup R Subclade phylogenetic tree diagram (see Figure 12). 

Step 2.  Start with the root or main branch of Haplogroup R, which is ascertained by the presence of SNP M207.  Using your results from the Genebase Y-chromosome Haplogroup R Subclade Testing Panel, trace the branches that carry your derived SNPs.  The point at which you no longer have mutations to follow is the branch or subclade of haplogroup R to which you belong!
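The two steps above amount to walking down a tree while the tested SNPs remain derived. The sketch below illustrates this with a simplified subset of the subclade names and SNPs mentioned in this article; it is not the full tree of Figure 12, the placement of R1a1 directly under R1 is a simplification, and the function name and data layout are hypothetical rather than the Genebase implementation.

```python
# Each node is (subclade name, defining SNP, list of child nodes).
# Simplified subset of the Haplogroup R tree, for illustration only.
TREE = ("R", "M207", [
    ("R1", "M173", [
        ("R1a1", "SRY10831.2", [
            ("R1a1a", "M198", []),
        ]),
        ("R1b", "M343", [
            ("R1b1", "P25", [
                ("R1b1a", "M18", []),
                ("R1b1b", "P297", [
                    ("R1b1b1", "M73", []),
                    ("R1b1b2", "M269", []),
                ]),
            ]),
        ]),
    ]),
    ("R2", "M124", []),
])


def assign_subclade(derived_snps, node=TREE):
    """Follow derived SNPs from the root; return the deepest branch reached."""
    name, snp, children = node
    if snp not in derived_snps:
        return None  # this branch is not supported by the test results
    for child in children:
        deeper = assign_subclade(derived_snps, child)
        if deeper is not None:
            return deeper
    return name  # no further derived SNPs to follow


# Hypothetical result: derived at every SNP on the path to M269.
print(assign_subclade({"M207", "M173", "M343", "P25", "P297", "M269"}))
# Hypothetical result: derived only at M207 and M124.
print(assign_subclade({"M207", "M124"}))
```

Because the walk requires every SNP along the path to be derived, it stops at the last confirmed branch if an intermediate SNP is missing from the results.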

There are currently 27 SNPs and subclades available in the Genebase Y-DNA Haplogroup R Subclade Testing Panel.   The major R branches, R1 defined by M173 and R2 defined by M124, can be determined with the Genebase Y-DNA backbone SNP Test Panel.  The R2 branch currently has no further subclades defined and appears to be a terminal branch in the Y-DNA phylogeny.  The following section describes subclades descending from the R1/M173 branch of Haplogroup R. 


Geographical Distribution of the Subclades of Y-DNA Haplogroup R


The following reference maps illustrate how the various subclades of Y-DNA Haplogroup R are distributed.


Frequency and distribution of the Subclades of Haplogroup R in Europe:


Figure 13. A map illustrating the frequency and distribution of the Subclades of Haplogroup R in Europe.    In some cases multiple pie charts are shown in a particular location.  These represent the results of different and independent studies that tracked similar populations, but reported different subclades because different SNP tests were employed  (e.g. see Sweden).  Note that R1a lineages generally do not appear west of Germany and Italy.  The R1b1a subclade is unique to Sardinia and several subclades are unique to the Iberian Peninsula (R1b1b2a2b, R1b1b2a2c and R1b1b2a2d).  See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification (i.e. identification of terminal subclade). 


Frequency and distribution of the Subclades of Haplogroup R in West Asia:


Figure 14. A map illustrating the frequency and distribution of the Subclades of Haplogroup R in West Asia.  Note the appearance of the R2 subclade in green and the decrease in R1b1b2/M269 in the eastern direction.  See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification or the identification of terminal subclade. 


Frequency and distribution of the Subclades of Haplogroup R in the Caucasus:


Figure 15.  A map illustrating the frequency and distribution of the Subclades of Haplogroup R in the Caucasus.  See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification or identification of terminal subclade.  Note that the total frequency is higher in the South Caucasus, but the proportion of R1a1 subclade vs. undifferentiated R* is higher in the North Caucasus.


Frequency and distribution of the Subclades of Haplogroup R in India:


Figure 16.  A map illustrating the frequency and distribution of the Subclades of Haplogroup R in India.  The R2 subclade is at its highest frequency in India (peak observed in Sri Lanka).  Also notable is that R1a1 predominates over R1b1b2 in India.  See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification or identification of terminal subclade. 


Frequency and distribution of the Subclades of Haplogroup R in Central and East Asia:


Figure 17.  A map illustrating the frequency and distribution of the Subclades of Haplogroup R in Central and East Asia.  R1a1 is the most abundant of the R subclades but interesting new subclades appear in China and Japan (R1b1b1 in blue) and Australia (R1b1b2a2a in peach).   See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification or identification of terminal subclade. 


Frequency and distribution of the Subclades of Haplogroup R in Africa:


Figure 18.  A map illustrating the frequency and distribution of the Subclades of Haplogroup R in Africa.   See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification or identification of terminal subclade.  It is clear that most of the R haplogroup contribution is the undefined R1 lineage or the R1b lineage (e.g. R1b1/P25 and R1b1b2/M269).  


Frequency and distribution of the Subclades of Haplogroup R in the United States:


Figure 19.  A map illustrating the frequency and distribution of the Subclades of Haplogroup R in the United States.   Results from four different studies on different populations in the US are shown.  The highest R frequency is observed in European or Caucasian Americans, followed by Hispanic/Latino Americans and African Americans, while very low levels are observed in Japanese or Asian Americans.  The R1b subclades represent the majority of subclades observed (e.g. R1b1/P25 and R1b1b2/M269).   Novel subclades defined with new SNPs are shown in the African and European American populations at the lower right (subclades: R1b1b2a/S127, R1b1b2a1a/S29 and R1b1b2a2e/M222).  See Table 3 for a detailed account of the frequency and distribution of R subclades.  Paragroups (*) represent a positive SNP status, but lacking further subclade marker identification or identification of terminal subclade.  


Detailed Accounts for the Subclades of Y-DNA Haplogroup R

Y-DNA Haplogroup R subclades are found throughout the world.  Subclades R1a1 and R1b prevail in Europe.  R1a1 is more prominent in East Europe and Asia, where the R1b frequency diminishes.  On the other hand, R1b is more prominent in the Americas due to exploration, imperial conquest, immigration and the slave trade by West European countries from the 17th to 21st centuries.   

R1a1 (SRY10831.2+), previously known as R1a

The R1a1 subclade is a prominent R branch that is found widely in Asia and Eastern Europe.  Most of the information regarding R1a1 distribution and frequency comes from analysis of downstream subclades.  Please refer to the subclades below, which are part of the [Genebase Y-chromosome Haplogroup R Subclade Testing Panel].

R1a1a (M198+), previously known as R1a1

The R1a1a (M198) subclade is the most prominent member of the R1a1 branch and is abundantly found in Asia.  Its highest levels occur in Central Asia (e.g. Russia, Kyrgyzstan) and over wide regions its frequency averages 50%.  R1a1 is also abundant in South Asia (e.g. India).  It is likely to have migrated from Central Asia into East Asia.    

The R1a1a subclade also successfully migrated to the West, making it rather abundant in Eastern Europe among Slavic populations.    R1a1a is the most frequent Y subclade in Russia and countries near the Baltic Sea.   The presence of R1a1 in Southeast Europe may have come from the proposed Ukrainian refuge (STR variance is found at its highest here) after the last glacial maximum (LGM ~18kya). Additionally, other expansion episodes from a population in the Pontic steppes some 3-5kya or the movements of Slavic populations in more recent times (5th to 7th centuries AD) have been conjectured.  It is possible that all three episodes contributed to the current demography of the R1a1 subclade.  Estimates for expansion times are at least consistent with the oldest post-LGM colonization by R1a1.    The post-LGM movements coincided with Kurgan and Yamna cultures that spread from Central Asia. 

A competing hypothesis places the origin of R1a1 and R1a1a in North India, since it is prevalent in the Indian subcontinent and the strong haplotype diversity suggests very ancient origins of this subclade.  At this point, the precise origin is still under debate. 

There is little evidence to suggest that the contemporary distribution of R1a1a follows cultural or linguistic affinities.  A boundary in R1a1a/M198 frequency between Poland and Germany can be attributed to post World War II resettlement.   Also, an influence from Scandinavia can be seen in certain locations in the British Isles, evidenced by small spikes in the levels of R1a1a/M198, which is normally found at low levels there.  The Scottish Isles show good evidence of the increased R1a1a ‘Scandinavian’ Y-chromosome types.  This R1a1a ingress is likely due to Viking invaders and resettlement of refugees from Ireland in the 10th century AD.  (see Box 1) 

R1a1a is present in Iran, but more so in the South than the North.  Persian deserts may have presented barriers in a spread to the north of Iran.  It is not clear if there were barriers that prevented the flow of R1a1a lineages from the North (Caucasus, Turkmenistan) into Iran.   A source from either the Iranian plateau or the Caucasus is suggested to underlie the presence of R1a1a in Anatolia (Turkey).  In the Arabian Peninsula, the TMRCA for the R1a1a/M198 subclade in UAE and Qatar is 7-11kya.

While the R1a1a/M198 subclade is not especially abundant in Ashkenazim Jews (8-13%), it is higher compared to other Jewish populations (Sephardic and Kurdish Jews ~4%) and there is evidence for a founder effect by this subclade in the European Jewish community dating approximately to 1000AD.  This could represent relatively recent gene flow from East European non-Jews where there is a high frequency of the R1a1a/M198 subclade (15-30%). 

R1a1a1 (M56+), previously known as R1a1a

The R1a1a1 (M56) subclade has not been studied extensively. It appears with low frequency in Central Asia and Siberia (0.5%) and it is likely to be a minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1a1a2 (M157+), previously known as R1a1b

The R1a1a2 (M157) subclade has not been studied extensively. It appears with low frequency in Pakistan and India (1.1%) and may well be a minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1a1a3 (M87+), previously known as R1a1c

The R1a1a3 (M87+) subclade has not been studied extensively. It appears with low frequency in Arab and Bukhara populations in Uzbekistan (7%) and may be confined to Central Asia.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1a1a4 (P98+), previously known as R1a1d

Currently, no information is available for the distribution and frequency of this Haplogroup R subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1a1a5 (PK5+), previously known as R1a1e

The R1a1a5 (PK5+) subclade has been described solely in Pakistan (0.2%).  This subclade needs to be tested in other populations to be certain that the rarity is not due to insufficient sampling.  The TMRCA for R1a1a5 (PK5+) was determined to be very recent, extending back only 350 years, a range in which the STR-based haplotypes are nearly identical.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b (M343+)

R1b is extremely abundant in Atlantic coast regions of Europe, such as the Iberian Peninsula and the British Isles.  It may have migrated into Europe 30kya, in one of the first peoplings of this continent, and the timing provides a link to the early Aurignacian culture that spread over Europe.  Evidence points to a refugium in the Iberian Peninsula, likely in the Franco-Cantabrian region.  Additional reports suggest the possibility of refuges in South East Europe around the Aegean Sea and in the Italian Peninsula. 

As with R1a, most of the information regarding R1b distribution and frequency comes from analysis of downstream subclades.  Please refer to the subclades below, which are part of the [Genebase Y-chromosome Haplogroup R Subclade Testing Panel].

R1b1 (P25+)

The P25 SNP, which defines the R1b1 lineage, has been shown to undergo reversion at a modest frequency.  This means that the P25 SNP reverts to the ancestral allele and the R1b1 subclade status is missed and results in under-reporting of its frequency.  Because of this, it is recommended that upstream (SNP M343) and downstream SNP test results (see SNPs listed below) are evaluated carefully using the [Genebase Y-chromosome Haplogroup R Subclade Testing Panel]. 
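The cross-check described above can be sketched as a small rule: if P25 reads ancestral but the upstream SNP M343 and at least one SNP downstream of P25 are derived, a reversion is the likely explanation. The downstream list below is limited to SNPs named in this article, and the function name and returned labels are hypothetical; this is an illustration of the reasoning, not the Genebase procedure.

```python
# SNPs that this article places downstream of P25 (a partial list).
DOWNSTREAM_OF_P25 = {"M18", "P297", "M73", "M269"}


def r1b1_status(derived_snps):
    """Classify R1b1 membership while allowing for a reverted P25 result."""
    derived = set(derived_snps)
    if "M343" not in derived:
        return "not R1b"
    if "P25" in derived:
        return "R1b1"
    if derived & DOWNSTREAM_OF_P25:
        # P25 looks ancestral, yet a SNP below it on the tree is derived.
        return "R1b1 likely (possible P25 reversion)"
    return "R1b*"


# Hypothetical result: P25 ancestral, but M343 and M269 derived.
print(r1b1_status({"M207", "M173", "M343", "M269"}))
```

Without the downstream check, the hypothetical sample above would be reported as R1b* and the R1b1 frequency would be undercounted.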

In Europe, the R lineages (R1b1/P25 and R1a1/M198) are found at a higher level in non-Jewish vs. Ashkenazi Jewish populations.  One exception to this rule is Dutch Jews, who exhibit a level of R1b1/P25 close to that of non-Jews (~26%), a probable result of admixture between non-Jews and Jews in this part of Europe.

A study of Aromuns in the Balkan Peninsula has identified an increase in the frequency of the R1b1/P25 subclade vs. non-Aromun populations in the Balkans.

Another interesting R1b1/P25 pocket lies in North Cameroon.  The presence of the R haplogroup in Africa has been noted above, including the unusual abundance in North Cameroon (60-90%!).  The best supported explanation for this finding is back migration from Asia and a TMRCA estimate for the R1/M173 parental lineage is ~4kya.  The finding of a specific subclade rather than an undifferentiated R1 lineage may help to further understand the migrations of people from Asia back into Africa.

R1b1a (M18+)

A few R1 subclades appear in unique locations, such as R1b1a (M18) in Sardinia (1-5%).  This subclade has not been widely tested, but the only other incidence outside of Sardinia occurs in Lebanon at very low frequency (0.5%).  It is possible that the R1b1a/M18 subclade on Sardinia is subject to genetic drift as an isolated population.  The TMRCA for R1b1a (M18) in Sardinia was estimated at 8-11kya.  This may be a minor R subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b (P297+)

P297 is a recently discovered marker (2008).  Prior to the discovery of P297, all individuals who are currently classified as P297+ would have been classified as P25+.  P297+ individuals are a subset of P25+ individuals.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b1 (M73+)

The R1b1b1 subclade, identified with SNP M73, has not been tested widely, but has been found in Central and East Asia: India, Russia, Pakistan, Turkey, China and Japan.  The frequencies are in the 5-10% range.  It presents an interesting scenario of an R1b subclade that is found in Asia, where R1b lineages are typically less frequent.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2 (M269+)

The R1b1b2 (M269+) subclade is the most prominent representative of the R1b branch and is abundant over Western Europe, especially in Atlantic coast countries.  Nearly 90% of Basques carry this subclade, and downstream or derivative subclades have been detected among this population: e.g. R1b1b2a2c (M153+) and R1b1b2a2d (M167+).   These latter subclades are likely to have originated in the Basques.  TMRCA for R1b1b2a2c (M153+) is quite ancient: 18-21kya.  TMRCA for R1b1b2 (M269+) in Sardinia was estimated at 23kya and in Sweden at 9kya.  The R1b1b2 (M269+) is the most common subclade in the U.S. due to its hegemony in Western Europe and the latter’s colonization of the Americas. 

A glacial refugium of the R1b1b2 (M269+) subclade has been suggested to lie in Anatolia (Turkey); the subclade may have entered this region via the Bosporus Isthmus.  The presence of R1b in Lebanon is linked to European invasion during the Crusades (11th – 13th centuries AD, likely typified by the so-called WES1 modal haplotype; see Table 2) and to Muslim expansion (beginning in the 7th century AD).

R1b1b2 (M269+) is also found in India and Iran and is somewhat higher in the North vs. South for both locations.  R1b1b2 is present in Africa and the United States and it is apparent that as a prolific subclade, this Y-chromosome has traveled extensively. 

R1b1b2a1 (S127+), previously known as R1b1b2a

S127 is a very recently discovered marker (2008).  Individuals who are S127+ were previously classified as M269+ (R1b1b2).  So far, published studies have not examined S127 yet except in the United States.  In the U.S. the R1b1b2a (S127+) subclade frequency is higher in European Americans (~14%) versus African Americans (~2%).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1a (S21+), previously known as R1b1b2a1

Non-consensus DYS458 alleles (DYS458.2) are associated with R1b1b2a1a (S21, aka M405).  Cases of this allele have so far been detected in Ireland, England, Germany, the Netherlands and the U.S. (1-5%), and it appears to be a unique west European marker.  The DYS458.2 allele also occurs independently in the Y-chromosome J1 subclade.

R1b1b2a1a (S21+) is a prominent R1b subclade and is likely the most informative subclade for resolving identity below R1b1b2 (M269+).  Its frequency is highest in the Dutch (35%) and it is also rather high in England, Germany, Austria, Denmark, the Czech Republic and Switzerland (13-23%).  This region overlaps the origins of Germanic groups, such as the Anglo-Saxons in Frisia.  The subclade does not appear to have extended its reach beyond West and Central Europe (except through recent migration to the U.S.).

The levels of the R1b1b2a (S127+) subclade in conjunction with other R subclades have not been reported to date.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1a1 (S29+), previously known as R1b1b2a1a

The R1b1b2a1a1 subclade is defined by SNP S29.  This subclade has not been widely studied, but current results show it in the Netherlands, Denmark, England, Germany and Russia (1-2%).  The R1b1b2a1a1 (S29+) subclade has also been found at a low level in the U.S. (~1%).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1a2 (P107+), previously known as R1b1b2a1b

Currently, no information is available for the distribution and frequency of this Haplogroup R subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1a3 (S26+), previously known as R1b1b2a1c

Currently, no information is available for the distribution and frequency of this Haplogroup R subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b (S116+), previously known as R1b1b2a2

S116 is a newly discovered marker (2008) and was very recently added to the R phylogenetic tree.  Prior to its discovery, all individuals who are currently S116+ were previously known as M269+ (R1b1b2).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b6a (M37+), previously known as R1b1b2a2a

The only report of the presence of the R1b1b2a1b6a (M37+) subclade is an identification of this lineage in Australia at a relatively high frequency (2/7 males), although the sample number is rather low and may not be representative.  Further testing will be needed to determine the distribution and frequency of this subclade and if it appears outside of Australia.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.
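To make the caution about the small sample concrete, the sketch below puts an approximate 95% Wilson score interval around the 2/7 count quoted above.  The Wilson interval is a standard statistical choice made here for illustration, not a method used in the original report, and the function name is hypothetical.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate Wilson score interval for the proportion successes/n.

    z = 1.96 corresponds to roughly 95% confidence (an assumption of this sketch).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)

# M37+ observed in 2 of 7 Australian males (count taken from the text above).
low, high = wilson_interval(2, 7)
print(f"observed {2 / 7:.1%}, plausible range {low:.1%} to {high:.1%}")
```

With only seven men sampled, the interval runs from roughly 8% to 64%, which is why the observed frequency should not be read as representative.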

R1b1b2a1b1 (M65+), previously known as R1b1b2a2b

R1b1b2a1b1 (M65+) appears at a modest frequency (4.4%) in Basques.  It has not been found elsewhere and may have originated among Basque populations, as it is not found in Catalan or Andalusian populations of the Iberian Peninsula.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b2 (M153+), previously known as R1b1b2a2c

R1b1b2a1b2 (M153+) appears at a modest frequency (1-10%) in populations in the Iberian Peninsula.   As with R1b1b2a1b1 (M65+), the R1b1b2a1b2 (M153+) subclade may be unique to the Basques (levels in non-Basque Iberians ~1%), though additional studies are needed to confirm this hypothesis.  A small fraction was observed in Latinos and Caucasians in the U.S. (1-2%).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b3 (M167+), previously known as R1b1b2a2d

The R1b1b2a1b3 (M167+) subclade appears at a modest frequency (1-10%) in populations in the Iberian Peninsula and may also be associated with the Basque population.  While in some studies its level was found to be lower among non-Basques, in another study it was found in 31% of Catalans; it thus may not be confined to the Basques and may be more representative of the Iberian Peninsula in general.  In support of this, R1b1b2a1b3 (M167+) has been found in the Azores (1.7%) as well as North Africa (~1% in Tunisia), which is also indicative of a limited trans-Mediterranean migration from Iberia into Africa.

This subclade has also been reported at low levels in the U.S. (1-2%).   Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b6b (M222+), previously known as R1b1b2a2e

The IMH (see Table 2) is frequently associated with the R1b1b2a1b6b (M222+) subclade, which appears linked to Irish populations and people of European and African descent in the U.S. (1-3%).  The R1b1b2a1b6b (M222+) subclade may be a minor subclade but it has not been studied extensively. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b6c (P66+), previously known as R1b1b2a2f

Currently, no information is available for the distribution and frequency of this Haplogroup R subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b4 (S28+), previously known as R1b1b2a2g

The R1b1b2a1b4 (S28+) subclade has been found in 3.2% of Caucasian Americans and 0.8% of African Americans.  It has not been studied outside the U.S. and its distribution and frequency remain largely unknown.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b4a (M126+), previously known as R1b1b2a2g1

The only report of the R1b1b2a1b4a (M126+) subclade is from a pool of Europeans (1.7% or 1/60) and therefore it appears to be a minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R1b1b2a1b4b (M160+), previously known as R1b1b2a2g2

Similar to the findings with R1b1b2a1b4a (M126+), the only report of the R1b1b2a1b4b (M160+) subclade is from a pool of Europeans (5% or 3/60) and therefore it appears to be a minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.
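Because every heading in this R1b1b2 section pairs a current subclade name with an earlier one, the renaming can be collected into a single lookup keyed by the defining SNP.  The sketch below copies the names from the headings; the earlier name given for S21 is an inference from the earlier names of its sub-branches (R1b1b2a1a, R1b1b2a1b, R1b1b2a1c), and the dictionary and function names are hypothetical.

```python
# Defining SNP -> (current name, earlier name), copied from the section headings.
R1B1B2_NAMES = {
    "S127": ("R1b1b2a1", "R1b1b2a"),
    "S21": ("R1b1b2a1a", "R1b1b2a1"),  # earlier name inferred from its sub-branches
    "S29": ("R1b1b2a1a1", "R1b1b2a1a"),
    "P107": ("R1b1b2a1a2", "R1b1b2a1b"),
    "S26": ("R1b1b2a1a3", "R1b1b2a1c"),
    "S116": ("R1b1b2a1b", "R1b1b2a2"),
    "M37": ("R1b1b2a1b6a", "R1b1b2a2a"),
    "M65": ("R1b1b2a1b1", "R1b1b2a2b"),
    "M153": ("R1b1b2a1b2", "R1b1b2a2c"),
    "M167": ("R1b1b2a1b3", "R1b1b2a2d"),
    "M222": ("R1b1b2a1b6b", "R1b1b2a2e"),
    "P66": ("R1b1b2a1b6c", "R1b1b2a2f"),
    "S28": ("R1b1b2a1b4", "R1b1b2a2g"),
    "M126": ("R1b1b2a1b4a", "R1b1b2a2g1"),
    "M160": ("R1b1b2a1b4b", "R1b1b2a2g2"),
}

def current_name(earlier_name):
    """Translate an earlier subclade name into 'current name (SNP+)', or None if unknown."""
    for snp, (current, earlier) in R1B1B2_NAMES.items():
        if earlier == earlier_name:
            return f"{current} ({snp}+)"
    return None

print(current_name("R1b1b2a2c"))
```

Keying the table by SNP rather than by name reflects the point made throughout this section: the defining marker is stable, while the subclade label changes as the tree is revised.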

R1b1c (M335+)

The R1b1c subclade is described by the M335 SNP.  The only incidence of this subclade so far reported is a very low level in Northeast Anatolia (1.2%).  Since it has not been studied extensively it is not possible to speculate on its frequency and distribution.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

R2 (M124+)

The Haplogroup R2 subclade is defined by the presence of the M124 SNP.  The R2 branch is a sibling of the R1 branch, but it currently has no further subclades identified and appears to be a terminal branch in the phylogeny of the Y-chromosome.  The distribution of R2 is clearly distinct from that of the R1a or R1b branches, and the separation of the R2 branch is also supported by the clear difference in modal haplotypes (see Table 2).
 
The R2 subclade originated approximately 25kya in South Asia, around India/Pakistan or to the north of this area in Central Asia, and it has spread little beyond these regions of the world.  The R2 subclade frequency is highest in Sri Lanka (75%) and next most abundant in East India (50-60%).  Some R2 has been observed in Central Asia (e.g. Nepal) and the Caucasus (e.g. Georgia).  There is a relatively high level in Kurds in Georgia (44% of Kurmanji); this population is likely the outcome of a bottleneck and has perhaps experienced genetic drift.  The level is not nearly as high in Kurmanji from Turkmenistan (8%).  The presence of R2 in the Sinte Romani Gypsy population (Sinti) of Europe can likely be attributed to their origins in India or Pakistan.

While the R2 subclade can be found at appreciable levels in India, it is not found preferentially among a linguistic affiliation, specific caste or tribal group, with one clear exception: it was found to constitute 87% of Y-chromosomes in the Jaunpur Kshatriya caste population in one study.  This case may represent a recent founder effect in this group, with a current high percentage of the R2 subclade maintained by the caste system.



Modal Haplotypes Associated with Y-DNA Haplogroup R

Your unique set of Y-DNA STR markers obtained through the Y-DNA STR test is referred to as your "Haplotype".  This is not to be confused with your "Haplogroup" which is determined by testing the SNP markers in your Y-DNA through Y-DNA SNP backbone and subclade testing.

When the Y-DNA STR markers are tested for large groups of people from around the world, the haplotypes which occur with the highest frequencies within certain populations are called "Modal Haplotypes". 
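The definition above amounts to counting identical haplotypes and taking the most frequent one.  The following is a minimal sketch of that count; the six sample haplotypes and the five-marker panel are invented for illustration and are not data from any study, and the function name is hypothetical.

```python
from collections import Counter

# Marker order for each tuple below (a hypothetical five-marker panel).
MARKERS = ("DYS19", "DYS390", "DYS391", "DYS392", "DYS393")

# Hypothetical repeat counts for six tested men from one population.
samples = [
    (14, 24, 11, 13, 13),
    (14, 24, 11, 13, 13),
    (14, 25, 11, 14, 13),
    (14, 24, 11, 13, 13),
    (14, 24, 10, 13, 13),
    (14, 25, 11, 14, 13),
]

def modal_haplotype(haplotypes):
    """Return the most frequent haplotype and its share of the sample."""
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    haplotype, count = Counter(haplotypes).most_common(1)[0]
    return haplotype, count / len(haplotypes)

modal, share = modal_haplotype(samples)
print(dict(zip(MARKERS, modal)), f"{share:.0%} of sample")
```

Here the modal haplotype is carried by half of the sample; the other men differ from it by one or two repeats, which is the usual picture for fast-changing STR markers.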

A few notable STR-based haplotypes have been described for subclades in Haplogroup R.  The most prominent of these is the Atlantic Modal Haplotype (AMH), which is widespread on both sides of the Atlantic Ocean.  This is due to its association with the R1b subclade, which is by far the most abundant Y-chromosome lineage in Western Europe, where it reaches nearly 100% frequency in some populations in the British Isles and the Iberian Peninsula.  Given the hegemony of R1b in Western European nations, it is not surprising to find a high level of R1b and the AMH in the Americas as well.  A few related R haplotypes, e.g. the Irish Modal Haplotype (IMH) and a modal haplotype for the Basque population, are also linked to West European R1b ancestry.  It has been noted that the IMH is largely absent from neighboring England, but is found at modest frequencies in Scotland and the U.S.

Confirmation of haplogroup assignment is always made by SNP testing.  Conversely, haplogroup assignment does not indicate that you will have the modal haplotype, recalling the fact that STRs are rapidly changing markers.  Table 2 provides a list of modal haplotypes associated with Haplogroup R.

Table 2.  Modal Haplotypes associated with Y-DNA Haplogroup R

DYS Number                         19    388   390   391   392   393   389i  389ii

Atlantic Modal Haplotype           14    12    24    11    13    13    13    29
Irish Modal Haplotype              14    12    25    11    14    13    13    29
Basque Modal Haplotype             14    NT    24    11    13    13    NT    NT
Iberia R1b1b2                      14    12    24    11    13    13    13    29
Anatolia R1b1b2                    14    12    24    11    12    13    16    29
Western European Specific 1 R1b    14    NT    24    10    13    13    12    28
Scottish Isles R1a1                16    12    25    11    11    13    NT    NT
Iceland R1a                        15    12    25    11    11    13    10    27
Eastern Europe R1a                 16    12    25    10    11    13    13    30
India (Chenchu) R1a                16    12    24    11    11    13    NT    NT
English R1a                        16    12    25    11    11    13    13    31
Ashkenazim R1a1                    16    12    25    10    11    13    NT    NT
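One way to read Table 2 is to count how many repeat steps separate two modal haplotypes.  The sketch below sums the absolute repeat differences over the markers reported for both haplotypes, using four rows copied from the table.  Two assumptions are made: "NT" is treated as a marker with no value to compare, and the simple step count is used as the distance (other measures exist, and this sketch does not adjust for DYS389ii including the DYS389i repeats).  The names are hypothetical.

```python
# Marker order matches the Table 2 columns.
MARKERS = ("19", "388", "390", "391", "392", "393", "389i", "389ii")
NT = None  # assumption: an "NT" entry has no value to compare

# Rows copied from Table 2.
MODALS = {
    "Atlantic Modal Haplotype": (14, 12, 24, 11, 13, 13, 13, 29),
    "Irish Modal Haplotype": (14, 12, 25, 11, 14, 13, 13, 29),
    "Basque Modal Haplotype": (14, NT, 24, 11, 13, 13, NT, NT),
    "Eastern Europe R1a": (16, 12, 25, 10, 11, 13, 13, 30),
}

def step_distance(a, b):
    """Return (sum of absolute repeat differences, number of markers compared)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return sum(abs(x - y) for x, y in pairs), len(pairs)

amh = MODALS["Atlantic Modal Haplotype"]
for name, haplotype in MODALS.items():
    distance, compared = step_distance(amh, haplotype)
    print(f"AMH vs {name}: {distance} steps over {compared} markers")
```

Under these assumptions the Irish Modal Haplotype sits two steps from the AMH (one repeat each at DYS390 and DYS392), the Basque haplotype matches it at all five reported markers, and the Eastern Europe R1a haplotype is seven steps away, in line with the separation of the R1a and R1b branches described in the text.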




BOX 2.   India: Castes, Tribes and Language
The cultural structure of caste systems (e.g. social stratification, endogamy) in India has been suggested to create a unique genetic legacy in this region of Asia.  Several early papers suggested that there was a correlation between R haplogroup and subclade frequencies and caste or tribe status in India.  Other studies have not supported these associations.  Similarly, studies have attempted to determine if there is a link between language and Y-chromosome phylogeography.  Indo-European languages, which are linked to the R1 haplogroup, are primarily spoken by people in castes, but there are several tribal groups that provide exceptions to this rule.  In addition, the R2 subclade, which likely originated in India, is not found preferentially among specific caste or tribal groups, with one clear exception: it makes up 87% of Y-chromosomes in the Jaunpur Kshatriya caste population, and this case may represent a recent founder effect in this group.  It is likely that the founding of the R lineage (TMRCA estimated at ~12-14kya in India) predates many of these social constructs and thus has varying contributions among many Indian populations.  With the current R subclade resolution and the populations studied so far, the broad distribution patterns of R subclades appear to follow geographical rather than cultural boundaries.


Key Investigators:
Luigi Luca Cavalli-Sforza
 Stanford University, Stanford, California, USA
Daniel G. Bradley
 Trinity College, Dublin, Ireland
Cristian Capelli
 Università Cattolica del Sacro Cuore, Rome, Italy
Michael F. Hammer
 University of Arizona, Tucson, Arizona, USA
Rene J. Herrera
 Florida International University, Miami, Florida, USA
Mark A. Jobling
 University of Leicester, Leicester, United Kingdom 
Ornella Semino
 Università di Pavia, Pavia, Italy
Mark Stoneking
 Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Chris Tyler-Smith
 The Wellcome Trust Sanger Institute, Hinxton, United Kingdom
Peter A. Underhill
 Stanford University, Stanford, California, USA
Richard Villems
 University of Tartu and Estonian Biocentre, Tartu, Estonia

Need to cite this tutorial in your essay, paper or website? Use the following format:

Learn about Y-DNA Haplogroup R. Genebase Tutorials. Retrieved November 23, 2014, from http://www.genebase.com/learning/article/11