Learn about Y-DNA Haplogroup E

Publications
by Genebase Users

Y-DNA Haplogroup E “Roots in Africa, Branches Beyond”

All males can trace their Y-DNA lineage back to a theoretical Y-DNA prototype, which originated in Africa and whose descendants are thought to have migrated out of Africa over 60,000 years ago (60kya).  Although the Y-DNA is usually inherited from father to son without any changes, occasionally differences arise via mutations.  Since these mutations or variations add up through generations, the more differences that are found when comparing DNA, the more time has elapsed and the greater the genetic distance.  Armed with information about the rates, types and number of variations, we can create lineage maps or phylogenetic trees and make calculated estimates to trace our roots back to our forebears (e.g. to find the time to most recent common ancestor, TMRCA, aka coalescence).




Figure 1.  Figure of Trees.  The first figure is an African acacia tree silhouette and symbolically shows the root of human males in Africa with their subsequent variation and flourishing in the rest of the world.  Next to this is a figure of the major Y-DNA haplogroups, with their phylogenetic relationships.

Ancestral Markers


The Y-DNA contains two main types of ancestral markers:
  1. SNPs (single nucleotide polymorphisms). SNPs are a change in a single nucleotide in a chromosome and occur infrequently; once they occur they are stable and typically define a whole chromosome and become its signature.
  2. STRs (short tandem repeats, aka microsatellites) (see Figure 2). STRs change by the number of repeats and change at a much faster rate than SNPs.

By testing the combination of SNPs and STRs in our Y-DNA, we can gain information on our paternal ancestry, ranging from ancient history (thousands and tens of thousands of years ago) with the much slower mutating SNPs, to recent history (100-1000 years ago) with faster mutating STRs.  More simply, SNPs allow us to track ancient or deep ancestry, while STRs allow us to track recent ancestry in the range of immediate family history over several generations and the relatively modern use of surnames. (see Figure 3). 
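As a rough illustration of how STR differences translate into elapsed time, the sketch below counts repeat differences between two haplotypes and converts them into years.  The repeat counts, the mutation rate and the generation length are illustrative assumptions only; they are not values used by Genebase or taken from any particular study, and real TMRCA estimates rely on more careful statistical models.

```python
# Rough sketch: more STR differences imply more elapsed time.
# All numeric values here are illustrative assumptions.

def str_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over the shared markers."""
    shared = hap_a.keys() & hap_b.keys()
    return sum(abs(hap_a[m] - hap_b[m]) for m in shared)


def estimate_tmrca_years(hap_a, hap_b, rate_per_marker=0.002,
                         years_per_generation=30):
    """Very rough TMRCA in years.

    Two lineages accumulate changes in parallel, so the expected distance
    is about 2 * rate * markers * generations; we solve for generations.
    """
    n_markers = len(hap_a.keys() & hap_b.keys())
    if n_markers == 0:
        raise ValueError("the haplotypes share no markers")
    generations = str_distance(hap_a, hap_b) / (2 * rate_per_marker * n_markers)
    return generations * years_per_generation


# Hypothetical repeat counts for two men tested on the same four markers.
man_1 = {"DYS19": 13, "DYS390": 24, "DYS391": 10, "DYS392": 11}
man_2 = {"DYS19": 13, "DYS390": 25, "DYS391": 10, "DYS392": 11}

print(str_distance(man_1, man_2))                # number of repeat differences
print(round(estimate_tmrca_years(man_1, man_2)))  # rough years to the common ancestor
```

With so few markers the estimate is extremely coarse; the point is only that the count of differences, the mutation rate and the number of markers together set the time scale.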

Figure 2.  The Y-chromosome is shown as a cytogenetic banded chromosome or ideogram.  The shorter p-arm and a portion of the longer q-arm of the Y-chromosome harbor most of the variation as indicated below.  The centromere constriction divides the two arms. 



Figure 3.  A schematic timeline is shown with estimations of ancestry determined through STR and SNP variation shown above.   BC/AD marks the division at the ‘common era’ beginning ~2,000 years ago (2kya).


Y-DNA Haplogroups

A haplogroup is a group or population descended from a common ancestor.  Y-DNA haplogroups are defined by slowly evolving SNPs, and each SNP identifies a particular paternal haplogroup or branch of the Y-DNA phylogenetic tree.  (Note: mtDNA SNPs are used to determine haplogroups for maternal lineages). 

By contrast, the faster changing STRs are employed to determine haplotypes for the Y-DNA, where haplotypes are defined as a collection of variations in STR markers observed on the Y-DNA and can be thought of as a signature, one which tracks more recent genetic history.  Frequent haplotypes, commonly known as modal haplotypes can often be associated with defined populations and geographical regions, and can be informative or predictive of haplogroups that also show geographical preferences.  For example, from your haplotype determined through the Genebase Y-DNA STR Marker Test, you may already have a prediction of deeper genetic origins and a prediction of your Y-DNA haplogroup.            
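A modal haplotype can be thought of as the most common repeat count at each STR marker within a group of men.  The following minimal sketch shows one way to compute it; the repeat counts are hypothetical and the function names are introduced here for illustration only.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker."""
    markers = haplotypes[0].keys()
    return {
        m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
        for m in markers
    }


def mismatches(haplotype, modal):
    """List the markers at which a haplotype departs from the modal values."""
    return [m for m in modal if haplotype.get(m) != modal[m]]


# Hypothetical STR results for four men from one population.
sample = [
    {"DYS19": 13, "DYS390": 24, "DYS391": 10},
    {"DYS19": 13, "DYS390": 24, "DYS391": 11},
    {"DYS19": 13, "DYS390": 25, "DYS391": 10},
    {"DYS19": 14, "DYS390": 24, "DYS391": 10},
]

modal = modal_haplotype(sample)
print(modal)
print(mismatches(sample[1], modal))
```

A haplotype with few or no mismatches against a population's modal haplotype is a candidate member of that group, which is the basis for predicting a haplogroup from STR results.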

There are 20 major Y-DNA haplogroups (designated by the letters A through T) stemming in a branching fashion from the Y-DNA prototype, aka “Y-DNA Adam” (haplogroup A), which may be seen as the root or trunk of the tree (see Figure 4).  Each branch and haplogroup after “Y-DNA Adam” is defined by a novel SNP or genetic change.  The Genebase Y-DNA backbone SNP Test Panel is used to determine Y-DNA haplogroups and additional panels are available to further resolve Y-DNA lineage into sub-haplogroups or subclades. 



Figure 4.
  This phylogenetic tree is presented in a more typical inverted fashion, with the A-T haplogroups defined by SNP markers as they branch from the root.  In order to identify your personal haplogroup, simply follow the branches from the ‘enter’ point with the SNPs identified, here exemplified in green for Haplogroup E.        




The Haplogroup E story …so far

Early origins
Y-DNA Haplogroup E originated in Africa and is an early branch of the Y-DNA lineage.  Only three genetic variations (SNPs) on the Y-DNA of Haplogroup E separate it from the founding Y-DNA Haplogroup A, aka ‘Y-DNA Adam’.  Haplogroup E is abundant in all regions of Africa and in some populations it makes up nearly 100% of the Y-DNA.  It is a highly diversified haplogroup with many subclades that have unique histories and distributions, including regions outside of Africa in nearby Asia and Europe.  Haplogroup E and its subclades encompass episodes from early Stone Age migrations to the spread of Bantu farmers 2,000-4,000 years ago to the slave trade of the 17th to 19th centuries that brought men with this haplogroup to the New World. 



Figure 5.
  Overall worldwide frequency and distribution of Y-DNA Haplogroup E.  Haplogroup E dominates in Africa and has moderate frequency in neighboring Asia and Mediterranean Europe.  Its presence in North America reflects the well known slave trade from the 17th to 19th centuries and it is not present in the indigenous populations in the Americas (e.g. see South America).  The frequency is an approximation taken from the average of several representative studies.
 

Haplogroup E is primarily found in Africa, but it also exists in Europe and Asia (see Figure 5).  Apart from North America, its levels are very low or it is completely absent elsewhere in the world.  Its presence in North America is largely due to the slave trade from ~1650-1850AD that brought Africans to America and not due to indigenous or native Americans as they lack this haplogroup.  

The Haplogroup E branch of Y chromosomes is identified by the presence of SNPs M40 and M96 (and others; SRY4064, SRY8299, and P29).  E1 is the predominant subclade, while E2 is much less frequent.  Within E1, E1b1 (defined by SNP P2) is the most abundant and widespread representative, and accounts for most of Haplogroup E worldwide.  E1b1 lineages vary in abundance over Africa and three main regions are evident from the distribution peaks of three subclades: E1b1a (SNP M2) in Sub-Saharan Africa, E1b1b1a (SNP M78) in East Africa and E1b1b1b (SNP M81) in Northwest Africa.   The difference in geographic location of Haplogroup E subclades also aligns with distinct language groups, supporting the idea that there is prevailing father to son transmission of language in Africa.  This very general picture has recently yielded to finer views of African lineages and migrations through the definition and study of additional subclades and more remote branches of Haplogroup E.  Indeed, this is one of the most frequently revised Y-DNA haplogroups, with several upgrades of its subclade nomenclature.  To help avoid confusion with this terminology, we always include the informative or defining SNP with each subclade, as these remain constant over revisions of subclade names and phylogenetic arrangements. 



Figure 6.
The emergence of modern humans.  A schematic timeline is shown with the approximate appearance of Homo sapiens, with particular attention to the estimated origins of different ancestral Y-DNA haplogroups and subclades of Haplogroup E.  Below the timeline is shown key geological and anthropological events.   kya = thousand years ago, hg = haplogroup, LGM = last glacial maximum.



SNPy trails and the spread of Haplogroup E
The leading hypothesis concerning the birth of Haplogroup E is that it originated in Northeast Africa and is one of the first emigrations of modern humans out of Africa to other parts of the world.  However, its shared phylogeny with Haplogroup D (see Figure 4), which is not found in Africa but is found in the Middle East, may indicate that Haplogroup E first appeared in the Middle East and that a migration back to Africa is responsible for its prevalence there.  The TMRCA for Haplogroup E is 37 ±10kya and its subclades diverged from ~28-2kya (see Figures 6 and 13).  The simplest (or most parsimonious) explanation is that it arose in Northeast Africa and subsequently spread from this location to all parts of Africa, where it is clearly the dominant Y haplogroup.  The spread of Haplogroup E was also part of an early colonization of the Middle East and later Europe (see Figures 7-9).  As a result, Egypt served many times as a crossroads for the ancestors in Haplogroup E. 


The movements of human populations bearing haplogroup E:


Figure 7.  A map depicting the movements of human populations bearing haplogroup E.  A miniature phylogenetic tree legend with color coding for SNPs and subclades is shown at the left.  The origin of this Y-chromosome is most likely in East Africa, near Ethiopia.  Migrations to the north carried Haplogroup E towards Egypt and the Levantine corridor to the MidEast (M35 peach path).  Another migration is proposed to have carried Haplogroup E derivatives, such as M81, to the West and established populations in North Africa (dark green path). The MidEast population is thought to have developed new subclades, such as M123 (bright green) and V13 (orange), that were part of the colonization of Europe via Anatolia and the Balkan Peninsula, as well as further into Asia and the Arabian Peninsula (dark blue paths for M34 and V22).  There is also evidence for migration back into Africa (M78, M123, M34, V22) and crossing of the Mediterranean Sea by subclades of Haplogroup E (dark greens, M81 and V65).  The populations with Haplogroup E subclades in West Central Africa, primarily Bantu groups with the M2 subclade, dispersed this subclade extensively in Sub-Saharan and Equatorial Africa (beige paths). The expansion also reached South Africa, with the Eastern route (shown with M2 branch M191 in teal) perhaps being favored over the Western (dashed path).  The M2 population is also shown leaving Africa from the West coast to indicate the slave ports that were used from ~1650-1850AD.  The routes and locations are based on evidence from several studies of Y-chromosome E haplogroups and haplotypes and represent one possible scenario for the ancestral origin and propagation of this haplogroup.
  



Frequency and distribution of Haplogroup E in Africa:


Figure 8. A map illustrating the frequency and distribution of Haplogroup E in Africa.  The frequency of haplogroup E is shown as the blue portion of the pie charts distributed over different locations.  The highest concentration of this haplogroup is in West Central Africa, where it accounts for nearly all Y-DNA observed.    



Frequency and distribution of Haplogroup E in Europe and portions of Asia and Africa:


Figure 9. A map illustrating the frequency and distribution of Haplogroup E in Europe and portions of Asia and Africa.  The portion of each pie chart colored in blue represents the fraction of Haplogroup E among Y-DNA in the region.  While Africa displays the highest overall frequencies, the next highest frequencies are found around the Balkan Peninsula, including Greece and Crete.  The frequency of Haplogroup E drops precipitously to the North and East. 
    



How the Subclades of Y-DNA Haplogroup E are determined

The further refinement of Y-DNA ancestry can be obtained by using the Y-DNA Haplogroup E Subclade Testing Panel from Genebase.  This panel is based upon a collection of SNPs (see Table 1) that identify the sub-branches or subclades of Y-DNA Haplogroup E. 

Table 1. Haplogroup E Subclades and their defining SNPs
 
SNP marker    Nucleotide change    Haplogroup E subclade

M40           G > A                E
M132          G > T                E1a
M44           G > C                E1a1
P2            C > T                E1b1
M2            A > G                E1b1a
M58           G > A                E1b1a1
M116.2        A > C                E1b1a2
M149          G > A                E1b1a3
M154          T > C                E1b1a4
M155          G > A                E1b1a5
M10           T > C                E1b1a6
M191          T > G                E1b1a7
U174          G > A                E1b1a7a
U175          G > A                E1b1a8
U209          C > T                E1b1a8a
U290          T > A                E1b1a8a1
U181          C > T                E1b1a8a1a
P59           A > G                E1b1a8a2
P268          T > A                E1b1a9
M215          A > G                E1b1b
M35           G > C                E1b1b1
M78           C > T                E1b1b1a
V12           A > G                E1b1b1a1
M224          T > C                E1b1b1a1a
V32           G > C                E1b1b1a1b
V13           G > C                E1b1b1a2
V27           A > T                E1b1b1a2a
V22           T > C                E1b1b1a3
M148          A > G                E1b1b1a3a
V19           T > C                E1b1b1a3b
V65           G > T                E1b1b1a4
M81           C > T                E1b1b1b
M107          A > G                E1b1b1b1
M183          A > C                E1b1b1b2
M165          A > G                E1b1b1b2a
M123          G > A                E1b1b1c
M34           G > T                E1b1b1c1
M84           A del                E1b1b1c1a
M290          C > T                E1b1b1c1b
M281          G > A                E1b1b1d
V6            G > C                E1b1b1e
M329          G > C                E1b1c
P75           G > A                E1b2
M75           G > A                E2
M41           G > T                E2a
M54           G > A                E2b
M85           C > A                E2b1
M200          G > A                E2b1a

The following diagram depicts the current, deeply branching phylogenetic tree for Y-DNA Haplogroup E:



Figure 10.  The current phylogenetic tree for Y-DNA haplogroup E and its subclades
 


The procedure for identifying your Y-DNA Haplogroup E Subclade is as follows:

Your Y-DNA Subclade will be automatically determined for you after your Subclade test is completed.  However, if you are interested in finding out how your subclade was determined, just follow these steps:

Step 1.  Examine your test results from the Genebase Y-chromosome Haplogroup E Subclade Testing Panel.  Keep track of all your positive or derived SNP states and consult the Haplogroup E Subclade phylogenetic tree diagram (see Figure 10) or Table 1. 

Step 2.  Start with the root or main branch of Haplogroup E, which is ascertained by the presence of SNP M40.  According to your test results, follow the branches with your SNPs from the Genebase Y-chromosome Haplogroup E Subclade Testing Panel.  The point at which you no longer have mutations to follow is the branch or subclade of Haplogroup E to which you belong!
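The two steps above can be sketched as a simple walk down the tree.  The sketch below uses only a subset of the SNPs in Table 1, and it assumes that a subclade's parent can be read from its name (the longest shorter name that also appears in the table); the actual Genebase procedure and the full tree in Figure 10 may differ.  The function names are hypothetical.

```python
# Subset of Table 1: defining SNP -> Haplogroup E subclade.
SNP_TO_SUBCLADE = {
    "M40": "E",
    "M132": "E1a", "M44": "E1a1",
    "P2": "E1b1",
    "M2": "E1b1a", "M191": "E1b1a7", "U174": "E1b1a7a",
    "M215": "E1b1b", "M35": "E1b1b1",
    "M78": "E1b1b1a", "V12": "E1b1b1a1", "V32": "E1b1b1a1b",
    "V13": "E1b1b1a2",
    "M81": "E1b1b1b", "M123": "E1b1b1c", "M34": "E1b1b1c1",
    "M75": "E2",
}
SUBCLADE_TO_SNP = {name: snp for snp, name in SNP_TO_SUBCLADE.items()}


def parent_of(subclade):
    """Assumed parent: the longest shorter name prefix found in the table."""
    for end in range(len(subclade) - 1, 0, -1):
        if subclade[:end] in SUBCLADE_TO_SNP:
            return subclade[:end]
    return None  # the root (E) has no parent


def children_of(subclade):
    """All subclades in the table whose assumed parent is `subclade`."""
    return [s for s in SUBCLADE_TO_SNP if parent_of(s) == subclade]


def find_subclade(derived_snps):
    """Follow derived (positive) SNPs from M40 until no branch matches."""
    if "M40" not in derived_snps:
        return None  # not Haplogroup E according to this panel
    current = "E"
    while True:
        matches = [c for c in children_of(current)
                   if SUBCLADE_TO_SNP[c] in derived_snps]
        if not matches:
            return current  # no further mutations to follow
        current = matches[0]


# Hypothetical test result: the set of SNPs found in the derived state.
print(find_subclade({"M40", "P2", "M215", "M35", "M78", "V13"}))
```

A result that stops at a branch with untested or negative descendants corresponds to a paragroup, e.g. a man who is M40 positive but negative for every other marker would be reported as E*.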

Geographical Distribution of the Subclades of Y-DNA Haplogroup E

The following reference maps illustrate how the various subclades of Y-DNA Haplogroup E are distributed.

By looking at the major subclade frequencies, three broad regions of Africa can be defined: Northwest, East and Sub-Saharan Africa.  The division can be distinguished by the prevalence of E1b1b1b (M81) in Northwest Africa, E1b1a (M2, M191) in Sub-Saharan Africa and E1b1b1a (M78) in East Africa.  Mali may represent an intermediate between Northwest and Sub-Saharan Africa. Note that even finer resolution is possible when STR information (i.e. haplotype patterns) and more recently identified SNPs are taken into account.  There are several strong correlations between geography or linguistic groups and the E subclade distribution patterns.  Thus, while Haplogroup E is broadly identified with Africa, informative SNP and subclade identification is beginning to paint a much more defined picture of ancestry and evolution in this continent and beyond.


Frequency and distribution of the Subclades of Haplogroup E in Africa:


Figure 11. A map illustrating the frequency and distribution of the Subclades of Haplogroup E in Africa.  The total frequency of Haplogroup E is shown as the blue portion of the smaller pie charts, while the larger pie chart shows the fraction of each subclade contributing to the total frequency.  For several countries, two charts are shown, which are derived from two different studies of Haplogroup E subclades in this region.   See Table 3 for a detailed account of the frequency and distribution of the subclades of Haplogroup E.  Paragroup E* represents M40+ status (or other E-defining SNPs) without any further subclade marker identified.  Likewise, paragroup E1b1b1* represents M35+ status without any further subclade identified.



Frequency and distribution of the Subclades of Haplogroup E in Eurasia and Northeast Africa:


Figure 12. A map illustrating the frequency and distribution of the Subclades of Haplogroup E in Eurasia and Northeast Africa.  The total frequency of Haplogroup E is shown as the blue portion of the smaller pie charts, while the larger pie chart shows the fraction of each subclade contributing to the total frequency.  For Egypt, two charts are shown, which are derived from two different studies of Haplogroup E subclades in this region.   See Table 3 for a detailed account of the frequency and distribution of the subclades of Haplogroup E.  Paragroup E* represents M40+ status (or other E-defining SNPs) without any further subclade marker identified.  Likewise, paragroup E1b1b1* represents M35+ status without any further subclade identified.



Frequency and distribution of the subclades of the Haplogroup E subclade E1b1b1a/M78 in Eurasia and Northeast Africa:


Figure 13. A map illustrating the frequency and distribution of the subclades of the Haplogroup E subclade E1b1b1a/M78 in Eurasia and Northeast Africa.  The total frequency of the E1b1b1a/M78 subclade is shown as the blue portion of the smaller pie charts, while the larger pie chart shows the fraction of each subclade contributing to this frequency.  In the legend, the α, β and γ symbols correspond to microsatellite (haplotype) clusters that had been previously used to subclassify the M78 subclade.   Paragroup E1b1b1a*/M78 represents M78+ status without any further subclade identified.  See Table 3 for a detailed account of the frequency and distribution of the E subclades.


Frequency and distribution of the Subclades of Haplogroup E in African Americans in the United States:

Figure 15. A map illustrating the frequency and distribution of the Subclades of Haplogroup E in African Americans in the United States.  The total frequency of Haplogroup E is shown as the blue portion of the smaller pie charts, while the larger pie chart shows the fraction of each subclade contributing to this frequency.  Three charts are shown from three different studies.  See Table 3 for a detailed account of the frequency and distribution of the E subclades.

Detailed Accounts for the Subclades of Y-DNA Haplogroup E

The following section provides information on individual Haplogroup E subclades.  This is gathered from the current literature and will be updated as new studies are released and progress is made in this field.  See Figures 11-13 for a representation of the frequency and distribution of the subclades of Haplogroup E. 

E1a. M132

The E1a subclade (SNP M132) has been scattered over Africa, but usually at low frequencies.  Reports suggest that it is highest in Sub-Saharan Africa, though it has been observed in North Africa (Morocco, Algeria, Egypt).  Its highest level has been reported in Mali (34%) and it has a reported TMRCA of ~14kya.

E1a1. M44

The E1a1 (M44) subclade has been detected in the Fulbe population in Cameroon at 53%.  2-5% levels have been observed in Mali and Sudan, but no other countries or populations have been reported to carry the subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1. P2

The E1b1 (P2) subclade is a large and prevalent branch in Haplogroup E.  It most likely originated in East Africa, perhaps 28kya.  The P2* paragroup appears to be restricted to East Africa, principally Ethiopia (10-15%).  Levels elsewhere are ≤5%.  Most studies have addressed the descendant subclades of E1b1 (e.g. E1b1a (M2), E1b1b1 (M35), etc.).  

E1b1a.  M2

E1b1a (M2) is prevalent throughout Africa, except in North Africa. It peaks in West Africa and is associated with the spread of agriculture or new farming methods by the Bantu to Sub-Saharan and Equatorial Africa regions, where it is especially prevalent.   The Bantu migration and dispersal of E1b1a (M2) appears to have reached as far as South Africa.  This Bantu-mediated spread is relatively recent, having taken place within the last ~3,000 years, although the TMRCA points to an ancient origin at 19kya. 

The observed distribution also suggests that the Bantu (E1b1a/M2) migration did not go north of Kenya and moved south along the Southeastern (Swahili) Coast of Africa.  A barrier of the Cushitic language and culture in Northeast Africa has been proposed to explain the limited introgression of the Bantu E1b1a/M2 subclade in these northern regions.  Nevertheless, the spread of the Bantu is fairly extensive and their linguistic family (Niger-Congo) is the most widely dispersed language family in Africa, supporting the Y-chromosome evidence for the spread of the Bantu people through wide portions of Africa and providing a strong example of correlation between language and phylogeny in Africa.  On the other hand, the widespread distribution of the Bantu and the E1b1a/M2 subclade reduces geographic structure (the correlation between Y-chromosome phylogeny and a specific geographic location), thus acting to somewhat homogenize the populations.  

The E1b1a/M2 subclade in Oman may be due to the recent slave trade with Africa, but since M2 is highest in the West (e.g. Senegal) and drops off significantly to the North and East, it has been speculated that these slaves must have come from fairly far away in Central or West Africa. 

E1b1a/M2 is also the most common Y haplogroup in African Americans (50-75%), a result of slave trade from Sub-Saharan Africa.  In South America, the estimates are ~8% for the M2 subclade.  Subclades of E1b1a (defined by SNPs U181, M291, U174, U290, U175) have been examined only in African and European American populations, where they are present in the former and absent in the latter.  U174 or E1b1a7a is the most prevalent at about 24% of African Americans. 

E1b1a1. M58

Defined by the M58 SNP, the E1b1a1 subclade appears to be a minor subclade.  It has been found in South Africa, the Rimaibe in Burkina Faso and Bantu (~5%) and the Hutu in Rwanda (10%).  A low level, ~1-2%, of this subclade has also been detected in studies of African Americans in the United States.

E1b1a2. M116.2

The E1b1a2 (M116.2) subclade has not been studied extensively. It appears with low frequency in Mali (2.3%).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a3. M149

The E1b1a3 (M149) subclade has not been studied extensively. It appears with low frequency in South Africa (~2%).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a4. M154

The E1b1a4 (M154) subclade has been found within Cameroon (9%) and South African (~4%) regions.  This distribution likely reflects dispersal by Bantu farmers on the route from Central West Africa to South Africa.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a5. M155

The E1b1a5 (M155) subclade has not been studied extensively. It appears with low frequency in Mali (2.3%).  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a6. M10

The E1b1a6 subclade defined by SNP M10, is concentrated in Central Africa: Cameroon (7-11%), Central African Republic (25% Lissongo) and Tanzania (2%).  It has not been found elsewhere in a limited set of studies.  Check this site regularly for updates on this subclade as new information will be posted as studies become available. 

E1b1a7. M191

The E1b1a7 (M191) subclade is closely associated with the phylogeography of the precursor E1b1a (M2) and the Bantu population that is responsible for its dissemination.  It has a frequency peak (15-45%) in a belt through Sub-Saharan and Equatorial Africa.  Modest amounts have been detected in the Arabian Peninsula (3-6%).  It is also represented by the high frequency of its descendants, e.g. subclade E1b1a7a (U174) in African Americans.  

E1b1a7a. U174

E1b1a7a (U174) is the most prevalent subclade in African Americans, occurring at a frequency of 24%.  As of this writing, it has yet to be studied or reported outside of this population. Check this site regularly for updates on this subclade as new information will be posted as studies become available.   

E1b1a8. U175

The E1b1a8 (U175) subclade is an abundant branch in African American populations.  Outside of its descendants, E1b1a8a1 (U290) and E1b1a8a1a (U181), the paragroup E1b1a8 accounts for ~8.5% of African American Y-chromosomes.  With these descendants, this branch encompasses >23% of African American Y-chromosomes.  It has yet to be studied or reported outside of this population. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a8a. U209

Currently, no information is available for the distribution and frequency of this haplogroup E subclade, although information on descendants (E1b1a8a1 and E1b1a8a1a) is available. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a8a1. U290

E1b1a8a1 (U290) is the second most prevalent subclade in African Americans, occurring at a frequency of 11%.  It has yet to be studied or reported outside of this population. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a8a1a. U181

E1b1a8a1a (U181) is a derivative of the E1b1a8a1 (U290) subclade and is found in 3-4% of African Americans.  It has yet to be studied or reported outside of this population. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a8a2. P59

Currently, no information is available for the distribution and frequency of this haplogroup E subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1a9. P268

Currently, no information is available for the distribution and frequency of this haplogroup E subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b. M215

Ethiopia and Sudan harbor the highest levels (30-40%) of the E1b1b (M215) subclade. The information on the E1b1b (M215) subclade is generally superseded by the information from the descendant lineages.  Based on the profile of its distribution and the degree of STR diversity in this subclade, it is believed to originate in East Africa.  The TMRCA estimate is 20-26kya and by 17kya this subclade had migrated to Northeast Africa.  It may be that the Nile River Valley acted as a migratory corridor for this subclade and some of its important descendants described below.  This also fits with its higher prevalence among Nilo-Saharan language groups versus Afro-Asiatic language groups.     

E1b1b1. M35

This subclade is most frequently reported as the paragroup E1b1b1* or M35* to distinguish it from the observations of the descendant subclades (e.g. E1b1b1a, M78 or E1b1b1b, M81 or E1b1b1c, M123).  The TMRCA for M35* subclade is estimated at 27-29kya. 

The E1b1b1*/M35* subclade can be found in both western and eastern regions of Africa, but clearly has a much higher frequency in East Africa (25-50%).  This trend is opposite to the E1b1a/M2 frequency and distribution.  The limit of E1b1a/M2 in Northeast Africa was suggested to be the result of close-knit cultures of Cushitic language groups, which harbor a large fraction of the E1b1b1/M35 lineage, thus giving an explanation for low E1b1a/M2 and high E1b1b1/M35 frequencies in Northeast Africa.

The M35 predecessors, P2 and M215 are also thought to have an East Africa origin based on STR variation.  M35* and M78 have been found in Europe and the Middle East and may have participated in the demic diffusion of agriculture during the Neolithic Era.  M35* is found in East Africa (e.g. Ethiopia) and is absent in Oman and Egypt, so the M35 descendants in Oman are likely to have more recent origins as evidenced by the presence of the subsequent SNP variations and the E1b1b1/M35 descendant subclades (E1b1b1a, M78 or E1b1b1b, M81 or E1b1b1c, M123).  The STR variation in Egypt is greater than Oman, pointing to an older establishment of M35 in Egypt and supporting the notion that the Levantine corridor through Egypt was the route for the spread of M35 lineages in the Middle East.  The timing for this migration coincides with the Mesolithic Era.  It is found in present day countries of Lebanon (16%), Turkey (11%), Iraq (11%) and surrounding regions.  

An interesting note is that the extent of E1b1b1* (M35*) to the South is near the proposed migration of the M2 subclade through Kenya and that Tanzania has a mixed contribution of both the ‘West M2’ and ‘East M35*’ subclades.  This mixture has a unique chronology in that the introduction of M2 by the Bantu is a recent admixture episode in comparison to a Stone Age origin for the M35* subclade.

In Europe, the E1b1b1*/M35* subclade is more prevalent in the Ashkenazi Jewish population (20%) than the non-Jewish population (6%), possibly indicating a founding role for the E1b1b1*/M35* subclade for the Ashkenazi Jews in Europe. 

n.b. recent studies have identified a new SNP, M293, that accounts for much of the M35* paragroup.  This new subclade, designated E1b1b1f, appears to have a concentration around Tanzania (43%), the country that harbored the highest reported frequency of M35* (37%).  The E1b1b1f/M293 subclade has a TMRCA estimated at 10kya and is associated with a more recent migration (~2kya) and spread of pastoralism (livestock herding) southward to South Africa.  Along with the E1b1a/M2/Bantu, this provides another instance of demic diffusion of new technologies in Africa.

E1b1b1a. M78

The Northeast Africa-based E1b1b1a subclade is defined by SNP M78.  Somalia, Sudan and Egypt are among the present day countries with very high frequencies (60-90%) of the E1b1b1a M78 subclade.  The STR data also support its origin in this area with a TMRCA estimated at 14-23 kya.  The frequency of this subclade drops dramatically in Sub-Saharan Africa.

The E1b1b1a (M78) subclade of Haplogroup E predominates in Europe wherever Haplogroup E is found.   Since this haplogroup is most frequent in East Africa, it is likely connected to Africa via the Middle East and the Levantine corridor through Egypt.  The exit from Africa is estimated within the Mesolithic era.  The route to Europe continued through Anatolia and used the Balkan Peninsula, e.g. Greece, in the expansion of this subclade to the West.  This movement appears to closely parallel (in place and time) those taken by Y-chromosome Haplogroup J.  Together, Haplogroup J and E are believed to have spread agricultural practices during the Neolithic Era to Europe from the Near and Middle East. 

Given the presence of E1b1b1a/M78 in North Africa, it is likely that the migration north also produced a western trek from Ethiopia or Sudan into this area.  There may have also been backflow of this haplogroup into Africa during the Neolithic, again bringing with it new agricultural techniques into Egypt.  

Note that the M78 subclade is the second most frequent Y-DNA lineage in the Balkans (~23%).  There is a moderate geographic structure in that the frequency of E1b1b1a/M78 is higher in the South (Greece, Macedonia, Albania, Serbia) than the North (Croatia, Bosnia).  Low to very low frequencies (<5%) are seen in Iran and Pakistan, mostly in the southern regions of these two countries.  A moderate frequency (6%) has been detected in the Atlantic island group of the Azores (Portugal).

The M78 subclade has been sub-divided into clusters α, β, γ and δ by STR haplotype (microsatellite) analyses and these were recently shown to correlate well with new SNPs that also further subdivided and refined the M78 subclade. (See the E1b1b1a1 derivatives below).  The E1b1b1a* or M78* paragroup, which constitutes roughly 1% of the E1b1b1a M78 lineage, is largely restricted to North Africa and corresponds closely to the β microsatellite cluster that was found there.

E1b1b1a1. V12

E1b1b1a1 (V12) shows the highest frequency in Northeast Africa (e.g. up to 44% in South Egypt and 19% in Sudan).  It may have migrated from Egypt in the North, south to Sudan along the Nile River Valley.  It is not present at levels >5% elsewhere, except ~6% in Basques from France.  TMRCA estimates an origin at 14-15kya.       

E1b1b1a1a. M224

The E1b1b1a1a (M224) subclade has not been studied extensively.  It has been found in Israel among the Yemeni population (5%) and in one individual in West Asia, and appears to be a minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1a1b. V32

The E1b1b1a1b (V32) subclade is a descendant of E1b1b1a1 (V12).  E1b1b1a1b/V32 is highest in Somalia (47-75%), Sudan (52%) and Ethiopia (40%).  All of the V32 chromosomes detected to date fall into the East African M78 g microsatellite cluster, which is associated with Cushitic (Afro-Asiatic) language groups in Somalia, Ethiopia and Kenya.  It has been suggested that the Great Rift Valley acted as a barrier that isolated language and genetic groups in this region.  This subclade is abundant in Somalia, although its STR diversity there is rather low.  These data suggest that the E1b1b1a1b/V32 Somali population was shaped by a relatively recent founder effect.  E1b1b1a1b/V32 was not found in Turkish individuals belonging to the M78 g microsatellite cluster, indicating that the correspondence between the M78 g microsatellite cluster and the E1b1b1a1b/V32 subclade is not perfect.  The g cluster is characterized by a short and unique DYS19 allele (11 repeats), which makes it behave almost like a unique event polymorphism such as a SNP.  The estimates for the TMRCA of this subclade are approximately 4-8kya.
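One common way to put a number on "STR diversity" is haplotype diversity, which is the chance that two haplotypes drawn at random from a sample differ.  The sketch below uses the standard unbiased formula h = n/(n-1) x (1 - sum of squared haplotype frequencies).  It is not necessarily the measure used in the studies summarized here, and both samples are invented for illustration.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Probability that two randomly drawn haplotypes differ (unbiased form)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    frequencies = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in frequencies))


# Hypothetical samples: each tuple is one man's repeat counts at three STRs.
founder_like = [(11, 24, 10)] * 9 + [(11, 25, 10)]   # one haplotype dominates
diverse = [(11 + i, 24, 10) for i in range(10)]       # every haplotype differs

print(round(haplotype_diversity(founder_like), 2))
print(round(haplotype_diversity(diverse), 2))
```

A population that is frequent but scores low on a measure like this, as described for V32 in Somalia, is the pattern expected after a founder effect.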
 

E1b1b1a2. V13

E1b1b1a2 (V13) is highest in the Balkan Peninsula (Albania 32%, Greece 18-45%) and diminishes from here northward.  It overlaps with the previously identified E1b1b1a M78 a cluster, and makes up the majority of E chromosome types in Europe.  This cluster was found primarily in Europe and to a lesser extent the Near East (e.g. 5% Turkey).  It is rarely found outside of Europe.  

For example, the most prevalent E subclade in Crete was defined as the E1b1b1a M78 a cluster.  It is quite likely that Greece was the source of this subclade on Crete and that this subclade is common overall in the Aegean region.  The estimates for the TMRCA of this subclade are 9-11kya outside of Europe (i.e. Near East) and 4-5kya in Europe.  The expansion time for the E1b1b1a2 (V13) subclade in Greece is estimated around 4-9kya, somewhat preceding the estimate for the origin of this subclade, which is due to the use of different mutation rate models.  The estimate for expansion on Crete is 3kya, which coincides with an influx of Mycenaean culture from the Greek mainland during the end of the Bronze Age.  The E1b1b1a2 (V13) most closely follows the route proposed for Y-chromosome haplogroup J-M12 that was part of the late Neolithic introduction of farming and agriculture to Europe and the advances of the ensuing Bronze Age.
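To see why the choice of mutation rate model shifts these dates so much, consider a simplified variance-based age estimate.  Under a stepwise mutation model with a single founder, the average variance in repeat count per STR grows roughly in proportion to the mutation rate multiplied by the number of generations elapsed.  The sketch below is not the method of the cited studies; the sample haplotypes, the 25-year generation length and the two rates are illustrative assumptions chosen only to show the effect.

```python
from statistics import mean, pvariance


def tmrca_years(haplotypes, rate_per_generation, years_per_generation=25):
    """Rough age estimate: mean per-locus variance divided by mutation rate."""
    loci = list(zip(*haplotypes))  # regroup repeat counts by STR locus
    average_variance = mean(pvariance(locus) for locus in loci)
    generations = average_variance / rate_per_generation
    return generations * years_per_generation


# Hypothetical sample: repeat counts at three STRs for five men.
sample = [
    (13, 24, 10),
    (13, 23, 10),
    (14, 24, 10),
    (13, 24, 11),
    (13, 24, 10),
]

# Assumed, illustrative rates per locus per generation (not from this article).
FAST_RATE = 0.002    # of the order of rates observed in father-son pairs
SLOW_RATE = 0.0007   # of the order of slower "evolutionary" rates

print(round(tmrca_years(sample, FAST_RATE)))  # younger estimate
print(round(tmrca_years(sample, SLOW_RATE)))  # older estimate, same data
```

The same STR data give an age nearly three times older with the slower rate, which is how an expansion date can appear to precede an origin date when the two were computed under different models.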

E1b1b1a2a. V27

The E1b1b1a2a (V27) subclade has not been studied extensively. It has been found in one individual and appears to be a very minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1a3. V22

E1b1b1a3 (V22) is at its peak frequency in parts of Northeast Africa (e.g. Egypt 4-20%).  It has significant frequencies in many other locations (Ethiopia 25%, Sudan 23%, Kenya 11%, Morocco 7%).  Like E1b1b1a1 (V12), this subclade may have migrated south from Egypt to Sudan.  It has also been found in Sicily, Turkey, United Arab Emirates and many other locations, making it a fairly far-flung subclade.  The estimates for the TMRCA of this subclade are 9-11kya.

E1b1b1a3a. M148

The E1b1b1a3a (M148) subclade has not been studied extensively. It has been found in one individual in South Asia and appears to be a very minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1a3b. V19

The E1b1b1a3b (V19) subclade has not been studied extensively. It has been found in one individual and appears to be a very minor subclade.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1a4. V65

E1b1b1a4 (V65) is mainly found in Morocco and Libya, i.e. North Africa.  This subclade is made up of the previously identified M78 b microsatellite cluster (and one g-cluster chromosome).  It has likely arisen recently (~4kya) in North or Northeast Africa.  It is found across the Mediterranean Sea, making it another candidate, like the E1b1b1b/M81 subclade (see below), for transit over the Mediterranean Sea into Europe.
 

E1b1b1b. M81

The E1b1b1b (M81) subclade is prominent among Berber populations (65-80%), many of which are concentrated in Northwest Africa.  It is also prominent in Muslim populations, but at a reduced frequency (30-50%).  The prominence of M81 in North Africa among Berber groups also links it with the Afro-Asiatic language groups that are generally restricted to North Africa. 

The E1b1b1b/M81 subclade has a TMRCA of 4-9kya and an expansion time of around 2kya.  It is therefore relatively new, or has perhaps recently emerged from a bottleneck, and has been subject to genetic drift in its isolation (high frequency of one genetic signature, low complexity or variation).  This latter notion fits a model of geographic isolation in Northwest Africa, where the Sahara desert separates this population from the South and the Mediterranean Sea separates it from the North.  The origin of this subclade dates to a ‘wet Sahara’ period that followed the end of the Ice Ages, and the expansion of the population dates to after this period, during the desertification of the Sahara.  This could fit a scenario where the M81 subclade population was isolated after its founding and witnessed little gene flow because of the Sahara desert barrier.

A minor amount is found in Iberia and Sicily, which almost surely arrived from Northwest African sources.  The presence of M81 (and M35 and M78) in the Iberian Peninsula, albeit at much lower frequencies, argues for a limited but tangible connection between North Africa and Iberia.  The variation of Y-chromosome haplotypes also supports this claim (as does the presence of European subclades in NW Africa).  It should be noted that not all Haplogroup E members in Iberia belong to the E1b1b1b/M81 subclade; a fraction belong to the M78 subclade, which likely traveled from mainland Europe.  The presence of the M78 and M81 SNPs in Portugal is probably linked to their presence in the Azores in the Atlantic Ocean.

E1b1b1b1. M107

The E1b1b1b1 (M107) subclade has been detected at a low level (2-3%) in two countries – Mali and Algeria.  It has not been studied extensively.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1b2. M183

Currently, no information is available for the distribution and frequency of this Haplogroup E subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1b2a. M165

The only reported occurrence of the E1b1b1b2a subclade, defined by SNP M165, is in the Middle East (4.4%).  It may be a minor subclade, but has not been studied extensively.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1c. M123

The E1b1b1c subclade (M123) is found at its highest frequency outside of Africa, in the Middle East (12% Oman), Near East (12%, Turkey) and Europe (13%, Italy), although substantial levels (11%) have been reported for Ethiopia.  As E1b1b1c/M123 is found in Turkey (Anatolia), it provides another geographic link and probably migratory route between the Middle East and Europe.  In Anatolia, and Europe in general, E1b1b1c/M123 decreases in the northward direction.  Based on variation of STR sites, divergence and expansion estimates for the E1b1b1c/M123 subclade are in the Mesolithic Era (~11kya), which predates the Neolithic spread of agriculture from the Mid-East and it could indicate high early diversity in the founding population in Anatolia.  Subsequently, this subclade is likely to have participated in the Neolithic diffusion of agriculture to parts of Europe.  Interestingly, it is found abundantly in Ashkenazi and Sephardic Jews (12%). 

E1b1b1c1. M34

The E1b1b1c1 (M34) subclade is found primarily in East Africa and the Middle East.  It appears to be the major derivative of the E1b1b1c (M123) subclade described above.  The E1b1b1c1/M34 subclade likely arose in the Middle East and back migrated into Africa.  In Africa it is highest in Ethiopia (23%).

It was noted at moderate frequencies (5-20%) in Jewish populations (Ashkenazi, Ethiopian, Libyan and Yemeni) in Israel, providing additional evidence for a link between the Middle East and North Africa.   The E1b1b1c1/M34 subclade is also present in Anatolia, Iberia, Sardinia and Crete (frequencies hovering at 5%).
 

E1b1b1c1a. M84

Currently, no information is available for the distribution and frequency of this haplogroup E subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1c1b. M290

The only recorded occurrence of the E1b1b1c1b (M290) subclade is in a Palestinian population (5%).  This subclade has not been studied extensively.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.
 

E1b1b1d. M281

To date, the E1b1b1d (M281) subclade has only been observed in Ethiopia at a minor frequency (3%).  Since this subclade has not been studied extensively, check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b1b1e. V6

This somewhat rare subclade, E1b1b1e (V6), has only been observed in East Africa, with the most appreciable levels seen in Ethiopia (4-17%).  Kenya and Somalia also harbor a moderate frequency (5%) of this subclade.

E1b1c. M329

The E1b1c (M329) subclade was only observed in Ethiopia and Qatar at a minor frequency (1-3%).  Since this subclade has not been studied extensively, check this site regularly for updates on this subclade as new information will be posted as studies become available.

E1b2. P75

Currently, no information is available for the distribution and frequency of this Haplogroup E subclade. Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E2. M75

The E2 (M75) subclade is generally limited to the Central West and South African regions.  Peak levels appear in Burkina Faso (11%) and it is associated with the Bantu population.  The Bantu population is responsible for a significant dispersal of agricultural practices in Africa and has likely contributed genetically to Pygmy and Khoisan-speaking populations in many parts of Africa.  E2/M75 has therefore had a role, though minor in comparison to E1, in demic diffusions and admixture in several populations on this continent.

E2a. M41

The E2a (M41) subclade can be found in East Africa in the region delimited by Ethiopia, Sudan, Kenya, Tanzania, the Democratic Republic of Congo and Rwanda.  The highest levels have been reported in the Democratic Republic of Congo (~50%), while much lower levels (2-17%) are reported in the other aforementioned countries.

E2b. M54

The E2b (M54) subclade is widely distributed at low levels (2-6%).  Present day countries with E2b/M54 populations include Mali, Benin, Rwanda, Burkina Faso, Senegal, Cameroon, Democratic Republic of Congo, Uganda and Namibia.  One study has found an increased frequency in some South African populations (Zulu and Xhosa 21-28%).  A small fraction of these chromosomes have been detected in the Middle East: Oman (2%) and Qatar (3%).  Approximately 3% of African American men carry this subclade.
 

E2b1. M85

The E2b1 (M85) subclade has been detected at high levels in Burkina Faso (Rimaibe 27%) and Cameroon (Daba 22%).  Lower levels are present in South Africa (!Kung and Khoe ~5%).  The connection between Cameroon and South Africa is reminiscent of the Bantu diaspora.  The E2b1/M85 subclade has not been studied extensively and little other information is available regarding its frequency and distribution.  Check this site regularly for updates on this subclade as new information will be posted as studies become available.

E2b1a. M200

Currently, the only reported population with the E2b1a (M200) subclade is the Mbuti Pygmies in the Democratic Republic of Congo (25%).  Since this subclade has not been studied extensively, check this site regularly for updates on this subclade as new information will be posted as studies become available.
 

Modal Haplotypes Associated with Y-DNA Haplogroup E

Your unique set of Y-DNA STR markers obtained through the Y-DNA STR test is referred to as your "Haplotype".  This is not to be confused with your "Haplogroup" which is determined by testing the SNP markers in your Y-DNA through Y-DNA SNP backbone and subclade testing.

When the Y-DNA STR markers are tested for large groups of people from around the world, the haplotypes which occur with the highest frequencies within certain populations are called "Modal Haplotypes". 
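As a minimal sketch of this definition, the modal haplotype of a sample can be found by counting how often each complete haplotype occurs and taking the most frequent one.  The marker order and the repeat counts below are invented for illustration and do not come from Table 2.

```python
from collections import Counter

# Assumed marker order for each tuple (illustrative only).
MARKERS = ("DYS19", "DYS390", "DYS391")


def modal_haplotype(haplotypes):
    """Return the most frequent haplotype and its share of the sample.

    If several haplotypes tie, the one seen first in the input is returned.
    """
    counts = Counter(haplotypes)
    haplotype, count = counts.most_common(1)[0]
    return haplotype, count / len(haplotypes)


# Hypothetical population sample: one tuple of repeat counts per man.
population = [
    (13, 24, 10),
    (13, 24, 10),
    (14, 24, 10),
    (13, 23, 10),
    (13, 24, 10),
    (13, 24, 11),
]

modal, share = modal_haplotype(population)
print(dict(zip(MARKERS, modal)), share)
# {'DYS19': 13, 'DYS390': 24, 'DYS391': 10} 0.5
```

Note that the modal haplotype here is carried by only half of the sample; the others differ from it by a single repeat at one marker.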

For example, the M78 subclade has been classified into four clusters (a, b, g, and d) based on the STR or microsatellite clustering pattern, and three of these have strong ties to location (see Figure 12).  V13 and M78a are primarily found in Europe, V65 and M78b primarily in North Africa, and V32 and M78g in East Africa.  The modal haplotypes listed from Tunisia are not confined to Haplogroup E and therefore have influence from other Y-chromosome haplogroups.

Confirmation of haplogroup assignment is always made by SNP testing.  Conversely, haplogroup assignment does not indicate that you will have the modal haplotype, recalling the fact that STRs are rapidly changing markers.  Table 2 provides a list of modal haplotypes associated with Haplogroup E.
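A simple way to express how far a tested haplotype sits from a modal haplotype is to add up the repeat differences marker by marker.  The sketch below assumes each repeat difference counts as one step, which is a simplification, and the marker values are hypothetical.

```python
def step_distance(haplotype, modal):
    """Sum of absolute repeat differences across the shared STR markers."""
    if haplotype.keys() != modal.keys():
        raise ValueError("both haplotypes must list the same markers")
    return sum(abs(haplotype[marker] - modal[marker]) for marker in modal)


# Hypothetical values, not taken from Table 2.
modal = {"DYS19": 13, "DYS390": 24, "DYS391": 11}
tested = {"DYS19": 13, "DYS390": 25, "DYS391": 10}

print(step_distance(tested, modal))  # 2
```

A small distance only suggests a haplogroup.  Because STRs change quickly, unrelated lineages can converge on similar values, which is why the SNP test remains the confirmation.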

Table 2.  Modal Haplotypes associated with Haplogroup E
Sources: Arredi et al. 2004, Am. J. Hum. Genet.; Cinnioglu et al. 2004, Hum. Genet.; Cruciani et al. 2007, Mol. Biol. Evol.; Pereira et al. 2002, Ann. Hum. Genet.; Sanchez et al. 2005, Eur. J. Hum. Genet.; Zalloua et al. 2008, Am. J. Hum. Genet.; http://www.yhrd.org/


Table 3

Resources/Bibliography

Public: Full article PDF available
1. Alonso S, Flores C, Cabrera V, Alonso A, Martín P, Albarrán C, Izagirre N, de la Rúa C, García O. The place of the Basques in the European Y-chromosome diversity landscape. Eur J Hum Genet. 2005 Dec;13(12):1293-302. PMID: 16094307

2. Al-Zahery N, Semino O, Benuzzi G, Magri C, Passarino G, Torroni A, Santachiara-Benerecetti AS. Y-chromosome and mtDNA polymorphisms in Iraq, a crossroad of the early human dispersal and of post-Neolithic migrations. Mol Phylogenet Evol. 2003 Sep;28(3):458-72. PMID: 12927131

3. Arredi B, Poloni ES, Paracchini S, Zerjal T, Fathallah DM, Makrelouf M, Pascali VL, Novelletto A, Tyler-Smith C. A predominantly neolithic origin for Y-chromosomal DNA variation in North Africa. Am J Hum Genet. 2004 Aug;75(2):338-45. Epub 2004 Jun 16. PMID: 15202071

4. Balanovsky O, Rootsi S, Pshenichnov A, Kivisild T, Churnosov M, Evseeva I, Pocheshkhova E, Boldyreva M, Yankovsky N, Balanovska E, Villems R. Two sources of the Russian patrilineal heritage in their Eurasian context. Am J Hum Genet. 2008 Jan;82(1):236-50.  PMID: 18179905

5. Behar DM, Garrigan D, Kaplan ME, Mobasher Z, Rosengarten D, Karafet TM, Quintana-Murci L, Ostrer H, Skorecki K, Hammer MF. Contrasting patterns of Y chromosome variation in Ashkenazi Jewish and host non-Jewish European populations. Hum Genet. 2004 Mar;114(4):354-65. Epub 2004 Jan 22. PMID: 14740294

6. Bortolini MC, Salzano FM, Thomas MG, Stuart S, Nasanen SP, Bau CH, Hutz MH, Layrisse Z, Petzl-Erler ML, Tsuneto LT, Hill K, Hurtado AM, Castro-de-Guerra D, Torres MM, Groot H, Michalski R, Nymadawa P, Bedoya G, Bradman N, Labuda D, Ruiz-Linares A. Y-chromosome evidence for differing ancient demographic histories in the Americas. Am J Hum Genet. 2003 Sep;73(3):524-39. Epub 2003 Jul 28. PMID: 12900798

7. Bosch E, Calafell F, González-Neira A, Flaiz C, Mateu E, Scheil HG, Huckenbeck W, Efremovska L, Mikerezi I, Xirotiris N, Grasa C, Schmidt H, Comas D. Paternal and maternal lineages in the Balkans show a homogeneous landscape over linguistic barriers, except for the isolated Aromuns. Ann Hum Genet. 2006 Jul;70(Pt 4):459-87. PMID: 16759179

8. Bosch E, Calafell F, Comas D, Oefner PJ, Underhill PA, Bertranpetit J. High-resolution analysis of human Y-chromosome variation shows a sharp discontinuity and limited gene flow between northwestern Africa and the Iberian Peninsula.  Am J Hum Genet. 2001 Apr;68(4):1019-29. Epub 2001 Mar 14. PMID: 11254456

9. Capelli C, Redhead N, Abernethy JK, Gratrix F, Wilson JF, Moen T, Hervig T, Richards M, Stumpf MP, Underhill PA, Bradshaw P, Shaha A, Thomas MG, Bradman N, Goldstein DB.  A Y chromosome census of the British Isles. Curr Biol. 2003 May 27;13(11):979-84. PMID: 12781138

10. Capelli C, Redhead N, Romano V, Calì F, Lefranc G, Delague V, Megarbane A, Felice AE, Pascali VL, Neophytou PI, Poulli Z, Novelletto A, Malaspina P, Terrenato L, Berebbi A, Fellous M, Thomas MG, Goldstein DB. Population structure in the Mediterranean basin: a Y chromosome perspective. Ann Hum Genet. 2006 Mar;70(Pt2):207-25. PMID: 16626331

11. Cinnioğlu C, King R, Kivisild T, Kalfoğlu E, Atasoy S, Cavalleri GL, Lillie AS, Roseman CC, Lin AA, Prince K, Oefner PJ, Shen P, Semino O, Cavalli-Sforza LL, Underhill PA. Excavating Y-chromosome haplotype strata in Anatolia. Hum Genet. 2004 Jan;114(2):127-48. Epub 2003 Oct 29. PMID: 14586639

12. Cruciani F, Santolamazza P, Shen P, Macaulay V, Moral P, Olckers A, Modiano D, Holmes S, Destro-Bisol G, Coia V, Wallace DC, Oefner PJ, Torroni A, Cavalli-Sforza LL, Scozzari R, Underhill PA. A back migration from Asia to sub-Saharan Africa is supported by high-resolution analysis of human Y-chromosome haplotypes. Am J Hum Genet. 2002 May;70(5):1197-214. Epub 2002 Mar 21. PMID: 11910562

13. Cruciani F, La Fratta R, Santolamazza P, Sellitto D, Pascone R, Moral P, Watson E, Guida V, Colomb EB, Zaharova B, Lavinha J, Vona G, Aman R, Cali F, Akar N, Richards M, Torroni A, Novelletto A, Scozzari R. Phylogeographic analysis of haplogroup E3b (E-M215) y chromosomes reveals multiple migratory events within and out of Africa. Am J Hum Genet. 2004 May;74(5):1014-22. Epub 2004 Mar 24. PMID: 15042509

14. Cruciani F, La Fratta R, Torroni A, Underhill PA, Scozzari R.  Molecular dissection of the Y chromosome haplogroup E-M78 (E3b1a): a posteriori evaluation of a microsatellite-network-based approach through six new biallelic markers. Hum Mutat. 2006 Aug;27(8):831-2. PMID: 16835895

15. Cruciani F, La Fratta R, Trombetta B, Santolamazza P, Sellitto D, Colomb EB, Dugoujon JM, Crivellaro F, Benincasa T, Pascone R, Moral P, Watson E, Melegh B, Barbujani G, Fuselli S, Vona G, Zagradisnik B, Assum G, Brdicka R, Kozlov AI, Efremov GD, Coppa A, Novelletto A, Scozzari R. Tracing past human male movements in northern/eastern Africa and western Eurasia: new clues from Y-chromosomal haplogroups E-M78 and J-M12. Mol Biol Evol. 2007 Jun;24(6):1300-11. Epub 2007 Mar 10. PMID: 17351267

16. Flores C, Maca-Meyer N, González AM, Oefner PJ, Shen P, Pérez JA, Rojas A, Larruga JM, Underhill PA. Reduced genetic structure of the Iberian peninsula revealed by Y-chromosome analysis: implications for population demography. Eur J Hum Genet. 2004 Oct;12(10):855-63. PMID: 15280900

17. Luis JR, Rowold DJ, Regueiro M, Caeiro B, Cinnioğlu C, Roseman C, Underhill PA, Cavalli-Sforza LL, Herrera RJ. The Levant versus the Horn of Africa: evidence for bidirectional corridors of human migrations. Am J Hum Genet. 2004 Mar;74(3):532-44. Epub 2004 Feb 17. Erratum in: Am J Hum Genet. 2004 Apr;74(4):788. PMID: 14973781

18. Martinez L, Mirabal S, Luis JR, Herrera RJ. Middle Eastern and European mtDNA lineages characterize populations from eastern Crete. Am J Phys Anthropol. 2008 May 23. [Epub ahead of print] PMID: 18500747

19. Nasidze I, Quinque D, Ozturk M, Bendukidze N, Stoneking M. MtDNA and Y-chromosome variation in Kurdish groups. Ann Hum Genet. 2005 Jul;69(Pt 4):401-12. PMID: 15996169

20. Paracchini S, Pearce CL, Kolonel LN, Altshuler D, Henderson BE, Tyler-Smith C. A Y chromosomal influence on prostate cancer risk: the multi-ethnic cohort study. J Med Genet. 2003 Nov;40(11):815-9. PMID: 14627670

21. Pericić M, Lauc LB, Klarić IM, Rootsi S, Janićijevic B, Rudan I, Terzić R, Colak I, Kvesić A, Popović D, Sijacki A, Behluli I, Dordevic D, Efremovska L, Bajec DD, Stefanović BD, Villems R, Rudan P. High-resolution phylogenetic analysis of southeastern Europe traces major episodes of paternal gene flow among Slavic populations. Mol Biol Evol. 2005 Oct;22(10):1964-75. Epub 2005 Jun 8. PMID: 15944443

22. Regueiro M, Cadenas AM, Gayden T, Underhill PA, Herrera RJ. Iran: tricontinental nexus for Y-chromosome driven migration. Hum Hered. 2006;61(3):132-43. Epub 2006 Jun 12. PMID: 16770078

23. Rosa A, Ornelas C, Jobling MA, Brehm A, Villems R. Y-chromosomal diversity in the population of Guinea-Bissau: a multiethnic perspective. BMC Evol Biol. 2007 Jul 27;7:124. PMID: 17662131

24. Sanchez JJ, Hallenberg C, Børsting C, Hernandez A, Morling N. High frequencies of Y chromosome lineages characterized by E3b1, DYS19-11, DYS392-12 in Somali males. Eur J Hum Genet. 2005 Jul;13(7):856-66. PMID: 15756297

25. Semino O, Magri C, Benuzzi G, Lin AA, Al-Zahery N, Battaglia V, Maccioni L, Triantaphyllidis C, Shen P, Oefner PJ, Zhivotovsky LA, King R, Torroni A, Cavalli-Sforza LL, Underhill PA, Santachiara-Benerecetti AS. Origin, diffusion, and differentiation of Y-chromosome haplogroups E and J: inferences on the neolithization of Europe and later migratory events in the Mediterranean area. Am J Hum Genet. 2004 May;74(5):1023-34. Epub 2004 Apr 6. PMID: 15069642

26. Semino O, Santachiara-Benerecetti AS, Falaschi F, Cavalli-Sforza LL, Underhill PA. Ethiopians and Khoisan share the deepest clades of the human Y-chromosome phylogeny. Am J Hum Genet. 2002 Jan;70(1):265-8. Epub 2001 Nov 20. PMID: 11719903

27. Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CE, Lin AA, Mitra M, Sil SK, Ramesh A, Usha Rani MV, Thakur CM, Cavalli-Sforza LL, Majumder PP, Underhill PA. Polarity and temporality of high-resolution y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am J Hum Genet. 2006 Feb;78(2):202-21. Epub 2005 Dec 16. PMID: 16400607

28. Shen P, Lavi T, Kivisild T, Chou V, Sengun D, Gefel D, Shpirer I, Woolf E, Hillel J, Feldman MW, Oefner PJ. Reconstruction of patrilineages and matrilineages of Samaritans and other Israeli populations from Y-chromosome and mitochondrial DNA sequence variation. Hum Mutat. 2004 Sep;24(3):248-60. PMID: 15300852

29. Sims LM, Garvey D, Ballantyne J. Sub-populations within the major European and African derived haplogroups R1b3 and E3a are differentiated by previously phylogenetically undefined Y-SNPs. Hum Mutat. 2007 Jan;28(1):97. PMID: 17154278

30. Underhill PA, Shen P, Lin AA, Jin L, Passarino G, Yang WH, Kauffman E, Bonné-Tamir B, Bertranpetit J, Francalacci P, Ibrahim M, Jenkins T, Kidd JR, Mehdi SQ, Seielstad MT, Wells RS, Piazza A, Davis RW, Feldman MW, Cavalli-Sforza LL, Oefner PJ. Y chromosome sequence variation and the history of human populations. Nat Genet. 2000 Nov;26(3):358-61.  PMID: 11062480

31. Wells RS, Yuldasheva N, Ruzibakiev R, Underhill PA, Evseeva I, Blue-Smith J, Jin L, Su B, Pitchappan R, Shanmugalakshmi S, Balakrishnan K, Read M, Pearson NM, Zerjal T, Webster MT, Zholoshvili I, Jamarjashvili E, Gambarov S, Nikbin B, Dostiev A, Aknazarov O, Zalloua P, Tsoy I, Kitaev M, Mirrakhimov M, Chariev A, Bodmer WF. The Eurasian heartland: a continental perspective on Y-chromosome diversity. Proc Natl Acad Sci U S A. 2001 Aug 28;98(18):10244-9. PMID: 11526236

32. Wood ET, Stover DA, Ehret C, Destro-Bisol G, Spedini G, McLeod H, Louie L, Bamshad M, Strassmann BI, Soodyall H, Hammer MF. Contrasting patterns of Y chromosome and mtDNA variation in Africa: evidence for sex-biased demographic processes. Eur J Hum Genet. 2005 Jul;13(7):867-76. PMID: 15856073

33. Zalloua PA, Xue Y, Khalife J, Makhoul N, Debiane L, Platt DE, Royyuru AK, Herrera RJ, Hernanz DF, Blue-Smith J, Wells RS, Comas D, Bertranpetit J, Tyler-Smith C; Genographic Consortium. Y-chromosomal diversity in Lebanon is structured by recent historical events. Am J Hum Genet. 2008 Apr;82(4):873-82. Epub 2008 Mar 27. PMID: 18374297

Non-Public: Abstract-only available
1. Beleza S, Gusmão L, Amorim A, Carracedo A, Salas A. The genetic legacy of western Bantu migrations. Hum Genet. 2005 Aug;117(4):366-75. Epub 2005 Jun 1. PMID: 15928903

2. Cadenas AM, Zhivotovsky LA, Cavalli-Sforza LL, Underhill PA, Herrera RJ. Y-chromosome diversity characterizes the Gulf of Oman. Eur J Hum Genet. 2008 Mar;16(3):374-86. Epub 2007 Oct 10. PMID: 17928816

3. Capelli C, Brisighelli F, Scarnicci F, Arredi B, Caglia' A, Vetrugno G, Tofanelli S, Onofri V, Tagliabracci A, Paoli G, Pascali VL. Y chromosome genetic variation in the Italian peninsula is clinal and supports an admixture model for the Mesolithic-Neolithic encounter. Mol Phylogenet Evol. 2007 Jul;44(1):228-39. Epub 2006 Dec 13. PMID: 17275346

4. Cruciani F, Trombetta B, Novelletto A, Scozzari R. Recurrent mutation in SNPs within Y chromosome E3b (E-M215) haplogroup: a rebuttal. Am J Hum Biol. 2008 Sep-Oct;20(5):614-6. PMID: 18449920

5. Csányi B, Bogácsi-Szabó E, Tömöry G, Czibula A, Priskin K, Csõsz A, Mende B, Langó P, Csete K, Zsolnai A, Conant EK, Downes CS, Raskó I. Y-chromosome analysis of ancient Hungarian and two modern Hungarian-speaking populations from the Carpathian Basin. Ann Hum Genet. 2008 Jul;72(Pt 4):519-34. Epub 2008 Mar 27. PMID: 18373723

6. Fernandes AT, Rosa A, Gonçalves R, Jesus J, Brehm A. The Y-chromosome short tandem repeats variation within haplogroup E3b: evidence of recurrent mutation in SNP. Am J Hum Biol. 2008 Mar-Apr;20(2):185-90. PMID: 17990327

7. Gonçalves R, Freitas A, Branco M, Rosa A, Fernandes AT, Zhivotovsky LA, Underhill PA, Kivisild T, Brehm A. Y-chromosome lineages from Portugal, Madeira and Açores record elements of Sephardim and Berber ancestry. Ann Hum Genet. 2005 Jul;69(Pt 4):443-54. PMID: 15996172

8. Hammer MF, Chamberlain VF, Kearney VF, Stover D, Zhang G, Karafet T, Walsh B, Redd AJ. Population structure of Y chromosome SNP haplogroups in the United States and forensic implications for constructing Y chromosome STR databases. Forensic Sci Int. 2006 Dec 1;164(1):45-55. Epub 2005 Dec 5. PMID: 16337103

9. Hassan HY, Underhill PA, Cavalli-Sforza LL, Ibrahim ME. Y-chromosome variation among Sudanese: Restricted gene flow, concordance with language, geography, and history. Am J Phys Anthropol. 2008 Jul 10. [Epub ahead of print] PMID: 18618658

10. Henn BM, Gignoux C, Lin AA, Oefner PJ, Shen P, Scozzari R, Cruciani F, Tishkoff SA, Mountain JL, Underhill PA. Y-chromosomal evidence of a pastoralist migration through Tanzania to southern Africa. Proc Natl Acad Sci U S A. 2008 Aug 5;105(31):10693-8. Epub 2008 Aug 4. PMID: 18678889

11. Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Hammer MF. New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res. 2008 May;18(5):830-8. Epub 2008 Apr 2. PMID: 18385274

12. King RJ, Ozcan SS, Carter T, Kalfoğlu E, Atasoy S, Triantaphyllidis C, Kouvatsi A, Lin AA, Chow CE, Zhivotovsky LA, Michalodimitrakis M, Underhill PA. Differential Y-chromosome Anatolian influences on the Greek and Cretan Neolithic. Ann Hum Genet. 2008 Mar;72(Pt 2):205-14. PMID: 18269686

13. Nasidze I, Ling EY, Quinque D, Dupanloup I, Cordaux R, Rychkov S, Naumova O, Zhukova O, Sarraf-Zadegan N, Naderi GA, Asgary S, Sardas S, Farhud DD, Sarkisian T, Asadov C, Kerimov A, Stoneking M. Mitochondrial DNA and Y-chromosome variation in the caucasus. Ann Hum Genet. 2004 May;68(Pt 3):205-21. PMID: 15180701

14. Neto D, Montiel R, Bettencourt C, Santos C, Prata MJ, Lima M. The African contribution to the present-day population of the Azores Islands (Portugal): analysis of the Y chromosome haplogroup E. Am J Hum Biol. 2007 Nov-Dec;19(6):854-60. PMID: 17712788

15. Pereira L, Gusmão L, Alves C, Amorim A, Prata MJ. Bantu and European Y-lineages in Sub-Saharan Africa. Ann Hum Genet. 2002 Nov;66(Pt 5-6):369-78. PMID: 12485470

16. Robino C, Crobu F, Di Gaetano C, Bekada A, Benhamamouch S, Cerutti N, Piazza A, Inturri S, Torre C. Analysis of Y-chromosomal SNP haplogroups and STR haplotypes in an Algerian population sample. Int J Legal Med. 2008 May;122(3):251-5. Epub 2007 Oct 2. PMID: 17909833


Key Investigators:
Luigi Luca Cavalli-Sforza
 Stanford University, Stanford, California, USA
Cristian Capelli
 Universita’ Cattolica del Sacro Cuore, Rome, Italy
Fulvio Cruciani
 Università di Roma, ‘La Sapienza’, Rome, Italy
Michael F. Hammer
 University of Arizona, Tucson, Arizona, USA
Rene J. Herrera
 Florida International University, Miami, Florida, USA
Mark A. Jobling
 University of Leicester, Leicester, United Kingdom
Pavao Rudan
 Institute for Anthropological Research, Zagreb, Croatia
Rosaria Scozzari
 Università di Roma, ‘La Sapienza’, Rome, Italy 
Ornella Semino
 Università di Pavia, Pavia, Italy
Mark Stoneking
Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Chris Tyler-Smith
 The Wellcome Trust Sanger Institute, Hinxton, UK
Peter A. Underhill
 Stanford University, Stanford, California, USA

Need to cite this tutorial in your essay, paper or website? Use the following format:

Learn about Y-DNA Haplogroup E. Genebase Tutorials. Retrieved July 26, 2014, from http://www.genebase.com/learning/article/2
Y-DNA STR markers mutate at a rate of approximately one mutation every 20 generations. The relatively rapid mutation rate of STR markers compared to the slow mutation rate of SNP markers makes STR...
A number of STR markers can be tested on the Y-DNA. The more markers that are tested, the more discriminating the matches when comparing to other individuals.