Asian Ancestry based on Studies of Y-DNA Variation: Part 1 Early origins – roots from Africa and emergence in East Asia
by Genebase Users

Asian Ancestry based on Studies of Y-chromosome Variation: an unfolding chapter of modern humans.

This article is Part 1 of a 3 part series.
Part 1.  Early origins – the emergence of modern humans in East Asia  <<== You are here
Part 2.  Additional migrations and reach.  Travels south to Oceania, north to the Americas and in between.
Part 3.  Recent demographics and ancestry of the male East Asians – Empires and Dynasties

“The journey of a thousand miles starts with one step”  Lao-Tzu

This article focuses on the paternal ancestry of populations within Asia.  These populations inhabit the present-day countries of the People’s Republic of China (China), Russia (principally Siberia), Mongolia, Korea, Japan, Vietnam, Cambodia, Thailand, Laos, Myanmar, Malaysia, Indonesia, Singapore, Brunei and the Philippines.  For brevity, we will refer to this area as East Asia.  We will concentrate on populations in East Asia, as opposed to Central Asia (CAS, e.g. Kazakhstan) or South Asia (SAS, e.g. India).  However, populations have migrated through and beyond the East Asian regions designated above, and the boundaries imposed are somewhat arbitrary.  We will therefore look at neighboring regions where they are informative to our discussion of ancestry in East Asia, since they can provide important information about the origins and movements of Asian male populations.

Figure 1.  Map of regions covered in this article.  Division of Asia into sections is shown by color-coding and particular geographic features that are discussed are labeled. 

Our discussion will cover human history that dates back more than 65,000 years (65kya) and encompasses a large number of major empires and events in Asian history.  For example, Genghis Khan (c. 1162-1227), the Mongol leader of what has been considered the largest empire in human history, has left his genetic stamp on a vast Asian territory, based on evidence of a Y-chromosome legacy that ~16 million men share with this relatively recent conqueror.  He is possibly the single largest contributor to the male gene pool on earth today.  The chronicle of male ancestry in East Asia is also a complex story that has many mysteries and is still yielding new and fundamental information about the evolution of humans on earth.

The leading theories and geographical landscape for the settlement of East Asia

Several competing theories have been proposed to account for the origin of modern humans (Homo sapiens sapiens) in East Asia.  The debate has moved past the question of whether modern humans originated in Africa, and past the question of whether there is a trace of archaic humans (e.g. Homo erectus or Homo floresiensis) in the modern male gene pool (across the world there is as yet no concrete evidence of this).  It now centers on how the first human males arrived in East Asia and spread through this part of the world.  Part of the complexity in trying to understand this picture lies in untangling the sequence of events, which is difficult when new events cover, erase or merge with older events.

Two general scenarios have been proposed.  One theory states that the first humans to inhabit East Asia followed a southern route along the Indian Ocean coast (the ‘coastal express’), entered first in Southern East Asia (SEAS) and then moved to the north (Northern East Asia or NEAS) (see Figure 2).  Another theory states that there were two early routes into East Asia, one northern and one southern, which gives this model its name: the ‘pincer movement’.  Support for the ‘pincer model’ is based on differences noted among populations along the north-south axis in East Asia and on the presence of haplogroups on each side with deep or ancient roots that separate these lineages.  Later, we will discuss the specific Y-chromosome haplogroups participating in these landmark events.


Figure 2.  Proposed early migrations of male Homo sapiens sapiens populations into East Asia.  The map displays probable routes of migration for modern humans into East Asia and their initial dispersal across this territory.  The scheme presented in this map shows a modified ‘pincer model’ with two main migrations into East Asia: the southern route emerging from Africa, which took place first, and a northern route that took place later and also deposited humans initially in SEAS.  The southern ‘coastal express’ route was taken by populations with Haplogroups C and D, who were the earliest settlers of East Asia.  Northern migrations of populations bearing the descendants of Haplogroup K (M, N, O, P and Q) proceeded to the south and north.

Dating calculations, although not precise, generally agree that the first modern humans arrived in East Asia at least 60kya.  Note that the colonization of Asia predates the initial colonization of Europe by 10-20 thousand years.  Europe was colonized by modern humans between 40 and 50kya, and this later date was probably related to the colder climate to the north.  There have been archaeological findings that support the presence of humans around 100kya in Australia and China, but the fossil remains are attributed to archaic hominid species, H. erectus or H. sapiens, and not to modern humans, H. s. sapiens, who only appear around 40kya in the fossil record.  The hominid fossil record in East Asia between these times is remarkably sparse, especially in light of the abundance of finds before and after this gap of ~60 thousand years.  It has been suggested that modern humans did not successfully colonize East Asia before the eruption of the Mt. Toba supervolcano in Sumatra about 75kya.  There is no conclusive proof that this cataclysm contributed to the vanishing of archaic humans in East Asia, but the notion is not far-fetched.

With the possible exception of Papua New Guinea, the founding populations of modern humans in East Asia were primarily hunter-gatherers, and agricultural practices did not begin until the Neolithic Era, some 10kya. The development of tuber and rice cultivation certainly helped to fuel population expansions that drove many of the later migrations across East Asia (see Parts 2 and 3).  Despite the advantages and success of the farming lifestyle, hunter-gatherer cultures have persisted in many regions of East Asia. Unfortunately, it is thought that the last purely nomadic hunter-gatherer societies on Earth, such as the Penan in the Sarawak state of Borneo (Malaysia), have begun to adopt farming practices, and the earth may have witnessed the last true traces of the original mode of human subsistence.

Studies of human ancestry in East Asia have found several lines of evidence that indicate a division between the south and the north, falling approximately along the Yangtze River in China.  The evidence can be found in genetic, anthropological, archaeological, linguistic and surname surveys.  We will confine most of our discussion to the genetic evidence, and note that this and other boundaries are often found as a continuum rather than an abrupt transition.

Currently, most evidence appears to favor the ‘coastal express’ model for the initial migration of modern human males into East Asia, in which they arrived first in the south and continued northward along the coast.  One supposition supporting this model is that humans adapted to coastal and estuarine environments would find a similar habitat (climate, flora and fauna) throughout their migration, requiring little in the way of extraordinary innovation.  It is also thought that humans at this time were capable of moderate sea voyages and able to travel along coastlines at a rate of perhaps 10km/day by paddling.

Based on the dates for leaving Africa and arriving in East Asia, the rate of movement for these populations has been estimated at 1-4km/yr.  Because the coastline differed over this time, thanks to the glacial fluctuations of the Ice Ages, many land bridges existed that allowed relatively easy spread across many of the islands that currently exist in SEAS.  In particular, many islands of Indonesia, including Sumatra, Java and Borneo, were joined in a contiguous region known as the Sunda Shelf.  Other islands outside the Sunda Shelf that were connected by a land bridge to the East Asian mainland during the height of the Ice Ages, when sea levels may have dropped by as much as 150 meters, included Taiwan and the Japanese archipelago.
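To put the two rates quoted above side by side (1-4km/yr for the overall migration versus perhaps 10km/day by paddling), the arithmetic can be sketched as follows.  This is only an illustration: the 10,000-year duration is an assumption taken from the 70-60kya window mentioned in this article, and the function names are hypothetical.

```python
# Figures quoted in the text
RATE_KM_PER_YR = (1.0, 4.0)     # estimated average migration rate, km per year
PADDLING_KM_PER_DAY = 10.0      # suggested coastal paddling rate, km per day

# Assumption (not stated in this form in the text): a journey of about
# 10,000 years, i.e. leaving East Africa ~70kya and arriving ~60kya.
ASSUMED_DURATION_YR = 10_000


def implied_distance_km(rate_km_per_yr, duration_yr):
    """Total distance covered at a constant average rate."""
    return rate_km_per_yr * duration_yr


def paddling_days_per_year(rate_km_per_yr,
                           paddling_km_per_day=PADDLING_KM_PER_DAY):
    """Days of paddling needed to cover one year's average advance."""
    return rate_km_per_yr / paddling_km_per_day


for rate in RATE_KM_PER_YR:
    distance = implied_distance_km(rate, ASSUMED_DURATION_YR)
    days = paddling_days_per_year(rate)
    print(f"{rate:g} km/yr -> {distance:,.0f} km in total; "
          f"{days:.1f} paddling days per year")
```

Under these assumptions, a single year's average advance corresponds to less than one day of paddling, which suggests that the estimated rate reflects a gradual extension of range over generations rather than continuous travel.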

A major geological division, known as the Wallace Line, remained covered by water during the last glacial maximum (LGM, 18-20kya).  It lies between the Sunda Shelf of SEAS and the Australia-New Guinea region (a southeastern counterpart dubbed the Sahul Shelf), and it presented a barrier to the spread of animal species and possibly also to the spread of H. s. sapiens.  Water boundaries in general did not stop the spread of H. s. sapiens, since travel by water was perhaps more convenient and familiar in some instances, as noted above.  The routes taken were likely those with the least risk; literally the path of least resistance.  These routes, of course, turned out to be the most successful and the best attested.  In sum, debates concerning the early migration routes have surmised that modern humans probably moved from East Africa into East Asia by traveling along the southern coast 70-60kya, relying for survival and expansion on the marine and estuarine resources to which they were well adapted.  We will generally limit our discussion of Asian ancestry to those regions northwest of the Wallace Line (see Figure 1).

The major Y-chromosome haplogroups in East Asia

The major founding Y-chromosome haplogroups associated with East Asia are C, D, O and N (see Figures 2-4).  This means that these haplogroups are found at high frequencies in East Asian populations (75-85%) and are largely specific to these groups.  Among these, Haplogroup C is considered perhaps the earliest Y-chromosomal signature in East Asia, with strong evidence for Paleolithic Era origins (60kya).  Haplogroup C is found at high levels in many populations in Asia and extends to Australia and the Americas, which is consistent with an early origin in Asia followed by spread to these new frontiers.  Haplogroup D is also thought to have Paleolithic Era origins (60kya), but since it is not as widespread in East Asia, it is believed to have been largely replaced by other populations and haplogroups expanding in later eras (e.g. the Han culture).  Recent studies have supported a very early migration of Haplogroup D from the south into the north.  Haplogroup O, which probably emerged in CAS around 35kya and appears somewhat later in East Asia, is the most distinctively East Asian Y haplogroup and is abundant in many populations (averaging 57% of all haplogroups in East Asian populations).  Its emergence during eras of robust population expansion and migration across East Asian territories, in conjunction with the advent of agricultural technology, contributed to the wide proliferation of this haplogroup among East Asians.  Haplogroup N is found largely in Siberia within East Asia, but also in West Asia and Europe, which reflects bidirectional migration from an origin around northwest China or CAS (age ~20kya).  A brief summary of the prevalence of these major haplogroups in East Asian regions is provided in Figure 4.  Figures 5 and 6 provide a detailed view of the distribution of all the major Y-chromosome haplogroups in the countries and regions discussed.
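Frequencies such as those quoted above are, at their simplest, the proportion of tested men in a sample who carry each haplogroup.  The following is a minimal sketch of that tally; the sample of 20 men is hypothetical and made up purely for illustration, the function names are not from any particular tool, and reading the 75-85% figure as the combined share of C, D, O and N is an assumption.

```python
from collections import Counter


def haplogroup_frequencies(assignments):
    """Return the proportion of samples assigned to each haplogroup."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {hg: n / total for hg, n in counts.items()}


def combined_share(freqs, haplogroups):
    """Sum the frequencies of a chosen set of haplogroups."""
    return sum(freqs.get(hg, 0.0) for hg in haplogroups)


# Hypothetical sample: one haplogroup label per tested man (not real data).
sample = ["O"] * 11 + ["C"] * 3 + ["N"] * 2 + ["D"] * 1 + ["R"] * 3

freqs = haplogroup_frequencies(sample)
for hg, f in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"Haplogroup {hg}: {f:.0%}")

share = combined_share(freqs, ["C", "D", "O", "N"])
print(f"Combined share of C, D, O and N: {share:.0%}")
```

Regional averages, such as those summarized in Figure 4, would additionally require a decision on how to weight populations of different sample sizes, which this sketch does not address.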

Figure 3.  Phylogenetic tree of the major Y-chromosome Haplogroups.   The branching pattern of haplogroups defined by SNP mutations on the Y-chromosome is shown, with the East Asian-specific haplogroups highlighted in green.


Figure 4.  Summary of major haplogroup frequencies in East Asia and surrounding regions.  A bar graph is used to display average levels of major East Asian haplogroups. NEAS, Northern East Asia. SEAS, Southern East Asia. CAS, Central Asia.  SEA, SEAS excluding China.  SAS, South Asia.  Oceania comprises all the islands of Melanesia, Micronesia and Polynesia.  These regions are depicted in Figure 7. 

Origins in Southern East Asia

SEAS populations have been noted to have a high number of haplogroups and subclades, even though these haplogroups tend to be closely related to each other.  The early origin of human populations in SEAS is reflected in the high number of Y-chromosome types, particularly in South China which has been the subject of many studies. It is likely that the Paleolithic haplogroups, C and D, gained their first foothold in SEAS.

The most abundant subclade in SEAS populations is O3, which is defined by SNP M122 and makes up 42-44% of East Asian Y-chromosomes.  O3 has been suggested to originate in SEAS, and studies have produced a wide range of dates that place its origin in the Paleolithic Era (25-60kya).  Because many haplotypes in the O3 haplogroup are shared between SEAS and NEAS populations, it has been suggested that these populations have a recent common history and that the migration to the north (25-35kya) soon followed the establishment of O3 in the south.  The major expansion of populations in SEAS has been estimated to be relatively late (10-15kya), at the start of the Holocene Epoch and the Neolithic Era that followed the LGM (18kya). The population expansions were most likely tied to major agricultural revolutions, and from that point they likely proceeded at a rapid rate.

Suggestions have been made that the O1 and O2 subclades also originate in SEAS, where they are most prevalent.  Studies of the O1a* subclade in SEAS find an estimated age near 34kya.  Within ethnic populations on Hainan Island, the estimate for the origin of this subclade is 15-19kya and in Taiwan it is around 15kya (both near the LGM).  It seems reasonable to propose that O1a populations migrated from a mainland source close to the time when land bridges to these islands existed.  The O1 and O2 lineages are believed to have followed the eastern coastal migration route, while O3 may have moved north via an inland path to the west.  After the land bridges were submerged by sea levels rising with glacial melt, the Hainan Island populations were cut off and drifted genetically from the rest of the East Asian populations (for example, they completely lack the D, P, N and Q haplogroups).

The O2a subclade is abundant (15-80%) in SEAS and in ethnic populations on Hainan Island.  Dating evidence for O2a subclades in SEAS (31kya) and Hainan Island (20-26kya) would support a scenario similar to the O1a migration via land bridge.  O2a is also common among the Mundari populations of India in SAS, and given the high haplotype diversity observed in India, it has been argued that the O2a subclade originated there, with a very ancient origin dating to 68kya.
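The argument above rests on haplotype diversity: a lineage is generally expected to show more distinct haplotypes where it has been present longest.  One common way to quantify this is Nei's gene diversity, h = n/(n-1) × (1 − Σ pᵢ²), where n is the sample size and pᵢ is the frequency of the i-th haplotype.  The sketch below is illustrative only: the studies summarized here may have used a different statistic, and the STR haplotypes shown are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotypes.

    Each haplotype must be hashable, e.g. a tuple of STR repeat counts.
    Returns 0.0 when every sample is identical and 1.0 when all differ.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical three-marker STR haplotypes (not real data).
population_a = [(14, 12, 23), (14, 12, 23), (14, 12, 23), (14, 13, 23)]
population_b = [(14, 12, 23), (15, 12, 24), (14, 13, 22), (16, 12, 23)]

print(f"Population A: h = {haplotype_diversity(population_a):.2f}")
print(f"Population B: h = {haplotype_diversity(population_b):.2f}")
```

In this toy comparison, population B would be read as the more diverse and, by the reasoning in the text, the more likely candidate for an older presence of the lineage, although diversity can also be raised by admixture from several sources.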

The founding of Northern East Asia populations

The number of NEAS haplogroup types observed is lower than that in SEAS, but they are spread further geographically and genetically.  The geographic spread may be a function of the wide range used by hunter-gatherers in this ecological region.  Compared to SEAS populations, NEAS populations have a lower population density and this could have led to isolation by distance and greater genetic drift between populations in NEAS.  The practice of patrilocality (male domiciles do not move) could have also reinforced the genetic separation between NEAS populations.  The major haplogroups found in NEAS populations are C, N, and P.  A study of expansion times in East Asia has found support for founding of populations in NEAS 22-34kya, placing their existence in the Paleolithic Era before the LGM (18-20kya) and noting that population expansion was probably slower than it was in SEAS.  It has been argued that Paleolithic populations in the north were able to survive in southern regions of Siberia that were not covered by glaciers during the LGM, and were supported there by a fauna-rich boreal forest.  Archaeological evidence supports a culture in Siberia ~43kya, near the Altai Mountain region, spreading further into the Sayan Mountains and Lake Baikal regions ~34kya and then across Siberia and Eurasia with advanced stone tool technology ~20kya.  This hunter-gatherer boreal lifestyle has persisted from the Pleistocene Epoch up until the present day in many parts of Siberia. 

The role of Central Asia
Central Asian (CAS) populations carry a very high level of haplogroup diversity, which is a result of their participation in many migration events between north and south and between east and west throughout history.  Populations from CAS also have clear ties to European male ancestry and thus provide a link between East Asia (NEAS) and Europe.  This connection likely played a significant role as trade developed over the ‘Silk Road’ in the more recent common era (~2kya).  One haplogroup that likely figured in the early exchange between West and East Asia in CAS was Haplogroup P, whose founding dates to the range of 25-35kya.   Haplogroup P may coincide with the populations emerging from the NEAS region encompassing the Altai-Sayan Mountains, Lake Baikal and Mongolia.

Figure 5. Overall Y-chromosome haplogroup frequency and distribution in East Asia and surrounding regions.  Haplogroups C and N predominate in the North.  Haplogroups P, R and J are found in the West, and O in the South and East.  Haplogroup D is found in pockets in Tibet and Japan.  Oceania is the prime residence of Haplogroup M, which is infrequently found in East Asia.  Haplogroup Q is found in extreme northeast Russia, next to the Bering Strait and the crossing into the Americas, which also harbor this haplogroup.

Figure 6.  Top panel.  Y-chromosome haplogroup distribution in P.R. of China and Mongolia.  Haplogroups C and N favor the north.  Increasing levels of Haplogroup O are found to the south.  Haplogroups P, R and J are highest in the northwest. The Haplogroup D lineage is found in Tibet and central China, which also display the highest diversity of haplogroups.
Bottom panel.  Y-chromosome haplogroup distribution in Siberia.  Haplogroups N and C predominate in Siberia.  The unique presence of Haplogroup Q can be seen in the Chukotka peninsula, foretelling the transit of this Haplogroup into North America.  Haplogroups P and R are highest on the east coast.  The most significant incidence of the prominent East Asian O haplogroup is in the Buryat to the south.

Continue to Part 2, Additional migrations and reach.  Travels south to Oceania, north to the Americas and in between >>

Need to cite this tutorial in your essay, paper or website? Use the following format:

Asian Ancestry based on Studies of Y-DNA Variation: Part 1 Early origins – roots from Africa and emergence in East Asia. Genebase Tutorials. Retrieved April 25, 2014, from