Learn about Y-DNA Haplogroup O


What is subclade testing?

Subclade testing can provide increased resolution of your placement on the Y-chromosome phylogenetic tree.  Before your subclade can be determined, you must first know which haplogroup you belong to.  Haplogroups are defined by unique mutation events such as single nucleotide polymorphisms, or SNPs.  These SNPs mark the branch of a haplogroup, and indicate that all descendants of that haplogroup at one time shared a common ancestor.  The Y-DNA SNP mutations were passed from father to son over thousands of years.  Over time, additional SNPs occur within a haplogroup, leading to new lineages.  These new lineages are considered subclades of the haplogroup.  Each time a new mutation occurs there is a new branch in the haplogroup, and therefore a new subclade.  By testing for the presence of SNPs that are known to be indicative of particular subclades, you can determine the specific subclade you belong to within your haplogroup.

Origin of Y-DNA Haplogroup O

Haplogroup O, one of the 20 haplogroups found on the Y-chromosome tree (Figure 1), is thought to have appeared in East Asia approximately 35,000 years ago.  This haplogroup shares a node in the Y-chromosomal phylogenetic tree with Haplogroup N, which is common in North Eurasia.  The man carrying the SNP M175 was likely part of a migrating tribe whose progress was blocked by high mountain ranges; some of the tribe was forced north (leading to Haplogroup N), whereas another group, including the ancestor of Haplogroup O, continued east (Figure 2).  The migrants continued east across the southern part of Siberia and eventually crossed into East Asia.  Today, Haplogroup O can be detected across Asia and Oceania (Table 1; Figure 3), and is found in 80-90% of men in East and Southeast Asia (Su et al. 1999; Tajima et al. 2002; Jin et al. 2003; Hammer et al. 2006).  It is, however, almost nonexistent in Siberia, West Asia, and Europe, and is completely absent in Africa (Cruciani et al. 2008) and the Americas (Lell et al. 2002).


Figure 1.  The phylogenetic tree of the 20 known Y-DNA haplogroups.  Haplogroup O is circled in blue to indicate its relative position within the tree.


Figure 2.  Proposed migration path of Haplogroup O ancestors from Y-chromosomal Adam in Africa, carrying SNP mutation M91, to the unique mutation event causing SNP M175 that defines Haplogroup O. 



Figure 3.  Worldwide frequency distribution of Haplogroup O.  The red area within each pie chart indicates the frequency of Haplogroup O within that location.  The labels and associated pie charts also indicate the average frequency of Haplogroup O within different language families of China.  It is clear from this frequency distribution map that Haplogroup O is most prevalent within East and Southeast Asia, with moderate frequencies detected in men from Central Asia and Oceania. 


Table 1.  Summary of the frequency of Haplogroup O in various populations across the world.

| Region | Country / Language Group | Frequency | Sample size (n) | Reference |
|---|---|---|---|---|
| Central Asia | Mongolia/Buryatia | 0.173 | 226 | Sahoo et al. 2006 |
| | Regional average | 0.205 | 419 | Hammer et al. 2006 |
| | Regional average | 0.083 | 496 | Hammer et al. 2006 |
| | Regional average | 0.149 | 209 | Hammer et al. 2006 |
| | Regional average | 0.115 | 165 | Sahoo et al. 2006 |
| East Asia | Altaic | 0.125 | 303 | Su et al. 1999 |
| | Ancient remains along Yangtze River | 0.625 | 48 | Li et al. 2007 |
| | Austroasiatic | 0.777 | 140 | Li 2005 |
| | Austroasiatic | 0.933 | 30 | Li et al. 2008a |
| | Austronesian | 0.671 | 381 | Li 2005 |
| | Cambodia | 0.800 | 125 | Black et al. 2006 |
| | China | 0.833 | 36 | Kayser et al. 2003 |
| | China (North) | 0.707 | 113 | Jin et al. 2003 |
| | China (South) | 0.641 | 39 | Jin et al. 2003 |
| | China (South) | 0.795 | 80 | Capelli et al. 2001 |
| | China (Subgroup North) | 0.375 | 16 | Deng et al. 2004 |
| | China (Subgroup South) | 0.780 | 23 | Deng et al. 2004 |
| | China (Subgroup Tibet) | 0.700 | 20 | Deng et al. 2004 |
| | China (Subgroup West) | 0.620 | 13 | Deng et al. 2004 |
| | China, Mongolia, Korea and Japan | 0.560 | 988 | Xue et al. 2006 |
| | Daic | 0.809 | 1465 | Li 2005 |
| | Daic | 0.897 | 926 | Li et al. 2008a |
| | Hmong-Mien | 0.844 | 934 | Feng 2007 |
| | Japan | 0.517 | 259 | Hammer et al. 2006 |
| | Japan | 0.546 | 108 | Jin et al. 2003 |
| | Japan | 0.545 | 263 | Nonaka et al. 2007 |
| | Korea | 0.744 | 160 | Jin et al. 2003 |
| | Korea | 0.640 | 25 | Kayser et al. 2003 |
| | Mongolia | 0.241 | 100 | Jin et al. 2003 |
| | Northeast | 0.323 | 441 | Hammer et al. 2006 |
| | Northeast | 0.305 | 754 | Karafet et al. 2001 |
| | Sino-Tibetan | 0.606 | 281 | Su et al. 1999 |
| | Taiwan | 0.870 | 246 | Capelli et al. 2001 |
| | Regional average | 0.491 | 175 | Sengupta et al. 2006 |
| India | Regional average | 0.122 | 1074 | Sahoo et al. 2006 |
| | Regional average | 0.229 | 728 | Sengupta et al. 2006 |
| Middle East | Pakistan | 0.023 | 176 | Sengupta et al. 2006 |
| | Turkey | 0.002 | 523 | Sahoo et al. 2006 |
| Oceania | Andaman Islands | 0.300 | 10 | Thangaraj et al. 2003 |
| | Australia (Arnhem) | 0.000 | 60 | Kayser et al. 2003 |
| | Australia (Desert) | 0.030 | 35 | Kayser et al. 2003 |
| | Melanesia | 0.060 | 342 | Capelli et al. 2001 |
| | Melanesia | 0.088 | 400 | Kayser et al. 2006 |
| | Melanesia | 0.075 | 673 | Scheinfeldt et al. 2006 |
| | Nicobar Islands | 1.000 | 11 | Thangaraj et al. 2003 |
| | Nusa Tenggara | 0.258 | 31 | Kayser et al. 2003 |
| | Papua New Guinea (coast) | 0.097 | 31 | Kayser et al. 2003 |
| | Papua New Guinea (highlands) | 0.032 | 31 | Kayser et al. 2003 |
| | Papua New Guinea (Tolai New British) | 0.063 | 16 | Kayser et al. 2003 |
| | Polynesia | 0.255 | 200 | Capelli et al. 2001 |
| | Polynesia | 0.279 | 441 | Kayser et al. 2006 |
| | Solomon Islands | 0.281 | 32 | Cox & Lahr 2006 |
| | Solomon Islands (Java) | 0.886 | 53 | Kayser et al. 2003 |
| | Solomon Islands (Malaita Province) | 0.000 | 12 | Cox & Lahr 2006 |
| | Solomon Islands (Western Province) | 0.450 | 20 | Cox & Lahr 2006 |
| | Trobriand Islands | 0.377 | 53 | Kayser et al. 2003 |
| | Vanuatu | 0.043 | 234 | Cox & Lahr 2006 |
| | West New Guinea (highlands) | 0.011 | 94 | Kayser et al. 2003 |
| | West New Guinea (lowlands/coast) | 0.000 | 89 | Kayser et al. 2003 |
| Siberia | Regional average | 0.247 | 44 | Lell et al. 2002 |
| Southeast Asia | Indonesia | 0.862 | 36 | Jin et al. 2003 |
| | Indonesia (Bali) | 0.837 | 551 | Karafet et al. 2005 |
| | Indonesia (Borneo) | 0.750 | 40 | Kayser et al. 2003 |
| | Indonesia (Moluccas) | 0.177 | 34 | Kayser et al. 2003 |
| | Island Southeast Asia | 0.833 | 333 | Li et al. 2008a |
| | Malaysia | 0.667 | 18 | Kayser et al. 2003 |
| | Philippines | 0.970 | 28 | Capelli et al. 2001 |
| | Philippines | 0.818 | 77 | Jin et al. 2003 |
| | Philippines | 0.821 | 39 | Kayser et al. 2003 |
| | Taiwan Aborigines | 1.000 | 43 | Kayser et al. 2003 |
| | Taiwan Aborigines | 0.967 | 220 | Li et al. 2008a |
| | Taiwan Chinese | 0.923 | 26 | Kayser et al. 2003 |
| | Thailand | 0.855 | 55 | Jin et al. 2003 |
| | Vietnam | 0.740 | 50 | Jin et al. 2003 |
| | Vietnam | 0.910 | 11 | Kayser et al. 2003 |
| | Regional average | 0.566 | 312 | Capelli et al. 2001 |
| | Regional average | 0.734 | 683 | Hammer et al. 2006 |
| | Regional average | 0.791 | 503 | Karafet et al. 2001 |
| | Regional average | 0.625 | 272 | Kayser et al. 2006 |
| | Regional average | 0.851 | 289 | Sahoo et al. 2006 |
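
Several populations appear in Table 1 more than once, with different sample sizes.  One way to combine such rows is a sample-size-weighted mean, so that larger studies count for more.  The sketch below is only an illustration using the three Japan rows from Table 1; the function name `pooled_frequency` is hypothetical, and the source does not state how its own averages were computed.

```python
def pooled_frequency(rows):
    """Sample-size-weighted mean of (frequency, n) pairs."""
    total_n = sum(n for _, n in rows)
    if total_n == 0:
        raise ValueError("no samples to pool")
    return sum(freq * n for freq, n in rows) / total_n


# Japan rows from Table 1: (frequency, sample size)
japan = [(0.517, 259), (0.546, 108), (0.545, 263)]

pooled = pooled_frequency(japan)
# Unweighted mean, for comparison (treats every study as equally large)
simple = sum(freq for freq, _ in japan) / len(japan)

print(f"pooled: {pooled:.3f}, unweighted: {simple:.3f}")
```

When the studies are of similar size and report similar frequencies, as here, the two averages are close; they can diverge when a small study reports an extreme value.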

1. Distribution of Haplogroup O in East Asia

Haplogroup O is very common in East Asian men, ranging from a low of 12.5% (Su et al. 1999) to a high of 93.3% (Li et al. 2008a) depending on the specific population or country (with an overall average of 64.3%; see references in Table 1).  This haplogroup has played an important role in advancing understanding of the evolutionary history of human populations, particularly by clarifying knowledge of the population history of East Asia.  Although genetic studies since the 1980s have supported the hypothesis that all extant human populations are derived from populations that have migrated out of Africa (the “Out-of-Africa” hypothesis), the discovery of many hominid fossils in East Asia led archaeologists to question the validity of this theory.  Y-chromosome analyses, in combination with other genetic studies, helped to further support the conclusion that all present-day humans are derived from African populations (see Jin & Su 2000 for review).  East Asia in particular has played an important role in reconstructing the migration history of humans because this region, including the countries China, Mongolia, North and South Korea, Japan and Taiwan, is thought to be the point of origin for subsequent migrations into Siberia and the Americas.

Modern populations within East Asia may have been derived from northern expansions of southern populations during the Last Glacial Maximum (LGM) 18,000 to 21,000 years ago (Figure 4; Chu et al. 1998; Jin & Su 2000; Su et al. 1999) or from male contribution from Central Asia via the Silk Road (Ding et al. 2000; Karafet et al. 2001).  Further data suggest that northern populations expanded prior to the LGM (22,000 to 34,000 years ago) whereas the southern populations delayed expansion until after the LGM but then expanded much more rapidly than the northern populations (Xue et al. 2006).  It is possible that the northern populations were able to exploit the megafauna of the “Mammoth Steppe” whereas the southern populations had to wait for warmer and more stable climates before they were able to access more abundant food resources to support their population expansion.

Population genetic studies of Hainan Island provided some resolution to the historical migration patterns of humans to East Asia.  Hainan Island was connected to the East Asian mainland during the last Ice Age, and was therefore directly in the path of human migrations from Southeast Asia to East Asia.  It is thought, therefore, that Hainan aboriginals are direct descendants of the original migrants into East Asia.  Haplogroup O was found at a high frequency in Hainan men, and reached 100% in one of the aboriginal populations (the Gei; Li et al. 2008a).


Figure 4.  This figure illustrates the assumed migration path of humans into East Asia (reconstructed from figures in Jin and Su 2005 and Li et al. 2008a).  The thicker red lines indicate the movement of tribes out of Africa into Southeast Asia sometime between 18,000 and 60,000 years ago.  This migration was followed by a northward expansion into East Asia and southward expansion into Malaysia, Indonesia, and the islands of Oceania.  The blue arrows indicate more recent genetic admixture from Central Asia.

Haplogroup O in China

China is of particular interest for reconstructing the spread of humans throughout the world due to its large size, historical importance, and geographical location at the split of the northern and southern routes of human migration (Cavalli-Sforza 1998; Underhill et al. 2001).  Haplogroup O is the dominant haplogroup within the Chinese population (approximately 65%, n = 76; Deng et al. 2004, for example); consequently, the paternal gene pool of the Chinese population has played a large role in reconstructing the male demographic history of East Asia.  Within China, there seems to be a genetic distinction between populations from the north and south, with the Yangtze River acting as the boundary between the two regions (Xiao et al. 2000).

China is exceptionally diverse and represents over 200 languages that belong to one of seven language groups (Altaic, Austroasiatic, Austronesian, Daic, Hmong-Mien, Sino-Tibetan, and Indo-European).  Approximately 93% of Chinese belong to the Han ethnic majority with the remaining 7% of the non-Han Chinese population belonging to one of 55 recognized minorities (Cavalli-Sforza 1998).  Haplogroup O is dominant in both Han and non-Han Chinese, although it is found at a higher prevalence within the Han Chinese (Underhill et al. 2000).  Although the non-Han Chinese are currently a minority, this was not always the case, and in fact some non-Han groups may have been important sources of migrants during East Asian migration waves.  For example, Hun and Turks settled in North China; Mongolian and Manchu groups from North and Southeast China have dominated Chinese empires; Di-Qiang were important nomads in West China; and Miao (with Han) dispersed to South China.

Genetic data from Deng et al. (2004) suggested that there are four genetically distinct Chinese minority subgroups: 1) Subgroup North; 2) Subgroup Tibet; 3) Subgroup West; and 4) Subgroup South.  Subgroup North consists of both nomadic groups and hunter-gatherers in the Mongolian highlands.  The four main ethnic subfamilies within this subgroup are Mongol, Manchu, Yugur and Korean (all likely belonging to the Altaic language family).  Haplogroup O has been detected in 37.5% of men in this subgroup (n = 16), making it the only one of the four minority subgroups in which Haplogroup O is not found in the majority of men tested.  Subgroup Tibet (70% Haplogroup O, n = 20) includes ten populations around the Tibetan plateau.  The high-altitude pastoral lifestyle of the Tibetans likely led to limited genetic admixture with neighbouring groups.  Genetic data indicate that one population within this subgroup was formed by a Di-Qiang branch that left their Yellow River homeland in West/North China approximately 5,000 to 6,000 years ago and migrated west then south to Central Tibet.  Another branch of Di-Qiang apparently migrated southward in East Tibet to form the Qiang of Northern Sichuan.  Subgroup West populations (six Muslim groups from West China) are distributed along the Silk Road.  The Hui are part of this subgroup and have a high incidence of Haplogroup O (62%, n = 13).  The Hui can trace their lineage to traders and artisans from Central/West Asia who migrated to China in the 13th century with the Mongol armies.  Subgroup South includes ten populations living in the agriculturally productive lowlands and river valleys in South China.  It is believed that this subgroup was derived from migrants from North and West China who carried the SNP M175; Haplogroup O is found in 78% of men from this population (n = 23).
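
The percentages and sample sizes quoted above for the four subgroups can be turned back into approximate counts of men, which makes the small size of each sample easier to appreciate.  The following is a rough arithmetic check, not a reanalysis of Deng et al. (2004): the counts are obtained by rounding, and the variable names are illustrative.  Note that the four sample sizes quoted here sum to 72, whereas the overall figure quoted earlier is n = 76; the source does not explain the difference.

```python
# (frequency, sample size) as quoted in the text for Deng et al. (2004)
subgroups = {
    "North": (0.375, 16),
    "Tibet": (0.70, 20),
    "West": (0.62, 13),
    "South": (0.78, 23),
}

# Approximate number of Haplogroup O carriers in each subgroup (rounded)
carriers = {name: round(freq * n) for name, (freq, n) in subgroups.items()}

total_carriers = sum(carriers.values())
total_men = sum(n for _, n in subgroups.values())
pooled = total_carriers / total_men

print(carriers)
print(f"{total_carriers} of {total_men} men, pooled frequency {pooled:.3f}")
```

The pooled value is close to the "approximately 65%" quoted for the Chinese population as a whole, and each subgroup percentage rests on only a handful of men.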

Haplogroup O in Japan and Korea

During the Last Glacial Maximum, the Korean Peninsula and Japanese Archipelago were connected.  Archaeological evidence suggests that humans were present in Korea between 25,000 and 45,000 years ago.  Previous studies using protein and nuclear DNA markers have indicated that Koreans are most similar to Mongolians, whereas mitochondrial DNA analysis suggests that Chinese populations were the likely ancestors of Korean people.  A recent Y-chromosome analysis detected a strong prevalence of Haplogroup O (74.4% in Korea; Jin et al. 2003) and supports previous observations that there were likely multiple migration events into Korea, with a major genetic contribution from the north- and eastward expansion of Chinese populations (Jin et al. 2003; Karafet et al. 2001).  It is believed that Chinese populations fled to the Korean Peninsula for refuge during the Warring States Period (476-221 BC), a time of political chaos in China.

As with most Asian regions, Japan shows high cultural and genetic variation, represented both in present-day populations and in archaeological evidence.  Analysis of this variation has suggested that there were two major migrations that brought people to Japan.  The first migration is estimated to have occurred over 30,000 years ago, likely from Central Asia, and eventually led to the Jomon culture.  People of the Jomon culture spent several thousand years in isolation as hunter-gatherers and producers of some of the earliest pottery before a more recent migration of people from the Korean Peninsula approximately 2,300 years ago.  These recent migrants represented the Yayoi culture and they brought with them wet rice agriculture, weaving and metalworking.  Two distinct groups currently found in Japan, the Ainu (Hokkaido) and the Ryukyuans (south islands), are thought to be remnants of the Jomon culture.  On average, Haplogroup O is quite prevalent within Japanese men (estimates range from 51.7% to 54.5%; Hammer et al. 2006, Nonaka et al. 2007) yet it was not detected in the Ainu.  Y-chromosome analysis, including Haplogroup O and its subclades, indicated that the current diversity in Japan likely reflects varying degrees of admixture between the Jomon and Yayoi ancestors (Hammer et al. 2006).


2. Distribution of Haplogroup O in Southeast Asia

There seems to be general agreement that Southeast Asian populations are genetically distinct from those of East Asia (Chu et al. 1998, Su et al. 1999, 2000, Ding et al. 2000, Jin & Su 2000, Capelli et al. 2001, Karafet et al. 2001), but Haplogroup O is still found at high frequencies within this region; estimates range from 17.7% to 100% depending on the country or population (Table 1; Kayser et al. 2003).  Southeast Asia (including Laos, Vietnam, Thailand, Cambodia, Malaysia, Singapore, Brunei, Indonesia, East Timor, and the Philippines) may have been the first settlement of humans into eastern Asia (Su et al. 1999) from Central Asia; this theory is based on the observation that Southeast Asian populations seem more genetically diverse than those from East Asia (Mongolia, China, Korea, Japan and Taiwan).  Not all studies agree with this theory, however; some suggest that the genetic divergence may be due to isolation by distance (Ding et al. 2000) or that there may not even be significant genetic divergence between East and Southeast Asia (Karafet et al. 2001).  Inconsistencies reflected by the previous two papers may have arisen by not considering the effect of recent gene flow.  Jin and Su (2000) provide a comprehensive summary of the data indicating that Southeast Asia was the genetic source of two independent migrations: one north into Taiwan and East Asia, and one south into Polynesia (Figure 4).

Haplogroup O in Indonesia

The people of Indonesia, and in particular the island of Bali, represent varying genetic contributions from pre-Neolithic hunter-gatherers, Austronesian farmers, and Indian traders.  A land bridge once connected the Indonesian islands to the Asian mainland, and the island of Bali is a remaining stepping-stone of this land bridge.  Archaeologists provide evidence for the presence of humans on Bali as far back as the Pleistocene, and argue that Austronesian-speaking migrants arrived in Indonesia from southern China and Taiwan approximately 4,500 to 3,000 years ago and replaced the aboriginal hunter-gatherers.  To add further complexity to the population history, Bali shows evidence of extensive contact with Indian populations since at least 2,000 years ago.  Genetic population analysis of Balinese people provided some insight to the controversy surrounding the origin of Austronesian ancestors.  Haplogroup O was detected in 83.7% of Balinese men (n=551); phylogeographic analysis provided evidence that Haplogroup O was brought to Bali with the Austronesian expansion from Southeast Asia (Karafet et al. 2005). 

Haplogroup O in the Andaman and Nicobar Islands

The Andaman Islands are located in the Indian Ocean, just south of Burma.  Although numbering only a few dozen, there are still native Andamanese remaining on these islands.  The Andamanese, sometimes referred to as “Negritos”, remained isolated from the world until colonialism during the 19th and 20th centuries led to a collapse of the small-sized hunter-gatherers to the few threatened populations that remain today.  Although the Andamanese are morphologically quite similar to African pygmies, studies (utilizing blood groups and protein analysis) suggest that they are derived from Australo-Melanesian ancestors of Southeast Asia and Oceania.  Nicobarese populations, located on the Nicobar Islands south of the Andaman Islands, are thought to resemble mainland Southeast Asian populations.  Y-chromosome analysis found that 30% of the Andamanese and 100% of the Nicobarese were Haplogroup O (Thangaraj et al. 2003).  Results of this study, considered together with mtDNA analysis, indicate that the Andamanese may represent an ancient Asian population that remained uninfluenced by Neolithic agriculturalists.  As such, the Andamanese language may be one of the last examples of pre-Neolithic Southeast Asia that was not affected by the spread of the Austronesian language family during the Neolithic period.

3. Distribution of Haplogroup O in Oceania

Oceania is a large area that can be further refined into “Near Oceania” and “Remote Oceania”.  The Solomon Islands archipelago, in addition to Australia and New Guinea, forms part of “Near Oceania” and may have been settled since the Pleistocene, approximately 45,000 years before present (O’Connell and Allen 2004).  Nevertheless, the Austronesian language group is the predominant language group currently spoken in the Solomon Islands, and comparison with Austronesian languages throughout the Indo-Pacific region suggests that a major population expansion occurred during the Holocene (less than 10,000 years before present) from Mainland Asia.  Haplogroup O, and other markers of East Asian origin, tend to be associated with Austronesian-speaking groups and are rarely found in non-Austronesian-speaking groups (Mona et al. 2007, for example).  One exception to this trend comes from the Solomon Islands, where Haplogroup O was one of five lineages detected within a group of 32 men.  Nearly half the men sampled from the Western Province carried the M175 SNP, indicating that they were part of Haplogroup O, yet these men spoke a Papuan language (Cox and Lahr 2006).  Although this observation may have been skewed by the small sample size (only 32 men), the authors made an important point that genetic studies at the level of the community may provide a different story than studies conducted at higher scales of analysis.  Further support for the validity of using Haplogroup O to trace the Austronesian expansion was gained from a population genetic study of New Guinea.  West New Guinea remained isolated from the Austronesian expansion during the Neolithic, and comparisons of West New Guinea to Papua New Guinea indicated the presence of Haplogroup O in Papua New Guinea, but not West New Guinea.  In addition, the proportion of Haplogroup O was higher in the coastal and lowland regions of Papua New Guinea, corroborating quite well the assumed route of Austronesian expansion (Kayser et al. 2003).
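
The caution about small sample sizes can be made concrete by attaching an uncertainty range to a frequency.  The sketch below uses the Wilson score interval, a standard statistical tool chosen here purely for illustration; it is not taken from Cox and Lahr (2006) or any other study cited in this article, and the function name `wilson_interval` is hypothetical.  The frequencies and sample sizes come from Table 1.

```python
import math


def wilson_interval(freq, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    denom = 1 + z**2 / n
    centre = (freq + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        freq * (1 - freq) / n + z**2 / (4 * n**2)
    )
    return centre - half_width, centre + half_width


# Rows taken from Table 1: (label, frequency, sample size)
rows = [
    ("Solomon Islands, all (Cox & Lahr 2006)", 0.281, 32),
    ("Solomon Islands, Western Province (Cox & Lahr 2006)", 0.450, 20),
    ("Melanesia (Scheinfeldt et al. 2006)", 0.075, 673),
]

for label, freq, n in rows:
    low, high = wilson_interval(freq, n)
    print(f"{label}: {freq:.3f} (n={n}), interval {low:.3f} to {high:.3f}")
```

The interval for a sample of 20 men is far wider than the one for a sample of several hundred, which is the practical meaning of the small-sample caveat.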

Remote Oceania includes countries of Polynesia, Micronesia, and the Melanesian islands.  Polynesia is a region defined by Hawaii in the north, Easter Island in the east, Fiji in the west, and New Zealand in the south.  Analysis of language groups suggests that Polynesians originated from Asia (since Polynesian languages belong to the Austronesian family that originated in East Asia) whereas archaeological evidence points to Melanesia as the point of origin of ancestral Polynesians.  The low frequency of Haplogroup O within Polynesia is interpreted to indicate only a small contribution from Southeast Asia to their paternal history (Capelli et al. 2001, Underhill et al. 2001).  Genetic drift, however, may have skewed the results.  Further studies have indicated a major presence of Haplogroup O within Tonga and French Polynesia (Scheinfeldt et al. 2006), and have detected the haplogroup within Oceanic-speaking groups.  It is likely that Polynesia has a dual genetic origin with a male-dominated contribution from Melanesia.  Kayser et al. (2006) found that 65.8% of men had a Melanesian genetic contribution, whereas 28.3% could be traced to Asian origin (of which 27.9% were Haplogroup O).


4. Distribution of Haplogroup O in India

Haplogroup O was one of eight haplogroups detected in an Indian population at frequencies > 5% (overall, 22.9% with 14.6% Subclade O2a and 8.0% Subclade O3a3c; Sengupta et al. 2006).  A relatively high proportion of Haplogroup O was detected across all tribal linguistic classes (Austroasiatic, Dravidian, Indo-European, and Tibeto-Burman) but the haplogroup was rare within caste populations, supporting theories that caste and tribal populations within India had separate origins (Cordaux et al. 2004).  The Austroasiatic language family has a high prevalence in Southeast Asia, and it is thought to be one of the oldest language families in India.  These two observations suggest that there may be a linkage between Indian and Southeast Asian Austroasiatics.  Based on current distributions of Haplogroup O, Austroasiatic speakers in India likely originated from Southeast Asia, but other results indicate that the demographic history may not be this simple.  More recent studies argue that Austroasiatic populations originated in India, and then migrated to Southeast Asia via the Northeast Indian corridor (Kumar et al. 2007).


5. Distribution of Haplogroup O in Anatolia

The gene pool of the Anatolian Peninsula, also known as Asia Minor, contains historical records of gene flow, admixture and population differentiation throughout the distribution of humans across the world, mostly due to its geographical location as a link between the Middle East, Asia and Europe.  A low presence of the Asian-specific haplogroup O3-M122 in Turkey (0.19%; Cinnioglu et al. 2004) provides some historical record of the influx of Turkic speakers, including Seljuk and Osmanli groups, from Central Asia to Anatolia.  A large population within Anatolia (up to 12 million during the late Roman period) would have limited the cultural influence of these immigrants; these data have contributed to the conclusion that Anatolia was both an important buffer to the homogenization of genetically and culturally distinct populations and a source of gene flow.


The Subclades of Y-DNA Haplogroup O

Haplogroup O is one of the more diverse Y-DNA haplogroups and consists of 32 unique lineages, or subclades.  These subclades can be grouped into one of the three major subclades within Haplogroup O (aside from the paragroup O*): Subclade O1, Subclade O2 and Subclade O3.  Subclade O1 consists of five deeper subclades whereas Subclade O2 has seven; Subclade O3 is by far the most complex with a total of 19 identified subclades.  Refer to the “Phylogenetic Tree” section below for a representation of the phylogenetic tree of Haplogroup O, indicating the relationship among the numerous subclades.  The table below (Table 2) provides a summary of what is currently known about each of the subclades.  This table will be updated as more information is accumulated.
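
The counts quoted above can be checked against the subclade names listed in Table 2.  The sketch below assumes that the naming scheme encodes ancestry, so that dropping the last character of a name gives its parent (for example, O3a3c1a descends from O3a3c1); the helper `parent` is hypothetical.  The figures of five, seven and 19 are reproduced when each major subclade is counted together with its descendants.  That gives 31 names, so the 32nd lineage is presumably the paragroup O* or the root of Haplogroup O itself, which is an assumption rather than something the source states.

```python
from collections import Counter

# Subclade names as listed in Table 2 (paragroups omitted, as in the table)
SUBCLADES = [
    "O1", "O1a", "O1a1", "O1a1a", "O1a2",
    "O2", "O2a", "O2a1", "O2a1a", "O2a2", "O2b", "O2b1",
    "O3", "O3a", "O3a1", "O3a2", "O3a3", "O3a3a", "O3a3b", "O3a3b1",
    "O3a3b1a", "O3a3b1b", "O3a3b2", "O3a3c", "O3a3c1", "O3a3c1a",
    "O3a3c2", "O3a4", "O3a4a", "O3a5", "O3a6",
]


def parent(name):
    """Parent lineage under the assumed naming scheme: drop the last character."""
    return name[:-1]


# Group every name under its major branch (the first two characters)
per_branch = Counter(name[:2] for name in SUBCLADES)

print(dict(per_branch))
print("total named lineages:", len(SUBCLADES))
print("parent of O3a3c1a:", parent("O3a3c1a"))
```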
 

Table 2.  A summary of the information currently known about each of the subclades within Haplogroup O.  For simplicity, all paragroups were omitted from the table since information about paragroups is often difficult to differentiate from the founding subclade from which it was derived.

| Subclade | Age (Mean, years) | Age (Min, years) | Age (Max, years) | Main Distribution | References |
|---|---|---|---|---|---|
| O1 | | | | East Asian coast | Su et al. 1999; Li 2005; Zhang et al. 2007 |
| O1a | 33,765 | 28,544 | 38,986 | Common in Austronesians, southern Han Chinese, and Tai-Kadai | Karafet et al. 2005; Kayser et al. 2003; Li et al. 2008a,b |
| | | | | Very common in Taiwan (near 100%) | Capelli et al. 2001; Kayser et al. 2003 |
| O1a1 | | | | Specific data for this subclade is not available | |
| O1a1a | | | | Specific data for this subclade is not available | |
| O1a2 | 3,420 | 2,245 | 5,570 | Not found in China; restricted to Austronesian-speaking men of Southeast Asia and Micronesia | Su et al. 1999; Su et al. 2000; Karafet et al. 2005 |
| O2 | | | | Common in Southeast Asia and Hainan Island in East Asia | Kayser et al. 2003; Li et al. 2008a |
| O2a | 11,700 | 10,100 | 13,300 | Common in Austro-Asiatic, Tai-Kadai, Malay, and Indonesian groups | Kayser et al. 2003; Kumar et al. 2007; Li et al. 2008b; Sengupta et al. 2006 |
| | | | | Detected at moderate frequencies in South Asia, Southeast Asia, East Asia, and Central Asia | Sengupta et al. 2006 |
| O2a1 | | | | Frequent among Hani, She, Tai, Cambodian, and Vietnamese populations | Hammer et al. 2006; Nonaka et al. 2007 |
| | | | | Detected at moderate frequencies in Qiang, Yi, Hlai, Miao, Yao, Taiwanese aborigines, and Han Chinese of Sichuan, Guangxi, and Guangdong | Li et al. 2008b |
| O2a1a | | | | Found at a low frequency among Pashtuns of Pakistan | Firasat et al. 2007 |
| O2a2 | | | | Specific data for this subclade is not available | |
| O2b | 2,700 | 1,100 | 7,100 | Detected in Ryukyuan, Japanese, Indonesian, Vietnamese, Thai, Manchu, Evenk, and Micronesian populations | Hammer et al. 2006; Jin et al. 2003 |
| O2b1 | 8,160 | 3,810 | 12,270 | Common in Japanese and Ryukyuan | Hammer et al. 2006 |
| | | | | Detected in Indonesians, Thais, Koreans, and Vietnamese | Hammer et al. 2006 |
| | | | | Yayoi founding lineage | Hammer et al. 2006 |
| O3 | 19,300 | 16,000 | 24,000 | May be associated with the spread of rice farming | Su et al. 1999; Karafet et al. 2005; Scheinfeldt et al. 2006 |
| | | | | Common throughout Asia and Austronesian regions of Oceania | Karafet et al. 2005 |
| | | | | Detected at moderate frequencies in Central Asia | Kayser et al. 2003 |
| O3a | 29,816 | 21,053 | 38,579 | | Shi et al. 2005 |
| O3a1 | | | | Detected at low frequencies in Austroasiatic populations of Southeast Asia and Han Chinese | Su et al. 2005 |
| O3a2 | | | | Detected at low frequencies in Austroasiatic populations of Southeast Asia | Su et al. 2005 |
| O3a3 | | | | Detected at moderate frequency in Japan | Nonaka et al. 2007 |
| O3a3a | | | | Detected at low frequencies in East Asia | Li et al. 2008a; Xue et al. 2006 |
| O3a3b | 28,317 | 19,875 | 36,759 | Typical of Hmong-Mien groups, with a moderate distribution among Han Chinese, Buyei, Qiang, and Oroqen | Shi et al. 2005; Xue et al. 2006 |
| | | | | Detected from ancient remains along the Yangtze River | Li et al. 2007 |
| O3a3b1 | | | | Specific data for this subclade is not available | |
| O3a3b1a | | | | Specific data for this subclade is not available | |
| O3a3b1b | | | | Specific data for this subclade is not available | |
| O3a3b2 | | | | Specific data for this subclade is not available | |
| O3a3c | 17,278 | 6,500 | 33,799 | Typical of Sino-Tibetan populations; distribution throughout East Asia and Southeast Asia | Sengupta et al. 2006; Shi et al. 2005 |
| O3a3c1 | 29,807 | 22,217 | 37,398 | Specific data for this subclade is not available | Shi et al. 2005 |
| O3a3c1a | | | | Detected within Austroasiatic and Tibeto-Burman language groups in Southeast and East Asia | Su et al. 2005 |
| O3a3c2 | | | | Specific data for this subclade is not available | |
| O3a4 | | | | Detected at low frequencies in Japan | Nonaka et al. 2007 |
| O3a4a | | | | Detected at low frequencies in Japan | Nonaka et al. 2007 |
| O3a5 | | | | Specific data for this subclade is not available | |
| O3a6 | | | | Specific data for this subclade is not available | |




How are the Subclades of Y-DNA Haplogroup O determined?

1. Obtain a Y-DNA haplogroup prediction based on the results from a Y-DNA STR test.
2. Confirm your haplogroup with a Y-DNA Haplogroup Backbone SNP test.  You should be positive for M175, the SNP that is used to confirm Haplogroup O in the Y-DNA Haplogroup Backbone SNP Test panel.  You may also be positive for M122, the SNP that identifies Subclade O3 within Haplogroup O.
3. Once your haplogroup has been confirmed as O, you can then obtain the Y-DNA Haplogroup O Subclade Test.  If you possess SNP M175 only (and are negative for M122) then you will require the O1 and O2 Panel.  If you are positive for both M175 and M122, you will require the O3 Panel.  The tables below provide a list of the 12 SNP markers used in the O1 and O2 Panel (Table 3) and the 16 markers used in the O3 Panel (Table 4), including the location of the mutation, details about the specific mutation, and the subclade that it defines.
4. Identify the location of your SNPs on the phylogenetic tree to determine your subclade.  Figure 5 provides a diagram to guide you through the process of locating your subclade.
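
The panel choice in steps 2 and 3 comes down to two backbone results, M175 and M122.  A minimal sketch of that rule is shown below; the function name `choose_panel` and the use of `None` for "not Haplogroup O" are illustrative choices, not part of any testing service.

```python
def choose_panel(m175_positive, m122_positive):
    """Pick the subclade panel from the backbone SNP results (steps 2 and 3)."""
    if not m175_positive:
        # M175 confirms Haplogroup O; without it the O panels do not apply
        return None
    if m122_positive:
        # M122 identifies Subclade O3
        return "O3 Panel"
    return "O1 and O2 Panel"


print(choose_panel(True, False))   # M175 only
print(choose_panel(True, True))    # M175 and M122
print(choose_panel(False, False))  # Haplogroup O not confirmed
```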

Table 3.  List of the SNP markers used in the Y-DNA Haplogroup O Subclade O1 and O2 Test.

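| SNP marker (location) | Mutation | Haplogroup O subclade |
|---|---|---|
| M119 | A > C | O1a |
| P203 | G > A | O1a1 |
| M101 | C > T | O1a1a |
| M103 | C > T | O1a2 |
| P31 | T > C | O2 |
| M95 | C > T | O2a |
| M88 | A > G | O2a1 |
| PK4 | A > T | O2a1a |
| M297 | A > G | O2a2 |
| M176 | C > T | O2b |
| P49 | A > T | O2b |
| 47z | G > C | O2b1 |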


Table 4.  List of the SNP markers used in the Y-DNA Haplogroup O Subclade O3 Test.

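| SNP marker (location) | Mutation | Haplogroup O subclade |
| --- | --- | --- |
| M324 | G > C | O3a |
| M121 | AGAAA del | O3a1 |
| M164 | T > C | O3a2 |
| P201 | T > C | O3a3 |
| M159 | A > C | O3a3a |
| M7 | C > G | O3a3b |
| M113 | A > G | O3a3b1 |
| P164 | T > C | O3a3b2 |
| M134 | G del | O3a3c |
| M117 | AGAT del | O3a3c1 |
| M162 | C > C/T | O3a3c1a |
| P101 | G > A | O3a3c2 |
| IMS-JST002611 | C > T | O3a4 |
| P103 | G > C | O3a4a |
| M300 | G > A | O3a5 |
| M333 | G ins | O3a6 |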


 


Figure 5a.  This decision tree illustrates how to determine your subclade if you belong to Subclade O1 or O2 (that is, you do not carry the M122 SNP).  The blue boxes represent the SNPs that are tested in the Y-DNA Haplogroup O Subclade Test Panel for Subclades O1 and O2.  The red boxes indicate the specific subclades.  To determine your subclade, refer to your test results and follow the decision path until you reach a red box; that is your subclade.  For Subclade O3, refer to Figure 5b.

Figure 5b.  This decision tree illustrates how to determine your subclade if you belong to Subclade O3 (that is, you carry the M122 SNP).  The blue boxes represent the SNPs that are tested in the Y-DNA Haplogroup O Subclade Test Panel for Subclade O3 (refer to Figure 5a for Subclades O1 and O2).  The red boxes indicate the specific subclades.  To determine your subclade, refer to your test results and follow the decision path until you reach a red box; that is your subclade.
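
For readers who prefer to see the procedure written out, the following Python sketch mirrors steps 3 and 4: it picks a panel from the backbone results and then looks for the most derived subclade supported by the panel results.  The marker-to-subclade pairs are transcribed from Tables 3 and 4.  Everything else is an assumption made for illustration and is not taken from the test itself: the function names are hypothetical, a subclade's ancestors are inferred from its name (O3a3c1 is assumed to descend from O3a3c, O3a3 and O3a), and O2b is treated as present if either M176 or P49 is positive.  It is a rough stand-in for the decision trees in Figures 5a and 5b, not a reproduction of them.

```python
# SNP marker -> subclade, transcribed from Tables 3 and 4 of this tutorial.
O1_O2_PANEL = {
    "M119": "O1a", "P203": "O1a1", "M101": "O1a1a", "M103": "O1a2",
    "P31": "O2", "M95": "O2a", "M88": "O2a1", "PK4": "O2a1a",
    "M297": "O2a2", "M176": "O2b", "P49": "O2b", "47z": "O2b1",
}
O3_PANEL = {
    "M324": "O3a", "M121": "O3a1", "M164": "O3a2", "P201": "O3a3",
    "M159": "O3a3a", "M7": "O3a3b", "M113": "O3a3b1", "P164": "O3a3b2",
    "M134": "O3a3c", "M117": "O3a3c1", "M162": "O3a3c1a", "P101": "O3a3c2",
    "IMS-JST002611": "O3a4", "P103": "O3a4a", "M300": "O3a5", "M333": "O3a6",
}


def choose_panel(backbone_positive):
    """Step 3: pick the subclade panel from the backbone SNP results."""
    if "M175" not in backbone_positive:
        return None  # Haplogroup O has not been confirmed
    return "O3" if "M122" in backbone_positive else "O1 and O2"


def locate_subclade(panel, positive, default):
    """Step 4: return the most derived subclade supported by the results.

    Assumption: a subclade only counts if every ancestor subclade listed in
    the same panel is also positive, where ancestors are name prefixes.
    Conflicting results on separate branches are not resolved here.
    """
    derived = {sub for snp, sub in panel.items() if snp in positive}
    listed = set(panel.values())
    best = default
    for sub in sorted(derived):
        ancestors = {a for a in listed if sub.startswith(a) and a != sub}
        if ancestors <= derived and len(sub) > len(best):
            best = sub
    return best


# Hypothetical example: positive for M175 and M122 on the backbone test,
# then positive for M324, P201, M134 and M117 on the O3 Panel.
panel_name = choose_panel({"M175", "M122"})
subclade = locate_subclade(O3_PANEL, {"M324", "P201", "M134", "M117"}, "O3")
print(panel_name, subclade)
```

The `default` argument is what the function returns when no panel marker is positive; "O3" is a natural choice for the O3 Panel because M122 has already been confirmed, and "O" is used here for the O1 and O2 Panel.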


Geographical Distribution of the Subclades of Y-DNA Haplogroup O

The following section provides details about the distribution of the specific subclades within Haplogroup O.  Some subclades have only recently been detected and therefore little or no information is available for them.  As data rapidly accumulate, there will no doubt be additional information to add to this summary.  The first step to understanding the distribution of the subclades is to take a broad approach when exploring the four main subclades that have been tested in population genetic studies: O*, O1a, O2 and O3.  Figure 6 illustrates the relative proportion of these main subclades across East Asia, Southeast Asia and Oceania, and a summary of the mean frequencies of these four subclades throughout different countries can be found in Table 1.  For a deeper analysis of the subclades, Figure 7 presents a summary of the relative proportion of each subclade in Siberia, Central Asia, East Asia (including individual charts for China, Korea, Japan, Taiwan, and Hainan Island), Southeast Asia, and Oceania.  These two figures (Figures 6 and 7) will provide a useful reference as you read through the descriptions of each of the subclades below.


Figure 6.  Relative frequency distribution of the four main subclades of Haplogroup O that have commonly been detected in Y-chromosomal population genetic studies.  The frequencies of the subclades in each country are listed in Table 5.


Table 5. Summary of the average frequencies of the four main subclades detected in Haplogroup O.  The values represent the frequency of each subclade detected in the sample population, not the relative frequencies.  Relative frequencies are illustrated in Figure 6.

| Region | Country | Subclade | Frequency | Reference |
| --- | --- | --- | --- | --- |
| Central Asia | | O* | 0.040 | Hammer et al 2006; Karafet et al 2001; Sahoo et al 2006 |
| Central Asia | | O1a | 0.012 | Hammer et al 2006; Karafet et al 2001 |
| Central Asia | | O2 | 0.005 | Hammer et al 2006; Karafet et al 2001; Sahoo et al 2006 |
| Central Asia | | O3 | 0.133 | Hammer et al 2006; Karafet et al 2001; Sahoo et al 2006 |
| East Asia | Area Average | O* | 0.062 | Hammer et al 2006; Karafet et al 2001; Kayser et al 2003; Li et al 2008b; Sengupta et al 2006; Xue et al 2006 |
| East Asia | Area Average | O1a | 0.157 | Hammer et al 2006; Karafet et al 2001; Li et al 2008b; Sengupta et al 2006; Xue et al 2006 |
| East Asia | Area Average | O2 | 0.177 | Hammer et al 2006; Karafet et al 2001; Li et al 2008b; Sengupta et al 2006; Xue et al 2006 |
| East Asia | Area Average | O3 | 0.277 | Li et al 2008b; Su et al 2005; Hammer et al 2006 |
| East Asia | China | O* | 0.001 | Capelli et al 2001; Kayser et al 2003 |
| East Asia | China | O1a | 0.438 | Capelli et al 2001; Karafet et al 2005; Kayser et al 2003; Li et al 2007 |
| East Asia | China | O2 | 0.151 | Capelli et al 2001; Karafet et al 2005; Kayser et al 2003; Li et al 2007 |
| East Asia | China | O3 | 0.297 | Capelli et al 2001; Karafet et al 2001, 2005; Kayser et al 2003; Li et al 2007; Sengupta et al 2006; Su et al 2005; Xue et al 2006 |
| East Asia | Hainan Island | O1a | 0.316 | Li et al 2008a |
| East Asia | Hainan Island | O2 | 0.592 | Li et al 2008a |
| East Asia | Hainan Island | O3 | 0.068 | Li et al 2008a |
| East Asia | Japan | O* | 0.000 | Hammer et al 2006; Nonaka et al 2007 |
| East Asia | Japan | O1a | 0.017 | Hammer et al 2006; Nonaka et al 2007 |
| East Asia | Japan | O2 | 0.330 | Hammer et al 2006; Nonaka et al 2007 |
| East Asia | Japan | O3 | 0.183 | Hammer et al 2006; Nonaka et al 2007 |
| East Asia | Korea | O* | 0.320 | Kayser et al 2003 |
| East Asia | Korea | O1a | 0.036 | Kayser et al 2003; Kim et al 2006 |
| East Asia | Korea | O2 | 0.142 | Kayser et al 2003; Kim et al 2006 |
| East Asia | Korea | O3 | 0.320 | Kayser et al 2003; Kim et al 2006 |
| East Asia | Taiwan | O* | 0.038 | Capelli et al 2001; Kayser et al 2003 |
| East Asia | Taiwan | O1a | 0.612 | Capelli et al 2001; Kayser et al 2003; Li et al 2008b |
| East Asia | Taiwan | O2 | 0.055 | Capelli et al 2001; Kayser et al 2003; Li et al 2008b |
| East Asia | Taiwan | O3 | 0.226 | Capelli et al 2001; Kayser et al 2003; Li et al 2008b |
| India | Area Average | O* | 0.002 | Kumar et al 2007; Sahoo et al 2006 |
| India | Area Average | O2 | 0.255 | Kumar et al 2007; Sahoo et al 2006; Sengupta et al 2006 |
| India | Area Average | O3 | 0.048 | Kumar et al 2007; Sahoo et al 2006; Sengupta et al 2006 |
| Middle East | Pakistan | O2 | 0.024 | Firasat et al 2007; Mohyuddin et al 2006 |
| Middle East | Pakistan | O3 | 0.013 | Firasat et al 2007; Sengupta et al 2006 |
| Middle East | Turkey | O* | 0.002 | Sahoo et al 2006 |
| Middle East | Turkey | O2 | 0.000 | Sahoo et al 2006 |
| Middle East | Turkey | O3 | 0.001 | Cinnioglu et al 2004; Sahoo et al 2006 |
| Near Oceania | | O* | 0.000 | Kayser et al 2003 |
| Near Oceania | | O1a | 0.006 | Kayser et al 2003 |
| Near Oceania | | O2 | 0.000 | Kayser et al 2003 |
| Near Oceania | | O3 | 0.023 | Kayser et al 2003 |
| Remote Oceania | | O* | 0.004 | Capelli et al 2001; Hammer et al 2006; Kayser et al 2003; Scheinfeldt et al 2006 |
| Remote Oceania | | O1a | 0.090 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2005; Kayser et al 2003; Scheinfeldt et al 2006 |
| Remote Oceania | | O2 | 0.050 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2005; Kayser et al 2003 |
| Remote Oceania | | O3 | 0.098 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2005; Kayser et al 2003; Scheinfeldt et al 2006 |
| Siberia | | O1a | 0.181 | Lell et al 2002 |
| South Asia | | O* | 0.000 | Hammer et al 2006 |
| South Asia | | O1a | 0.000 | Hammer et al 2006; Karafet et al 2005 |
| South Asia | | O2 | 0.079 | Hammer et al 2006; Karafet et al 2005 |
| South Asia | | O3 | 0.003 | Hammer et al 2006; Karafet et al 2005 |
| Southeast Asia | Area Average | O* | 0.068 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2001; Kayser et al 2003; Li et al 2008b; Sahoo et al 2006 |
| Southeast Asia | Area Average | O1a | 0.182 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2001, 2005; Kayser et al 2003; Li et al 2008b |
| Southeast Asia | Area Average | O2 | 0.227 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2001, 2005; Kayser et al 2003; Li et al 2008b; Sahoo et al 2006 |
| Southeast Asia | Area Average | O3 | 0.271 | Capelli et al 2001; Hammer et al 2006; Karafet et al 2001, 2005; Kayser et al 2003; Li et al 2008b; Sahoo et al 2006 |
| Southeast Asia | Indonesia | O* | 0.025 | Kayser et al 2003 |
| Southeast Asia | Indonesia | O1a | 0.155 | Hammer et al 2003; Karafet et al 2005 |
| Southeast Asia | Indonesia | O2 | 0.590 | Karafet et al 2005 |
| Southeast Asia | Indonesia | O3 | 0.064 | Hammer et al 2003; Karafet et al 2005; Kayser et al 2003 |
| Southeast Asia | Malaysia | O* | 0.000 | Kayser et al 2003 |
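
The difference between the frequencies listed in the table and the relative frequencies plotted in Figure 6 can be illustrated with a short calculation.  The Python sketch below assumes that a relative frequency is each subclade's share of the combined frequency of the four main subclades in a population; that reading is inferred from the table and figure captions rather than stated in them, and the function name is hypothetical.  The Japan values are copied from the table.

```python
# Frequencies of the four main subclades in Japan, copied from the table.
japan = {"O*": 0.000, "O1a": 0.017, "O2": 0.330, "O3": 0.183}


def relative_frequencies(absolute):
    """Rescale subclade frequencies so they sum to 1 within Haplogroup O."""
    total = sum(absolute.values())
    if total == 0:
        # No Haplogroup O chromosomes detected: every share is zero.
        return {sub: 0.0 for sub in absolute}
    return {sub: freq / total for sub, freq in absolute.items()}


rel = relative_frequencies(japan)
for sub, share in rel.items():
    print(f"{sub}: {share:.3f}")
```

For Japan the four values sum to 0.530, so under this assumption O2 accounts for roughly 62% of the Haplogroup O chromosomes detected there, even though it is found in only 33% of the sampled men.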

Need to cite this tutorial in your essay, paper or website? Use the following format:

Learn about Y-DNA Haplogroup O. Genebase Tutorials. Retrieved November 25, 2014, from http://www.genebase.com/learning/article/24