Someone asked me to do a summary of the genetics of India so I figured that I may as well throw in a couple of the surrounding countries too.
2. Indigenous South Asians
3. Iranian Hunter-Gatherers
4. East Asians
5. European Pastoralists
6. Admixture Analysis
South Asia is one of the most heterogeneous regions on earth. It is a true “melting pot” where all of the world’s races have steadily converged over thousands of years, if not tens of thousands. As such, the region exhibits a wide array of phenotypic diversity. Some of its native inhabitants could even be mistaken for Sub-Saharan Africans, by the untrained eye.
Untangling the genetic history of South Asia sounds like a monumentally complex task but it can be summarized quite simply by breaking things down into their basic components. South Asians descend from a handful of ‘basal’ populations and a few major migrations, which are covered in detail below.
2. Indigenous South Asians
India and the surrounding regions were originally inhabited by dark-skinned hunter-gatherers, named “Ancient Ancestral South Indians” (AASI) by geneticists. They were most similar to modern populations identified as ‘Australo-Melanesians’ by pre-21st-Century anthropologists, which include the “Negrito” Andamanese Islanders (e.g. Onge and Jarawa), Semangs and Bateks of Malaysia, Aetas of the Philippines, Vedda of Sri Lanka, and tribal populations of India, including the Munda, Gondi, and Irula. These groups have genetic affinities with Australian Aboriginals and Melanesians (hence the name).
The exact origin of this ‘Australo-Melanesian’ (or “South Eurasian”) population is currently unknown. Some scientists theorize that they belonged to an earlier Out of Africa expansion, distinct from that which gave rise to East and West Eurasians (i.e. East Asians and Caucasoids), while others believe that they represent an early ‘basal’ East Eurasian population that predates the East Asian “Mongoloid” phenotype.
“AASI” related ancestry forms the “base layer” of all South Asian genetics and is universal throughout the region. It can even be found in some populations of Southern East Asia, like the Yi, Naxi, and Tibetans. However, this ancestry peaks in South India, Sri Lanka, and among the aforementioned tribal populations.
“Negritos,” such as the Onge (below), can be genetically modeled as a mix of Papuan and East Asian related ancestries or as an early split from East Asians.
The Paniya are probably the best proxy for “AASI” genetics, as they have limited admixture from East and West Eurasians.
3. Iranian Hunter-Gatherers
Hunter-gatherers from ancient Iran made significant contributions to the South Asian gene pool. They migrated into the region during the Neolithic, around 5500 BC, intermixing with the native South Asian hunter-gatherers to varying degrees. This mixed population formed the Indus Valley or Harappan Civilization in northwest India, which may be the origin of the Dravidian languages.
Iranian hunter-gatherers’ closest living relatives are the Brahuis and Balochis of Iran, Afghanistan, and Pakistan, who trace almost half of their ancestry to this ancient population. The hunter-gatherers were genetically distinct from the later farming populations of Iran, who had additional ancestry from Anatolian Neolithic Farmers.
Although Iranian hunter-gatherer ancestry is common throughout South Asia, it is notably absent from some smaller, tribal populations, such as the Munda, as well as the East Asian peoples who inhabit the region’s peripheries.
4. East Asians
East Asian related ancestry arrived in South Asia during the Neolithic with the migration of Austroasiatic and Tibeto-Burman speaking peoples from Southeast and East Asia respectively. Austroasiatic peoples were among the first East Asians to migrate into Southeast Asia, where they displaced the native Hoabinhian hunter-gatherers, who were more closely related to the Onge than to modern Southeast Asians.
Recent genetic studies have found that the Sino-Tibetan language family likely originated in Northern China among the Yellow River millet farmers, while the Austric languages likely originated among the Yangtze River rice farmers of Southern China. These ancient populations were most similar to modern people from Northern China and Taiwan, respectively.
East Asian ancestry is mainly concentrated in the northern and eastern fringes of South Asia. The populations of Tibet, Burma, and Bhutan are almost entirely of East Asian descent, while Nepal is significantly more heterogeneous. East Asian ancestry is almost non-existent in India and Pakistan but can be found among the Austroasiatic-speaking tribal populations, like the Munda, who seem to be distributed quite randomly across the northeast of India.
Tibetan kids wearing fancy hats:
A Munda woman with subtle East Asian features:
5. European Pastoralists
The final major contribution to South Asian ancestry occurred during the Bronze Age. The Indus Valley Civilization began to decline around 1900 BC and by 1700 BC many Indus cities had been completely abandoned. Meanwhile, the Northern European population was exploding. European steppe pastoralists migrated into Central Asia from Eastern Europe around 2000 BC and expanded southwards via the Inner Asian Mountain Corridor. By 1500 BC, Europeans of the Andronovo Culture had entered South Asia via the Hindu Kush and begun to conquer the collapsing Indus Valley Civilization, gradually intermixing with its native inhabitants. They introduced Indo-European languages and culture to the region and recorded their conquest in one of the world’s oldest religious texts, the Rigveda.
The above narrative is quite unpopular in India today (and in the “woke” West), as it conjures up recent memories of colonization and subjugation. Nevertheless, the European origin of the Indo-Aryans and the invasion itself was conclusively proven by genetic evidence in 2018. Leading Harvard geneticist David Reich stated that “the population that contributed genetic material to South Asia was (roughly) ~60% Yamnaya, ~30% European farmer-like ancestry, and ~10% Central Steppe hunter-gatherer ancestry” (which refers to West Siberian Hunter-Gatherers, a population similar to modern Udmurts). The closest living relatives of the early Indo-Aryans are Northern Europeans, including Finnic, Russian, and Scandinavian peoples.
Today, this Northern European component (inappropriately described by geneticists as “steppe ancestry”) accounts for ~0-30% of South Asian ancestry, peaking in the northwestern regions and among the upper classes of India.
Northern European phenotypes can be found among the mountain-dwelling Indo-Iranians of northern South Asia. They are not representative of the average Indo-Iranian phenotype (which is stereotypically “Middle Eastern”) and their tribes are often phenotypically diverse. The earliest Indo-Aryans may have looked similar, but not identical, to the people below.
“The Aryans in the Avesta are tall, light-skinned people with light hair; their women were light-eyed, with long, light tresses… In the Rigveda light skin alongside language is the main feature of the Aryans, differentiating them from the aboriginal Dáśa-Dasyu population who were a dark-skinned, small people speaking another language and who did not believe in the Vedic gods… Skin color was the basis of social division of the Vedic Aryans; their society was divided into social groups varṇa, literally ‘color’. The varṇas of Aryan priests (brāhmaṇa) and warriors (kṣatriyaḥ or rājanya) were opposed to the varṇas of the aboriginal Dáśa, called ‘black-skinned’”
— The Origin of the Indo-Iranians, Kuzmina (2007), p. 172.
The closest modern populations to Indo-Aryans from Uzbekistan, Tajikistan, and Kazakhstan:
6. Admixture Analysis
Below is a simplified admixture analysis featuring various South Asian populations. Do not take this as gospel; it is simply a rough guide. The source populations are:
- Andronovo from the Fergana Valley in Uzbekistan
- Neolithic Iranian Hunter-Gatherers
- Paniya used as a proxy for indigenous South Asian hunter-gatherers (AASI)
- West Siberian Hunter-Gatherers from the Tarim Basin
- Anatolian Neolithic Farmers
- Levantine Hunter-Gatherers
- Caucasus Hunter-Gatherers
- Three East Asian populations:
– Amur / Siberian
– Yellow River / North China
– Fujian / South China
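For readers unfamiliar with how a model like this produces percentages, here is a minimal sketch of the general idea: each population is a point in some coordinate space (PCA coordinates, for example), and the fit looks for non-negative proportions summing to 1 whose weighted mix of source populations lands closest to the target. This is an illustration only, not necessarily the tool or method used for the analysis above. The coordinates, the three-source setup, and the name `fit_admixture` are all hypothetical, and a brute-force grid search stands in for the faster optimisers that real tools use.

```python
from itertools import product


def fit_admixture(target, sources, step=0.01):
    """Grid-search mixing proportions (non-negative, summing to 1) that
    minimise the squared distance between the target coordinates and the
    weighted mix of the source coordinates."""
    names = list(sources)
    n_steps = round(1 / step)
    best_weights, best_dist = None, float("inf")
    # Enumerate step counts for all but the last source; the last source
    # takes the remainder, so the proportions always sum to exactly 1.
    for counts in product(range(n_steps + 1), repeat=len(names) - 1):
        remainder = n_steps - sum(counts)
        if remainder < 0:
            continue
        weights = [c / n_steps for c in counts] + [remainder / n_steps]
        mix = [
            sum(w * sources[name][i] for w, name in zip(weights, names))
            for i in range(len(target))
        ]
        dist = sum((m - t) ** 2 for m, t in zip(mix, target))
        if dist < best_dist:
            best_weights, best_dist = weights, dist
    return dict(zip(names, best_weights)), best_dist


# Hypothetical 3-dimensional coordinates, made up for illustration only.
sources = {
    "Steppe_proxy": [0.12, 0.03, -0.02],
    "Iran_N_proxy": [0.05, -0.08, 0.01],
    "AASI_proxy": [-0.04, -0.10, 0.06],
}

# Build a synthetic target that is a known 20/50/30 mix of the sources,
# so we can check that the fit recovers those proportions.
true_mix = {"Steppe_proxy": 0.2, "Iran_N_proxy": 0.5, "AASI_proxy": 0.3}
target = [
    sum(true_mix[name] * sources[name][i] for name in sources)
    for i in range(3)
]

weights, distance = fit_admixture(target, sources)
for name, w in weights.items():
    print(f"{name}: {w:.0%}")
```

Because the result depends entirely on which source populations are offered to the fit, swapping one proxy for another can shift the percentages noticeably, which is one reason different studies and calculators disagree.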
Sources:
- An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers
- The genetics of an early Neolithic pastoralist from the Zagros, Iran
- Genomic insights into the origin of farming in the ancient Near East
- Peopling History of the Tibetan Plateau and Multiple Waves of Admixture of Tibetans Inferred From Both Ancient and Modern Genome-Wide Data
- A genetic history of migration, diversification, and admixture in Asia
- Unravelling the distinct strains of Tharu ancestry
- East Asian Ancestry in India
- Extensive ethnolinguistic diversity in Vietnam reflects multiple sources of genetic diversity
- The Genomic Formation of South and Central Asia
- The genomic origins of the Bronze Age Tarim Basin mummies
Forgot to list the other studies I read, sorry.
There was supposedly the start of a Jewish migration around 8th century into the northwest Indian state of Gujarat, to escape Islamic persecution. Similar to the Jewish migrations into Europe around the same time. The Prime Minister, Home Minister and top 5 richest people in the country are of this descent. Additionally people of this race have a penchant to be diamond merchants, jewellers, traders and the sort. A disproportionate number of this tribesmen have also been known to mastermind stock market scams, loan defaults, insider trading, tax evasions and the usual financial gimmicks. They are supposed to be the defacto clan in charge of the South Asia region on behalf of the N.W.O. Just something I thought you’d like to know.
interesting, never heard of that. got any sources?
Search about the “sassoon family” also know as the “rotschilds of the east” (that alone says everything you need to know), they were the best representation of the chosen tribe in india
the height of their power was supposedly in the 18th century but im pretty sure they still hold a grip around indostan
oh yea I know about those guys but they came from Baghdad, not Indian natives
You are referring to the Patels. Unfortunately, they are not Semitic or of Jewish descent. Rather, they are a mixed Indo Aryan people, with AASI ancestry from the NW regions of South Asia (looks similar to indigenous natives of South America, identical even) and West Eurasian ancestry from Central Asia and the Steppes. Typically they are about 50-70% West Eurasian, and 30-50% AASI. Though the Brahmins, Muslim Bohra Patels and Parsis can range between 75-85% West Eurasian and 15-25% AASI, with Parsis being on the higher end around 85%, and Brahmins being around 80%-83% or so.
Good article as always Thuletide.
Got a question. Reich says in his book that “The ANI were a mixture of about 50 percent steppe ancestry related distantly to the Yamnaya, and 50 percent Iranian farmer–related ancestry from the groups the steppe people encountered as they expanded south.” Is it possible the mixing was not with IVC people, who carried some AASI, but with BMAC?
I have a feeling the 50-50 is an oversimplification because it doesn’t mention the Anatolian-related ancestry Proto-Indo-Iranians picked up before migrating, as Reich himself says in the quote you cite. Did he probably mean in his book that ANI was 50% Corded Ware and 50% Iranian-related?
I would personally like to see a similar article to this one on the history and origin of East and Southeast Asians (East Eurasians). I’ve made some progress but it’s a very murky area.
Also, can you make a post explaining how to read various statistics used in genetics studies, like f4? Thanks a lot for the great work.
no, the Indo-Aryans who migrated into South Asia didn’t have any BMAC ancestry, or if they did it was very small, like under 5%.
I think by “related distantly to the Yamnaya” he means Corded Ware but idk why he wouldn’t just say that, since the two groups are distinct because CW had 30% more EEF ancestry than Yamnaya
This study is the best summary of East Asian genetics so far: http://www.pivotscipub.com/hpgg/2/1/0001 I don’t agree with some minor details but it’s pretty good, especially from Neolithic onwards. The early eras are a mystery due to a lack of genetic samples, but it seems to me that Southeast Asia was initially sparsely populated by Australoid type people who came from an earlier OOA expansion, then the East Eurasians arrived later and partially displaced them and partially mixed with them. This created the Hoabinhian-type peoples. Then a different bunch of East Asians moved into Southeast Asia from East Asia and further displaced the Hoabinhian-type people.
By the way, I think the calculated AASI percentage is way too high for these populations. According to this paper https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4760789/, Gujarati Brahmins are 88% ANI and only ~8% ASI, with the rest being Austro-Asiatic. Not to mention ASI itself was only 75% AASI, with the rest being Iranian-related.
So I think the 43% number you have is too high, and it’s probably closer to 6%, or at the very least under 10%.
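The ~6% figure comes from multiplying the two shares quoted above. A quick check, using only the numbers given in this comment (the variable names are just labels for illustration):

```python
# Figures as quoted in the comment, not independently verified here.
asi_share = 0.08        # ASI share of Gujarati Brahmins per the cited paper
aasi_within_asi = 0.75  # fraction of ASI that is AASI, per the comment

# Nested proportions multiply: AASI share = ASI share x (AASI fraction of ASI).
aasi_share = asi_share * aasi_within_asi
print(f"Implied AASI share: {aasi_share:.0%}")
```

This assumes the ANI portion carries no AASI at all, which is how the comment frames it.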
tried a bunch of different models (like using Indus Valley instead of Iran_N and using Onge instead of Paniya) but all results were pretty similar with this level of AASI ancestry, +/-10%
Thanks for the reply. I agree; it’s probably an issue with the samples. The studies I cited may have had a smaller sample size.
By the way, I’m replying to this comment because for some reason I can’t reply directly to your latest comment, only like it.
>Also, can you make a post explaining how to read various statistics used in genetics studies, like f4
*meant article instead of post.
What do you think explains the discrepancy between the ASI percentages in the study I cited in my comment and your calculated AASI percentages? (Like I said, according to Reich, ANI is 50% Iranian-related and 50% Steppe-related.)
Also, I read in the famous “Genomic insights into the origin of farming in the ancient Near East” article that “The demographic impact of steppe related populations on South Asia was substantial, as the Mala, a south Indian population with minimal ANI along the ‘Indian Cline’ of such ancestry is inferred to have ~18% steppe-related ancestry, while the Kalash of Pakistan are inferred to have ~50%, similar to present-day Northern Europeans.” The study wasn’t about that and the authors cite another famous study (Massive migration from the steppe was a source for Indo-European languages in Europe), but I couldn’t find that fact about the Kalash after a couple mins and I don’t have time to dig that deep, but I assume it has that piece of information somewhere.
What’s going on here? Why are the calculated estimates wildly different?
[Note: edited to merge your two posts]
No idea what’s going on. It could be sample selection, using different source populations for the model, and so on. Even when using the source populations from the Genomic Insights study I find that Mala has no steppe ancestry at all. India is a very diverse place so I’m guessing it’s because Global25 samples are different to the Reich Lab samples.
I’ve always had a feeling that AASIs/South Eurasians aren’t a more basal form that the Mongoloids emerged from, but a side branch of a population *both* emerged from. Might have looked like darker-skinned Ainu – I do see a resemblance to Aboriginals in the old pictures of them…
Good article. Are there any PCAs of other Homo species (Neanderthal, Denisovan, erectus) with ancient and modern populations of sapiens?
I have never seen one
I found this: https://www.researchgate.net/figure/PCA-analysis-of-North-African-Sub-Saharan-European-and-Asian-populations-Upper-right_fig5_232321487
I don’t know about neanderthals and denisovans being nearly equidistant from modern humans as chimps and it’s pretty old, there may be better ones but it’s what I could find.
>they didnt include europeans and asians in the PCA with chimps and africans
[Note: Comments merged into one reply]
Please look up Rahul Dev and Nikki Haley. What race do these two belong to? They are NW South Asians.
Your Steppe range is incorrect — the most Steppe folks in South Asia are the Rors of NW India— at 43% in the most Steppe cases found so far. In fact some research indicated even higher Steppe ancestry. Just search for the research paper published on Rors and Indus Valley Populations for the data.
Also, is it correct to state that the Indo Aryans that spread parts of Hinduism and Indo European languages to South Asia were completely distinct from modern day and ancient Iranians? I was reading an Iranian nationalist claiming that Iranians taught Indians everything because Aryans first went to Iran and then to India and that they are more Aryan than anyone else because of this. Does the genetic data bear this assertion out? As far as I know the Iranians are their own thing and have far less Steppe than NW South Asians, from a different source too I think. They also have far more Levant and Anatolian ancestry, in addition to unique West Asian ancestry that is completely absent in NW South Asians. They also possess far more East Asian admixture and only minor AASI admixture in most cases, along with minor SSA admixture in other cases.
Here are the links to the paper data: https://www.cell.com/cms/10.1016/j.ajhg.2018.10.022/attachment/711bfd84-99c1-44e6-8b40-fbde264460f6/mmc1
You can read the results of the paper by looking up the paper by title on Google/PubMed
Also can you please run the model on Rors, Haryana Jats, Rajasthani Jats, Khatris, Sindhis, West UP Jats, Hindu Jats, Aroras, Kohistani, Kho, Nuristani, Pashayi, Dameli, Burusho, Pathan, Arain, Pashtun_Tanoli, Pashtun_Kandahar and Chitrali please?
These are key NW South Asian populations missing from your analysis and who have the least AASI and the most ‘Aryan’ ancestry in South Asia as a result. They all range from 85-90% West Eurasian ancestry according to analyses. Thank you!
>Rahul Dev and Nikki Haley
Both West Eurasian, very low South Eurasian ancestry
>the most Steppe folks in South Asia are the Rors of NW India
Yes, you are correct, it looks like Rors are around 40%, +/- a few %
>is correct to state that the Indo Aryans that spread parts of Hinduism and Indo European languages to South Asia were completely distinct from modern day and ancient Iranians
the Indo-Aryans were originally Northern European, so they were distinct from modern day Iranians but very similar to ancient Iranic people who migrated into Iran.
>Iranian nationalist claiming that Iranians taught Indians everything because Aryans first went to Iran and then to India
This is not true, Indo-Aryans arrived first in India and the Middle East (Mitanni Kingdom). Aryans arrived in Iran many centuries later.
>and that they are more Aryan than anyone else because of this. Does the genetic data bear this assertion out?
No, this is not true in a genetic sense. You are correct that they have less Aryan ancestry than Indians.
>Also can you please run the model on [x,y,z]
I don’t have access to samples for all of these populations, but I can run the model on some of them. Results:
Thanks for the models. It seems the AASI is a little high for some of these populations, but that is an artifact of the fact that the Paniya_AASI component is around 30-33% West Eurasian, so that needs to be subtracted from the total score to get the actual AASI admixture. Also, since AASI as we know it currently sits on the Western Boundary of East Eurasian proper in cluster maps, it appears to have some substantial West Eurasian admixture within it yet to be delineated. You would also be interested to know that there are multiple streams of AASI, and the AASI in the NW lack a component present in the Southern strain, that models best when a Hoabinhian source is added to the South AASI source. This indicates that NW South Asians are purely East Eurasian admixed, and that Southern Dravidians in most cases might possess some additional real Australasian admixture not present in all other South Asians. I would wait for more ancient autosomal DNA from the Bronze Age and earlier periods in South and Central Asia.
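As a rough illustration of the correction described above: if the Paniya proxy is itself 30-33% West Eurasian, only the remaining 67-70% of whatever share the model assigns to it is actually AASI. The sketch below reads "subtracted" as scaling the proxy share down by that fraction, which is an interpretation rather than an established procedure. The 40% modelled share is a hypothetical input, and `corrected_aasi` is a made-up helper name.

```python
def corrected_aasi(proxy_share, west_eurasian_in_proxy):
    """Scale a modelled proxy share down to the part that is actually AASI,
    given the fraction of the proxy that is West Eurasian."""
    return proxy_share * (1.0 - west_eurasian_in_proxy)


# Hypothetical modelled share assigned to the Paniya proxy for some population.
modelled_paniya_share = 0.40

# Range of West Eurasian content in the proxy, as stated in the comment.
low = corrected_aasi(modelled_paniya_share, 0.33)
high = corrected_aasi(modelled_paniya_share, 0.30)
print(f"AASI after correction: {low:.1%} to {high:.1%}")
```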
BTW, would you mind linking the source for where you got the collage of photos of the phenotypes above? Specifically the collage of Nuristanis, Kalash, Pamiris, etc? There appears to be a labeling error with some of the pictures, often certain Tajik individuals are passed off as Kalash and/or Nuristani and certain Tajiks and Indo Aryans and Pamiris are mistakenly labeled as Pashtun, which appears to be the case here as well. BTW you might be surprised to learn that many of these Indo Aryan and Indo Iranian phenotypes still exist throughout the NW South Asia region, albeit uncommon even among Indo Aryan tribes like the Kalash and the Ror.
the photos are from traveler albums on Flickr, I can correct the mistakes if you tell me them
Thank you for replying, would you mind linking the exact Flickr album? I can then ID these images with proof. Thanks.