1. Background info: The story so far

Ancient Caucasoids (or ‘West Eurasians’) are the earliest recorded inhabitants of the Tarim Basin in northwestern China. Their burials are incredibly well-preserved due to environmental mummification caused by the region’s harsh climate. The earliest mummies are dated to around 4000 years ago and the latest to around 2000 years ago.

The earliest mummies were found to be “exclusively Caucasoid”, with stereotypically European features, including tall stature, high cheekbones, deep-set eyes, “fair hair, long nose[s], elongated skulls, high cranial vaults, etc” [Christopher P. Thornton]. From around 1000 BC onwards, some mummies in the Tarim’s eastern regions begin to show East Asian phenotypes, but most of the mummies are of West Eurasian origin.

It was once widely assumed that these Caucasoid mummies descended from two Indo-European populations: the Tocharian-speaking Afanasievo and the Indo-Iranian-speaking Andronovo, who migrated into the Tarim Basin during the Bronze Age from the north and west, respectively. Both languages and cultures are unequivocally attested in the region.

Many Chinese sources dating to the Imperial Era describe people with Northern European phenotypes living in the northern, central, and western regions of what is now China:

“The Great Yuezhi are located about seven thousand li [2,910 km] north of India. […] The skin of the people there is reddish white.”
— The Western Regions, Wan Zhen (3rd century AD)

“Among the barbarians in the Western Regions, the look of the Wusun is the most unusual. The present barbarians who have green eyes and red hair, and look like macaque monkeys, are the offspring of this people.”
— Commentary on the Book of Han, Yan Shigu (7th century AD)

Below is a macaque with light hair, light eyes, and rosy skin. It's easy to see why the Chinese associated them with blond-bearded European barbarians.

In 2021, a groundbreaking genetic study revealed that the earliest mummies of the Gumugou, Xiaohe, and Beifang cemeteries (dated 2100 to 1700 BC) were neither Tocharian nor Indo-Iranian, but a native population of Ancient North Eurasian descent. This debunked the theory that the earliest Tarim mummies were Indo-European, but showed that a population of Ancient North Eurasian descent could have a Caucasoid phenotype easily mistaken for European.

Below: Early Xiaohe mummies. Are these the faces of the Ancient North Eurasians?

2. New study data

A new study on the Xinjiang region was released this March. It features a handful of samples from five Tarim Basin sites, dating to the Iron Age (1000 BC to 1 BC) and Historical Era (1 AD to present). Although the study doesn’t feature any samples from the Bronze Age (when the demographic transition from Ancient North Eurasian to Indo-European occurred), it does show that the Tarim Basin was settled en masse by Indo-Europeans and that Indo-European ancestry formed the “core component” of Tarim ancestry during the Iron Age and Historical Era.

Unfortunately, the study’s admixture analysis (below) is an incomprehensible mess. The samples aren’t labeled correctly, and it uses an unnecessarily large number of source populations, some of which nest inside one another for no apparent reason. For example, “Xinj_LBA1” is 100% Andronovo, so including it as a separate source is redundant. At least the maps on the right provide a handy summary of Xinjiang’s demographic transitions.

The Tarim Basin was pretty heterogeneous during the Iron Age and Historical Era due to its leading role in Silk Road trade. Levels of East and West Eurasian ancestry varied wildly between and even within populations, likely a result of cosmopolitan immigration, traveling merchants, invasions, and so on.

Artistic depictions of Tarim inhabitants display a variety of Eurasian phenotypes during the Historical Era.

3. Simplified admixture analysis

The admixture analyses below were created using Vahaduo software and the study’s raw data. They are designed to provide a broad overview of Tarim genetics, so the number of source populations has been limited to the main ancestral components of the region:

  1. Tarim Ancient North Eurasian
  2. Indo-European (Afanasievo and Andronovo)
  3. BMAC (native Iranian farmers, not Indo-European)
  4. East Asian (Siberian and Chinese)
  5. Harappan or Indus Valley Civilization
  6. Ancient Ancestral South Indian
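
For readers unfamiliar with this kind of modelling, the general idea is to describe a target sample as a weighted mix of source populations in some coordinate space, then search for non-negative weights summing to 100% that bring the mix as close as possible to the target. The sketch below illustrates that idea with a plain grid search. It is not Vahaduo's actual algorithm, and the source names and 3-dimensional coordinates are toy values invented purely for illustration, not data from the study.

```python
from itertools import product

def fit_mixture(target, sources, step=0.05):
    """Grid-search source proportions (summing to 1) that minimise the
    Euclidean distance between the weighted source mix and the target."""
    names = list(sources)
    n_steps = round(1 / step)
    best_fit, best_dist = None, float("inf")
    # Enumerate step counts for all but the last source; the last source
    # takes whatever share is left so the weights always sum to 1.
    for counts in product(range(n_steps + 1), repeat=len(names) - 1):
        last = n_steps - sum(counts)
        if last < 0:
            continue
        weights = [c / n_steps for c in (*counts, last)]
        mix = [
            sum(w * sources[name][i] for w, name in zip(weights, names))
            for i in range(len(target))
        ]
        dist = sum((m - t) ** 2 for m, t in zip(mix, target)) ** 0.5
        if dist < best_dist:
            best_fit, best_dist = dict(zip(names, weights)), dist
    return best_fit, best_dist

# Toy coordinates (hypothetical, for illustration only).
toy_sources = {
    "Source_A": [0.10, 0.02, -0.03],
    "Source_B": [0.12, -0.05, 0.01],
    "Source_C": [-0.08, 0.04, 0.06],
}
# Build a target that is exactly 20% A, 50% B, 30% C.
toy_target = [
    0.2 * a + 0.5 * b + 0.3 * c for a, b, c in zip(*toy_sources.values())
]

fit, distance = fit_mixture(toy_target, toy_sources)
print(fit, distance)
```

Limiting the number of sources, as done here, keeps the search small and the result readable; with many overlapping sources, several quite different weightings can fit almost equally well.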

3.1. XSQG Xianshuiquangucheng and XKKD Xikakandasayi (Historic Era)
Two individuals from Historic Era sites that fell within the proposed distribution of Tocharian languages, but are unlikely to represent actual ethnic Tocharians.
Average autosomal genetics: 10% Ancient North Eurasian (ANE), 30% Indo-European (IE), 15% BMAC, 40% East Asian (EA)

3.2. ZGLK Zaghunluq (Iron Age)
This site falls within the distribution of Tocharian C. It was home to many renowned Caucasoid mummies (most of which predate these samples), including the red-haired Cherchen Man. These samples may include ethnic Tocharians and Iranians.
Average autosomal genetics: 10% ANE, 35% IE, 10% BMAC, 40% EA

3.3. LSH Liushui (Iron Age)
This site falls within the ‘Hetian’ region of the Tarim Basin, which was home to Buddhist Scythians known as the Khotanese Sakas. The sample size is tiny (two individuals) and may not be representative of the site.
Average autosomal genetics: 20% ANE, 30% IE, 25% BMAC, 25% EA

3.4. HET Hetian (Historic Era) and SPL Shanpula (Historic Era)
These two Khotanese sites shared cultural affinities with the ZGLK site. This homogeneous population is likely representative of this Khotanese subregion.
Average autosomal genetics: 5% ANE, 50% IE, 30% BMAC, 20% EA

3.5. JEZK Jirzankal (Iron Age)
Religious practices at this site resemble Zoroastrianism. It was likely inhabited by Eastern Iranians, e.g. Sakas/Scythians or Sogdians. Again, the homogeneous samples indicate a representative population.
Average autosomal genetics: 15% ANE, 35% IE, 25% BMAC, 20% EA

4. Conclusion

So, in summary, the Tarim Basin was a bustling trade hub that acted as a bridge between West and East Eurasia. The samples featured in this article were, on average:

  • 10% Ancient North Eurasian
  • 35% Indo-European
  • 20% BMAC (native Iranian but not Indo-Iranian)
  • 30% East Asian
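
These summary figures can be reproduced from the per-site averages listed in section 3. The sketch below assumes an unweighted mean of the five site groups, rounded to the nearest 5%. That weighting is an assumption about how the summary was derived, and the short group labels are only for illustration.

```python
# Per-site averages (percent) as listed in section 3 of this article.
# Component order: ANE, IE, BMAC, EA.
COMPONENTS = ["ANE", "IE", "BMAC", "EA"]
site_averages = {
    "XSQG+XKKD": [10, 30, 15, 40],
    "ZGLK":      [10, 35, 10, 40],
    "LSH":       [20, 30, 25, 25],
    "HET+SPL":   [5, 50, 30, 20],
    "JEZK":      [15, 35, 25, 20],
}

def round_to_nearest_5(value):
    """Round a percentage to the nearest multiple of 5."""
    return 5 * round(value / 5)

# Assumption: every site group gets equal weight, regardless of sample count.
raw_means = {
    name: sum(row[i] for row in site_averages.values()) / len(site_averages)
    for i, name in enumerate(COMPONENTS)
}
summary = {name: round_to_nearest_5(mean) for name, mean in raw_means.items()}

print(raw_means)  # unrounded means
print(summary)    # rounded to the nearest 5%
```

Under that assumption the unrounded means are 12%, 36%, 21% and 29%, which round to the 10/35/20/30 split quoted above. The rounded figures sum to 95% rather than 100%, as do several of the per-site rows.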

No modern population has a similar set of ancestries to this average. The closest in terms of West vs East Eurasian ancestry (70:30) are Western Uralic and Turkic peoples, like Udmurts and Tatars.
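
The 70:30 figure follows from the summary percentages if ANE, Indo-European and BMAC are all counted as West Eurasian and the total is rescaled to 100%. The grouping and the rescaling are assumptions made here for illustration:

```python
# Summary percentages from the conclusion of this article.
summary = {"ANE": 10, "IE": 35, "BMAC": 20, "EA": 30}

# Assumption: ANE, IE and BMAC count as West Eurasian, EA as East Eurasian.
west = summary["ANE"] + summary["IE"] + summary["BMAC"]
east = summary["EA"]

# The four components sum to 95, so rescale to shares of the listed total.
total = west + east
west_share = 100 * west / total
east_share = 100 * east / total

print(f"West {west_share:.1f}% : East {east_share:.1f}%")
# prints "West 68.4% : East 31.6%"
```

That is close enough to 70:30 for a rough comparison with modern populations.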


The closest in terms of descending from similar ancestral populations are the East Iranian Tajiks, who are a 50/50 mix of Andronovo and BMAC, with some additional Indian and East Asian ancestry. They are phenotypically diverse and can look like any ethnicity from Northern Indian to Icelandic.


Hopefully we’ll see a study on the Tarim Bronze Age soon. It would be nice if they tested some of the more renowned mummies, like the Princess of Xiaohe. Then we could match ancestry proportions to phenotypes without relying on educated guesswork.