06 October 2010

New Language Discovered

Linguists announced that they had discovered Koro, a language spoken by about a thousand people that had never before been included in lists of world languages.

Who Are The Koro Speakers?

Koro has been regarded very different from rest of the Tibeto-Burman languages as it does not match any other from the family. Aka, the main language of that area is also very rare. Koro makes the world’s count of known languages to 6909. . . . The speakers of the language are a subtribe to the larger 10,000-person Aka tribe.

Koro is spoken in Northeast India, specifically in the Thrinzo area of the West Kameng and East Kameng districts of the state of Arunachal Pradesh. These districts are in the Northwest part of that state, near the Chinese border, in an area that China claims as part of its Tibet Autonomous Region. A 2005 research report cited in Ethnologue, which shows that the language was noticed before the 2008 National Geographic expedition, suggested that it may be spoken only in East Kameng, while National Geographic reports that it is also spoken in West Kameng. These two districts combined have about 130,000 people, of whom the 10,000 members of the Aka tribe, which includes the Koro speakers, are one of many minority populations.

Koro had not previously been discovered because outsiders had assumed that its speakers spoke Aka, the language of their neighbors with whom they share a tribal affiliation and ethnic self-identity, whose speakers are similar culturally, but not particularly close linguistically, to Koro speakers. The Aka language is also known as the Hruso language, in part to prevent confusion with an African language of the same name. A map locating them is found on page 179 of "The Tibeto-Burman languages of Northeastern India" by Robbins Burling, which is Chapter 11 (pages 169-191) of "The Sino-Tibetan languages," a book edited by Graham Thurgood and Randy J. LaPolla (2003) and published by Psychology Press.

Their Aka tribe is one of at least eight minority tribes in the area. Unlike the Monpa, the majority tribe in the district, which makes up 78% of the local population and is Buddhist, "the Aka, Khowa, and Miji have indigenous religions and those tribe members follow a mix of Buddhism, Hinduism, and Donyi-Polo (a form of Animism)." The group's lack of full assimilation to Buddhism and Hinduism, whose origins can be traced with reasonable specificity in history and probably post-date Donyi-Polo practice, may help explain why they were not assimilated to the extent that they would lose their language.

The fact that the area is agriculturally marginal, with local populations engaging in slash and burn agriculture supplemented by hunting and by considerable gathering of local wild plants, also probably helps explain the limited amount of outside influence that they have experienced.

Control of the area historically shifted among the Mon kingdoms that ruled much of modern-day Burma (specifically, the Thaton Kingdom (9th century–1057), the Hanthawaddy Kingdom (1287–1539), and the Restored Hanthawaddy Kingdom (1740–1757)), Tibet, and the Ahom kingdom (1228–1826) in parts of present-day Assam, which is now one of the states of Northeast India, with local Aka and Nishi chiefs ruling when other kingdoms did not. Since 1950, the area has also been home to many Tibetan refugees from China, which makes sense because Tibetan Buddhism has had a strong foothold in the area since the 7th century. So, the fact that people speaking only distantly related languages could live together in the same cluster of villages isn't as surprising as it would otherwise seem.

Koro Is An Endangered Language

Koro's days as a living language may be numbered, however, because "the language has not been written down and most of the speakers are over the age of 20." (English script is being used for the current transcriptions.) There are few young native speakers of the language because the region is finally starting to come into closer contact with, and under more intensive rule by, people from the larger world. As the Wikipedia article on Arunachal Pradesh explains:

[T]he Indo-European languages Assamese, Bengali, English, Nepali and especially Hindi are making strong inroads into Arunachal Pradesh. Primarily as a result of the primary education system - in which classes are generally taught by Hindi-speaking immigrant teachers from Bihar and other Hindi-speaking parts of northern India - a large and growing section of the population now speaks a semi-creolized variety of Hindi as its mother tongue.

Why Does The Discovery Of Koro Matter?

The people who brought the existence of Koro to the world's attention, funded by National Geographic, obviously think this is a big deal. Those involved have written a book called The Last Speakers: The Quest To Save The World's Most Endangered Languages. Saving them is probably an unrealistic goal, since most are moribund and there are powerful social and economic pressures for the young to abandon these languages. But the task of at least documenting endangered languages before they are gone forever is one where the resources available are modest compared to the scale of the task. Probably half of all languages spoken today will no longer be spoken by native speakers a century from now.

In the field of linguistics, outlier languages can recast the likelihood of competing theories about the histories of whole language families. This discovery is unlikely to shake up the already muddled family tree of Tibeto-Burman languages.

Indeed, without detracting from the good work being done in this case, it is fair to say that linguistics is a fairly mature field academically. A lot of the broad, easy conclusions that can be reached from the study of languages alone to classify them and understand the broad outline of their origins have already been reached. Even when new languages are found, it is rare that they contain grammatical features that aren't found in many other, often closely related languages. There are isolated places in the world like Paraguay, Papua New Guinea and the highlands of South Asia, where new languages are still being discovered in large numbers and there are blank spots in our understanding.

But, it is fair to say that almost all languages that are spoken by large numbers of people (probably all spoken by tens of thousands of people or more) have been discovered and described to some extent, and that the vast majority of undiscovered languages are moribund. Many of the blank spots that remain in linguistic efforts to classify languages will probably always have question marks attached to them, because the languages died or were so heavily influenced by neighboring languages before they were committed to writing or documented by outsiders that there will never be enough information available to definitively classify them.

Likewise, one of the biggest unanswered questions in linguistics today is to determine what deeper connections exist between the language families whose classification as families can be established with confidence. But, since those connections lie almost entirely before the development of writing, and languages can diverge from each other or prior versions of the same language, to the point of being almost unintelligible, through ordinary day to day language evolution, over time frames on the order of 400-1000 years, discerning the deep history of linguistic origins from the oldest available written sources, the oldest recorded oral accounts, and contemporary speakers of languages can reach back only so far based on linguistic evidence alone. Even in cases where we know from history that two different languages had common origins and have documentation of the proto-language (e.g. Sanskrit and Latin), it would often be very challenging to say with any confidence that those languages were related in the absence of that intermediate evidence.

This isn't to say that no progress is being made in the field. Developments in neuroscience and population genetics are allowing us to bring new evidence to bear on the big unresolved debates in linguistics. It is easier now than it once was to conclusively determine whether people who speak different languages (or different groups of people who speak the same language) share common ancestors. While neither linguistics nor genetics can offer definitive dating of migrations and key events in pre-history, and the reliability and comprehensiveness of older histories can leave something to be desired, all of these disciplines do provide insights on the timing of these events and together with archaeology can help us to develop a synthesis that is less speculative than prior efforts.

In the field of linguistics, the discovery of Koro may make more of a difference in our understanding of the conditions necessary for a language to survive, for by all rights, under conventional linguistic assumptions, one would have expected it to have died out as a language long ago. The Koro experience may also shed light on the fact that Tibeto-Burman language speaking populations in India have been profoundly more resistant to language shift than other populations in India, even compared to other "tribal" populations from the same provinces that are not Tibeto-Burman language speaking.

Could Koro Provide Insights Into India's Pre-Invasionist Languages?

Another possible contribution of Koro to linguistic knowledge is that its distinctive vocabulary, compared to other Tibeto-Burman languages, could indicate the unusually strong influence of a South Asian substrate language, or unusually pronounced loan word borrowing from such a language (depending upon whether the Koro people were indigenous or were migrants into the area), or perhaps an otherwise now extinct Tibeto-Burman substrate language, that has been all but obliterated elsewhere in India through language shifts to Indo-Aryan, Dravidian, Austroasiatic, and Tibeto-Burman languages in the last 5,500 years or so.

Where Tibeto-Burman languages are similar to each other, or to another invasionist language, there is good reason to assume that words and grammatical constructions are not indigenous. But, where they are distinctive, it is possible that they reflect the influence of a now lost language. This would be particularly true if the distinctive core seems to follow any notable pattern, as opposed to the mere culmination of localized random linguistic evolution.

Tibeto-Burman languages are well enough studied that it ought to be possible to extract the lexical and grammatical contributions of Tibeto-Burman languages to Koro with some confidence, revealing the linguistic core of Koro that makes it distinct from other Tibeto-Burman languages.

There is evidence, dated by Toba volcano ash, of human populations in India more than 74,000 years ago. But, all four of India's major language groups have far more recent origins.

Austroasiatic and Tibeto-Burman languages are almost universally believed to be invasionist to India from the East in the late Neolithic era (probably after 3500 BCE, given the timing of the development of the agricultural crops associated with Austroasiatic populations and the genetic evidence that most Northeast Indian men can trace their paternal genetic roots to East Asia within the last four thousand years). Even Out of India proponents who see the Harappan culture as the source of the proto-Indo-European languages would agree that these languages arrived in the rest of India sometime after 2500 BCE.

Likewise, there are good indications that the Dravidian languages had a unified origin in Central to Southern India sometime after 3500 BCE, coincident with the appearance of agriculture there (long after it arose in the Indus River Valley). Consider the relative unity of the Dravidian languages today, and compare it with the degree of linguistic division in the hunter-gatherer societies of Australia, Papua New Guinea and the Americas that Europeans encountered before those societies had developed agriculture. Given that contrast, it is very unlikely that Dravidian was spoken widely across South Asia prior to the development of agriculture, even if the languages spoken in South Asia at that time may have had some general commonalities with proto-Dravidian.

Yet, other than one or two tiny isolated groups, one of which has a moribund language that was recently discovered in the Himalayas and shows affinities to the peoples of the Andaman Islands, all traces of pre-invasionist languages in South Asia have vanished. The distinctive core of Koro could be one of the most coherent pieces of evidence of a lost linguistic substrate in Northeast India that still exists.

Less optimistically, researchers think that it "may have originated from a group of people enslaved and brought to the area."

Genetic evidence relevant to Northeast India

Where analysis of ancient DNA suggests that one population replaced another, that can inform our speculation about whether there was a change in language in that place as well. Genetic analysis of remains found in archaeological digs can help us to date when genetic change has taken place, and can link the people who lived at the time and place whose material culture has been unearthed to their surviving descendants today. This, in turn, can inform our guesses about the non-material culture of the people who lived there. Data points on the conditions where languages do and do not survive as relicts can inform our judgment in cases where we don't have direct evidence.

An example of this kind of synthesis of genetic and linguistic evidence involving Tibeto-Burman languages is a 2004 study by Bo Wen, et al. entitled "Analyses of Genetic Structure of Tibeto-Burman Populations Reveals Sex-Biased Admixture in Southern Tibeto-Burmans" which notes that:

[L]ittle is known about sex-biased admixture in East Asia, where substantial migrations are recorded. Tibeto-Burman (TB) populations were historically derived from ancient tribes of northwestern China and subsequently moved to the south, where they admixed with the southern natives during the past 2,600 years. They are currently extensively distributed in China and Southeast Asia.

In this study, we analyze the variations of 965 Y chromosomes and 754 mtDNAs in >20 TB populations from China. By examining the haplotype group distributions of Y chromosome and mtDNA markers and their principal components, we show that the genetic structure of the extant southern Tibeto-Burman (STB) populations were primarily formed by two parental groups: northern immigrants and native southerners. Furthermore, the admixture has a bias between male and female lineages, with a stronger influence of northern immigrants on the male lineages (62%) and with the southern natives contributing more extensively to the female lineages (56%) in the extant STBs. This is the first genetic evidence revealing sex-biased admixture in STB populations, which has genetic, historical, and anthropological implications.

Another study has looked at the mtDNA distributions of Chinese ethnic populations including Tibetans, which unlike many populations in China do not show signs of ancient population expansions, perhaps due to episodes of extreme population size reductions in the past.

A Y chromosome DNA study from 2000 in "Human Genetics" entitled "Y chromosome haplotypes reveal prehistorical migrations to the Himalayas" by Bing Su, Peter Underhill, Luca Cavalli-Sforza, et al. showed "a strong genetic affinity" among all Sino-Tibetan populations, implying a strong genetic affinity among populations in the same language family, and found evidence of a strong bottleneck effect "in the Sino-Tibetan speaking populations in the Himalayas including Tibet and northeast India . . . that occurred during a westward and then southward migration of the founding population of Tibeto-Burmans. We, therefore, postulate that the ancient people, who lived in the upper-middle Yellow River basin about 10,000 years ago and developed one of the earliest Neolithic cultures in East Asia, were the ancestors of modern Sino-Tibetan populations." Of course, those kinds of findings are particularly unlikely to apply to small outlier groups in linguistically diverse areas on the fringe of the Tibeto-Burman language speaking area like the Koro, where language acquisition for the group, or language retention when neighboring groups acquired new languages, is a plausible hypothesis.

This atypicality is supported by studies like "Ethnic India: A Genomic View, With Special Reference to Peopling and Structure" (2003) in Genome Research by Analabha Basu, et al., which reports that:

(1) there is an underlying unity of female lineages in India, indicating that the initial number of female settlers may have been small; (2) the tribal and the caste populations are highly differentiated; (3) the Austro-Asiatic tribals are the earliest settlers in India, providing support to one anthropological hypothesis while refuting some others; (4) a major wave of humans entered India through the northeast; (5) the Tibeto-Burman tribals share considerable genetic commonalities with the Austro-Asiatic tribals, supporting the hypothesis that they may have shared a common habitat in southern China, but the two groups of tribals can be differentiated on the basis of Y-chromosomal haplotypes; . . . (9) historical gene flow into India has contributed to a considerable obliteration of genetic histories of contemporary populations so that there is at present no clear congruence of genetic and geographical or sociocultural affinities.

But, a mtDNA study of tribal populations of India published in the same year noted that:

Within India, northeastern tribes are quite distinct from other groups; they are more closely related to east Asians than to other Indians. This is consistent with linguistic evidence in that these populations speak Tibeto-Burman languages of east Asian origin.

Much of the population of Northeast India, according to a 2004 population genetic study, may have relatively shallow roots, dating to within the past four thousand years:

The northeast Indian passageway connecting the Indian subcontinent to East/Southeast Asia is thought to have been a major corridor for human migrations. Because it is also an important linguistic contact zone, it is predicted that northeast India has witnessed extensive population interactions, thus, leading to high genetic diversity within groups and heterogeneity among groups. To test this prediction, we analyzed 14 biallelic and five short tandem–repeat Y-chromosome markers and hypervariable region 1 mtDNA sequence variation in 192 northeast Indians.

We find that both northeast Indian Y chromosomes and mtDNAs consistently show strikingly high homogeneity among groups and strong affinities to East Asian groups. We detect virtually no Y-chromosome and mtDNA admixture between northeast and other Indian groups. Northeast Indian groups are also characterized by a greatly reduced Y-chromosome diversity, which contrasts with extensive mtDNA diversity.

This is best explained by a male founder effect during the colonization of northeast India that is estimated to have occurred within the past 4,000 years. Thus, contrary to the prediction, these results provide strong evidence for a genetic discontinuity between northeast Indian groups and other Indian groups. We, therefore, conclude that the northeast Indian passageway acted as a geographic barrier rather than as a corridor for human migrations between the Indian subcontinent and East/Southeast Asia, at least within the past millennia and possibly for several tens of thousand years, as suggested by the overall distinctiveness of the Indian and East Asian Y chromosome and mtDNA gene pools.

Also notable is a 2009 study that compared the genetics of the Tibeto-Burman language speaking populations of Northeast India to those from Tibet and Nepal:

The Himalayan mountain range has played a dual role in shaping the genetic landscape of the region by (1) delineating east–west migrations including the Silk Road and (2) restricting human dispersals, especially from the Indian subcontinent into the Tibetan plateau. In this study, 15 hypervariable autosomal STR loci were employed to evaluate the genetic relationships of three populations from Nepal (Kathmandu, Newar and Tamang) and a general collection from Tibet. These Himalayan groups were compared to geographically targeted worldwide populations as well as Tibeto-Burman (TB) speaking groups from Northeast India.

Our results suggest a Northeast Asian origin for the Himalayan populations with subsequent gene flow from South Asia into the Kathmandu valley and the Newar population, corroborating a previous Y-chromosome study.

In contrast, Tamang and Tibet exhibit limited genetic contributions from South Asia, possibly due to the orographic obstacle presented by the Himalayan massif. The TB groups from Northeast India are genetically distinct compared to their counterparts from the Himalayas probably resulting from prolonged isolation and/or founder effects.

Given the indications from religion, legendary history, and culture that the Aka tribe and other tribal populations of the vicinity may have roots in the most ancient strata of Northeast India, it would be particularly interesting to see if they fit the general trend of Northeast India, or are distinct.

A natural next step in the research for an anthropologist would be to determine if Koro speakers are genetically distinct from other members of the Aka tribe, and if they have stronger genetic affinities to other Tibeto-Burman peoples, a discovery that might shed light on the question of which languages Koro is most closely related to and what in the pre-history of the Koro speaking people brought them to their current geographical and tribal niche.

Classifying Koro Within The Tibeto-Burman Languages

Koro has been tentatively classified as belonging to a sub-branch of the Tibeto-Burman languages (no source I could locate made clear which one), a family that includes approximately 350 languages and that either includes, or is the language family most closely related to, the Chinese languages. Burmese has the most speakers (approximately 32 million), assuming the exclusion of Chinese. Approximately 8 million Tibetans and related peoples speak one of several related Tibetan languages.

An NPR report, described in Wikipedia, suggested that "although it has resemblances to Tani further to the east, it appears to be a separate branch of Tibeto-Burman."

The non-Chinese branch of the Tibeto-Burman language family is full of obscure languages spoken in only isolated parts of the Tibetan, Indian and Burmese highlands grouped into five to seven main subfamilies, often with some additional small language isolates or unclassified small language subfamilies.

Most of the Tibeto-Burman languages in the area belong to (1) the Tani branch of the Tibeto-Burman languages, which are spoken by about 600,000 people, mostly in the roughly half of Arunachal Pradesh that lies to the South and East of the Koro speakers, (2) the Mishmi languages, further to the East from the Tani language speakers, or (3) the Bodic languages (i.e. the Tibeto-Burman language branch that includes the Tibetan language), which are spoken to the North and West of the Koro speakers.

The languages of the minority tribes around this part of the Kameng River are:

almost completely undescribed and unclassified languages, which, speculatively considered to be Tibeto-Burman, exhibit many unique structural and lexical properties that probably reflect both a long history in the region and a complex history of language contact with neighbouring populations. Among them are Sherdukpen, Bugun, Aka/Hruso, Miji, Bangru and Puroik/Sulung. The high linguistic significance of all of these languages is belied by the extreme paucity of documentation and description of them, even in view of their highly endangered status. Puroik, in particular, is perhaps one of the most culturally and linguistically unique and significant populations in all of Asia from proto-historical and anthropological-linguistic perspectives[.]

Burling states (at page 180 in the book cited above) that recent vocabularies (from 1988-1991) make clear that Sherdukpen (possibly the same language as "But-pa"), Bugun, and the language of the Sulung people, who call themselves "Puroik," clearly form a group, and that they are remarkably divergent from other Tibeto-Burman languages, although there are enough cognates in the three groups to make their affiliation as Tibeto-Burman languages secure. The Sulung are regarded as the original inhabitants of the state and have apparently been severely oppressed by all of the waves of migrants who have followed them.

The language of the Aka tribe (other than Koro) and its fellow member of the same language family branch are called Hruso [Aka], also known as Hruso B, and Dhammai/Miji, also known as Hruso A, by Burling (at page 180). He states that they are spoken to the North and East of Sherdukpen and Bugun, and that they are very different from each other, although still sufficiently similar to be grouped together as a sub-branch of the Tibeto-Burman languages. He also notes that a language very similar to Dhammai, called Bangru/Levai by linguist Tianshin Jackson Sun and spoken further north along the Tibetan border, is in the group. Burling groups these languages in a Northern geographical area, but as a geographical classification only. (Burling also notes that the area was off limits to researchers from the mid-1970s to the mid-1990s due to violence, some of it political, which explains the timing of the new research in the area.)

Aka/Hruso, Miji, Bangru

A 1997 classification of these languages by David Bradley breaks them out as follows:

I. Western (= Bodic)
    A. Tibetan/Kanauri
    B. Himalayan
II. Sal
    A. Baric (Bodo-Garo–Northern Naga)
    B. Jinghpaw
    C. Luish (incl. Pyu)
    D. Kuki-Chin (incl. Meithei and Karbi)
III. Central (perhaps a residual group, not actually related to each other. Lepcha may also fit here.)
    A. Adi-Galo-Mishing-Nishi
    B. Mishmi (Digarish and Keman)
    C. Rawang
IV. North-Eastern
    A. Qiangic
    B. Naxi–Bai
    C. Tujia
    D. Tangut
V. South-Eastern
    A. Burmese-Lolo (incl. Mru)
    B. Karen

You can be forgiven for not recognizing most of these languages. Only one, Burmese, is associated with an independent nation-state (Myanmar). Tibetan is the other best known language of the group, and Tibet, of course, spent much of the period from the 600s as an independent country, or as a tributary state or highly autonomous province of one of its neighbors, until the Chinese government insisted on taking a more active hand in ruling Tibet in 1959. There are about 200,000 ethnic Tibetans outside China, predominantly in India, Nepal and Bhutan, although the vast majority of Tibetans live in the Tibet Autonomous Region (something of a misdescription of its level of autonomy) in China.

Another proposed family tree for the Sino-Tibetan languages via Wikipedia

The discovery of Koro may tweak the taxonomy debates in these language groups a little as it becomes better understood, but it probably doesn't fundamentally recast the general outlines of the prevailing academic theories about the linguistic histories of the Tibeto-Burman languages, the mass migrations of people that put their speakers where they are today, and who spoke which languages in which places in the past. Those theories are themselves tentative, with few conclusions having consensus support.

Other Details

The research was done in 2008.

The findings on the Koro language are to be published in [volume 71 of] the journal Indian Linguistics, authored by [David] Harrison [of Swarthmore College], [Gregory] Anderson and Ganesh Murmu of Ranchi University in India. . . . Harrison said it was hard to pinpoint how often linguists identify a “new” language. It is an uncommon occurrence, though somewhat less so in the language family that includes Koro, called Tibeto-Burman. Linguists have identified about a dozen previously unknown tongues in that family during the last 30 years, said Scott DeLancey, a University of Oregon professor of linguistics. . . .The people who speak Koro typically speak other languages as well, such as Hindi, another uncommon language called Aka, and even English, Harrison said. . . . Anderson, director of the nonprofit Living Tongues Institute for Endangered Languages, based in Oregon, said that Aka and Koro were about as close as English and Russian.

English and Russian are from two different branches (Germanic and Slavic) of the Indo-European language family, but both fall on the European, as opposed to the Indo-Iranian, side of the divide within that language family.

From here.


Anonymous said...

Do not new languages develop wherever populations are isolated? I heard that research groups isolated in Antarctic bases evolved their own languages. And I have no idea what language some Americans, such as black Cajuns, are speaking. The language of Twitter, for example, represents an overdue breakthrough, for me at least, because our youth are learning to spell words as they sound, not as inherited. Thru, for instance, is so much better than through.

Andrew Oh-Willeke said...

Historically, it takes about 800 years of separation or a major group of second language speakers to cause a new language to develop, and it takes longer for them to get this different.