Wash Park Prophet: Conclusions From The Linguistic Big Picture

The big divide in the world of linguistics is between the "lumpers" (a.k.a. groupers or disciplinary liberals) like Joseph Greenberg and Murray Gell-Mann and the "splitters" (a.k.a. disciplinary conservatives), who probably hold a majority of the discipline.

My sense of the basis of the divide is that the priority of the groupers is to find "genetic" links between languages (i.e. common origins) in pursuit of a phylogenic tree with points of origin for the dividing points. The splitters, in contrast, are more concerned about clustering. Their operational definition for a language family is based upon mutual similarity.

This explains why Australian and American languages are so controversial for linguists.

For reasons I explain below, I think that Greenberg, who has advocated intensely controversial positions within the linguistic community, like lumping all but a couple of the language families of the Americas into a single Amerind superfamily with a genetic origin in Siberian languages, is right when it comes to the origin of these languages, and that all Australian languages likewise belong in a single macro-family. The American and Australian language family trees are probably far bushier, more intertwined and less branching than their Old World counterparts.

But, conservative linguists are right in observing that the languages of America and Australia are so diverse that acknowledging that they are all part of respective linguistic macro-families provides surprisingly little useful information about those languages. The micro-language families that conservative linguists have constructed in these parts of the world, with a small number of exceptions traceable to early agricultural or proto-agricultural societies, have remarkably little in common and probably have ties to each other that are more remote in time than those of Old World language families, despite their common origins.

One of the big sources for the disciplinary divide is probably the failure of both sides to articulate why it makes sense that there should be little intermediate language family structure in the New World.

The Near Certainty of Genetic Links Between New World Languages

From the perspective of the groupers, there is no persuasive evidence that people in either place ever sat around and created a new language from scratch. Only in a couple of pre-European contact exceptions, in large, organized, agricultural economies that came much later than the Neolithic Revolutions in the Near East, Indus River Valley and China, did any of these people even commit their own languages to writing, a task that takes less leisure and less linguistic insight than creating a new language from scratch. Also, humans who are raised without parental contact from fellow humans in the age range critical to acquiring language (particularly given the extended lactation period in hunter-gatherer societies) die without adoptive care or suffer major cognitive defects from which they never fully recover. Therefore, there is very good reason to assume that all modern oral languages evolved in genetic relationships from prior languages. There may be more creolization leading to new languages than is commonly assumed, but there is very good reason to assume that almost none of the languages of Australia, New Guinea or the Americas were constructed or were influenced by outsiders during their populations' well established periods of near total isolation from the rest of humanity.

The conclusion that all or almost all pre-Columbian American languages have genetic links to a very small number of common proto-languages of ancient migrants (probably no more than four or five and quite plausibly one or two), and that all pre-contact Australian languages similarly have genetic links to a very small number of common proto-languages of ancient migrants (probably not more than two), follows naturally from everything else we know from genetic and archeological evidence. The known small founding population sizes, the need for the founding population to have some level of cohesiveness in the migration period, and the known fairly narrow window for migration rule out any founding population that was significantly more linguistically diverse.

In the case of the Americas there is even an established link between a language family found closest to the Bering Strait and one of the most ancient languages spoken on the other side of the Bering Strait. So, we can establish pretty definitively that some relative of the modern Ket language was one of the important languages of the ancient migrants.

Almost all pre-Columbian American languages must share deep genetic links to some, possibly extinct, language spoken at the time of migration by Paleo-Eskimos in Northeastern Russia. The evidence is overwhelming. They are genetically linked in a superfamily relative to all other human languages. Similarly, all pre-contact Australian and Tasmanian languages must share deep genetic links to a common ancestor language.

At that level, the groupers are surely right, even if we can't find the linguistic evidence.

But, the splitters have a point as well.

Australia and New Guinea and the Americas clearly have more linguistic diversity than the Old World, and if one is to judge which languages do and do not belong in a single family based upon their mutual similarities, each of these places should be broken into many more linguistic families, and those linguistic families often lack obvious intermediate phylogenic connections to a common proto-language. They don't resolve easily into splitting branches of a family tree from a common origin.

The Remarkable Youth Of Old World Languages

The remarkable thing about the 28 language families of the Old World and their fewer than a dozen language isolates (a few tiny new ones might yet be discovered) is how young their common origins seem to be.

Modern humans evolved about 160,000 years ago and had reached the Near East about 100,000 years ago. They reached Australia (and one would think many places between the Near East and Australia) about 60,000 years ago. They reached Europe about 45,000 years ago. By 15,000 years ago they had reached Northeastern Siberia.

Dogs, the first domestic animal, were domesticated roughly 15,000 years ago, shortly before the ancestors of Native Americans migrated to America, but after Australians, Tasmanians and New Guineans had been isolated from the rest of humanity by rising sea levels (a small number of dingoes would arrive by boat in Australia from Southeast Asia much later). Food crops and animals were domesticated around 10,000 years ago (after Native Americans, Australians, Tasmanians and New Guineans were isolated from Eurasia by rising sea levels).

There was at least a 35,000 year period prior to the development of agriculture when modern humans were spread over an expanse of territory as large as Australia or the Americas, leaving plenty of time and isolation in which languages with little genetic relationship could have developed. Yet, almost all of our major modern language families appear to have much younger origins, some from less than 10,000 years ago (after farming developed), and several considerably younger than that, and a large share of them evolved in and around the far smaller geographic regions where food production evolved.

The common point of origin for the North Caucasian languages has been estimated at something on the order of 4000 years ago. It is hardly a wild guess to suppose that all of the Dravidian languages of India probably have common roots in Harappan in the Indus River Valley Civilization, which is no more than 7000 years old. The Sino-Tibetan language family owes its great expanse to expansion from Neolithic China not much more than 9500 years ago. Proto-Indo-European is no older and could be a couple of thousand years younger. We know that the Bantu languages weren't spoken widely outside West Africa until 3500 years ago. Austronesian expansion from the island of Formosa (Taiwan) dates to 6000 years ago and the proto-Austronesian language is probably not more than 2000 years older. These language families have had three to ten cycles of language divergence as great as that from modern English to Old English, at most, to become dissimilar from a common ancestor language.

It is reasonable to assume that Eurasia had just as much linguistic diversity as the Americas and Australia and New Guinea did at first contact, and that the expansion of agricultural societies caused the vast majority of these languages to go extinct without leaving any real noticeable traces in the roughly 6,000 year period between the invention of agriculture and the beginning of written history. The first wave of this process would have been running its course in the early historical era, which recorded the extinctions of languages like Sumerian, Etruscan, Elamite, and Hattian. In at least a couple of cases whole language families of early agricultural societies known to exist in historical times went extinct. This process was very complete, leaving us with just 40 language families (counting language isolates as their own families), many of which are on the verge of going extinct themselves.

While there was probably a large pool of potential languages from which a lucky winner could expand and form a language family, if some of the leading hypotheses about the formative locations and formation times of our major languages are correct, Uralic, Indo-European, all of the families of languages in the Caucasus, Altaic, Dravidian, Afro-Asiatic and Nilo-Saharan may all trace their roots to a geographic area that stretches only modestly beyond the Middle East (probably not farther than the Urals, Ethiopia, North Africa and Western Iran) over a time period not longer than 5000 years.

We can also assume that the habitable space of modern humans in this area and time frame was limited significantly by geographical barriers, and that there was a fair amount of communication, intercourse and word borrowing between these people, who probably did have a common language in the distant past when they left Ethiopia, and who maintained contact with each other to some extent. So, it isn't unreasonable to think that the proto-languages for these language families may not have been all that distant from each other at the time they emerged. There are plausible factors that would have limited language divergence in this space through the time period when these proto-languages emerged from their very distant common origin in an Ethiopian proto-language.

Modern humans in the early Neolithic zone in the West didn't have nearly as much room to expand and ignore their neighbors as the people in Australia, the Americas, pre-agricultural India, pre-agricultural East Asia, and pre-agricultural continental Europe did. In these tighter quarters there may have been far fewer micro-communities of neighboring tribes exchanging brides and engaging in low level war and trade in the early agricultural region that gave rise to these language families than there were in the Americas and other pre-agricultural areas inhabited by modern humans.

This is a weaker hypothesis than an early Neolithic proto-language hypothesis popular with some "lumpers," but would produce an only moderately weaker version of the same expected effects in modern and historically known languages.

The Lack of Evidence In The New World

Another important reason that the splitters are comfortable with the widely accepted language families is that we have access to a number of now extinct proto-languages like Latin and Sanskrit, and to historical evidence of linguistically important migrations in China and Africa, that make the case for a common origin for these language families much easier to make. We aren't limited to working only with the modern versions of these languages. The proto-languages of these families were a lot closer to each other relative to other languages than the modern versions of these languages.

It is reasonable to guess that other language families whose member languages are as similar to each other as those in these families are probably not much older than these families. For languages that were never committed to writing, i.e. all but two or three distinct language lineages in the Americas and all of the languages of Australia and New Guinea, we have no documented proto-languages, only proto-languages inferred from modern language evidence. Even if there were oral traditions that could have substituted for written historical documentation (some historical languages in the Old World are known from religious litanies that could have been reproduced orally rather than in writing in non-literate cultures), the roughly 95% population decline experienced by almost all Native American and Australian societies after first European contact seriously disrupted this oral tradition as well.

If one insists upon having direct linguistic evidence of genetic linkages to show language family connections, one simply isn't going to find it in the New World, even if those linkages exist, because a lack of historical linguistic evidence limits the degree to which the linguistic past can be resolved.

The Likely Lack Of Intermediate Structure Or Preservation Of Information Through Language Content In The New World

The Language Evolution Time Clock

We also have good reason to believe that nothing comparable in scope and intensity to the historical conquests that produced language extinction in the Old World ever took place in Australia, Tasmania or most of the Americas.

In Australia and Tasmania, the time period between the proto-language we know must have existed and the first documentation of Australian and Tasmanian languages extends for about 50,000 years, five times as long as the age of the oldest proto-language of an established language family.

In the Americas, the genetic evidence regarding migration to the New World suggests that 12,000 years ago "every human who migrated across the land bridge came from Eastern Siberia, and that every Native American is directly descended from that same group of Eastern Siberian migrants." Recent evidence of a small Paleo-Eskimo migration to Greenland from Siberia about 4,000 years ago, by people who we know from archeological and genetic evidence ultimately died out without leaving much of a trace behind them (much like the Viking colonists who did the same thing 3,000 years later), does not undermine the basic thrust of this conclusion.

This makes the common proto-language of the Americans at least two thousand years older than the oldest known Old World proto-language.

If each thousand year cycle of random linguistic drift produces a certain percentage change in a language, then the share of the original language that survives shrinks exponentially, albeit slowly, and each additional iteration blots out evidence of its origins much more completely.
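The compounding described above can be sketched numerically. This is only an illustration under stated assumptions: the function names are made up for this example, the 80% per-cycle retention figure is a hypothetical placeholder rather than a measured rate, and treating drift as a fixed percentage per cycle that is independent in each lineage is a simplification.

```python
def retained_fraction(retention_per_cycle: float, cycles: int) -> float:
    """Share of the original language surviving after `cycles` rounds of drift.

    Assumes (hypothetically) that every cycle keeps the same fixed share of
    whatever survived the previous cycle, so survival compounds geometrically.
    """
    if not 0.0 <= retention_per_cycle <= 1.0:
        raise ValueError("retention_per_cycle must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return retention_per_cycle ** cycles


def shared_fraction(retention_per_cycle: float, cycles: int) -> float:
    """Share of the original that two daughter languages BOTH still retain.

    Assumes the two lineages drift independently, so the chance that a given
    feature survives in both is the product of the two survival chances.
    """
    return retained_fraction(retention_per_cycle, cycles) ** 2


if __name__ == "__main__":
    ASSUMED_RETENTION = 0.8  # hypothetical placeholder, not a measured value
    for n in (3, 10, 12, 50):
        kept = retained_fraction(ASSUMED_RETENTION, n)
        both = shared_fraction(ASSUMED_RETENTION, n)
        print(f"{n:>2} cycles: retained {kept:.6f}, shared by two lineages {both:.6f}")
```

Under these assumptions, the gap between three cycles and ten or twelve is far larger than the ratio of the cycle counts suggests, and the material shared between two independently drifting descendants falls off twice as fast again in the exponent.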

The Lack Of Communication As A Factor To Prevent Language Divergence

It is likely that ancient Americans (as illustrated by a lack of technology transfers between different regions) were more isolated from each other than the proto-Indo-Europeans or proto-Sino-Tibetans (the oldest Neolithic civilizations) were from each other.

Peoples in regular communication with each other can be expected to diverge less linguistically (as illustrated by reduced linguistic divergence during the Roman Empire and Chinese empires) than people who are more isolated from each other.

Instead, we have good reason to believe that American hunter-gatherer societies formed small groups that were isolated into micro-communities for long periods of time.

We have every reason to believe this, given the presence of equally advanced neighbors with whom relations may not have been entirely peaceful, as documented examples of hunter-gatherer communities in New Guinea, the Amazon, first European contact Nepal, first European contact Australia and first European contact Africa and India would suggest. And we know from archeological evidence that most of the people in the Americas never formed large communities that collectively interacted with each other and shared common territory as a community, so there would have been a reduced need for a common language for large groups of people.

Implications For Language Family Structure

This may have inhibited overarching language families from ever evolving. A group that settled in any particular part of the Americas with a hunter-gatherer economy may not have had any connections to anyone other than their immediate neighbors for tens of thousands of years after their group's initial arrival and the initial filling out of the continent that coincided with the megafauna extinctions in the Americas and Australia.

On two counts, great community age and small community size, we would expect a much greater degree of linguistic divergence between languages in the Americas and Australia (communication between groups may have been greater in more compact New Guinea) than in the Old World. So, it is hardly surprising that the modern or recently extinct languages in these countries are harder to connect into language families.

The key missing link that allows us to mutually accommodate the groupers and the splitters may be that we also know that the whole of the habitable parts of Australia and the Americas were very quickly settled in linguistic time. It took no more than a couple of thousand years for people to expand to fill these continents from the initial migration, and quite possibly less. Very soon there was no virgin territory to expand into and, barring large scale local die offs of tribes from newly emerging epidemics or food supply failures or war, no group with any distinct advantage over any other group. Thus, very early on, there may have been no large scale migrations going on.

Thus, there is good reason to think that in Australia and the Americas there may be no large scale intermediate language structure, between the ultimate proto-language and the large number of highly local languages and language isolates found by linguists today, that is waiting to be discovered. It may simply not exist.

In almost all of the Americas there may be twelve modern English to Old English cycles of almost completely random linguistic drift, which was independent in each locality, between proto-Amerind and the indigenous American languages of today. These twelve cycles may involve so much random linguistic drift that the information about the proto-language retained in its modern descendants may be all but erased. There may be nothing we can extract from language family grouping and proto-language reconstruction today, no matter how expert our methods, that is more probing than the one cross Bering Strait connection between American Indian languages and Siberian languages that has already been inferred.

Indeed, it may be more fruitful for linguists studying indigenous American languages to start from a presumed proto-language to see what insight it can give them into the modern languages and the independent drift processes by which they came into being, than it is to try to trace the modern languages back to common sources.

Australia Compared

The Australians, from what we know about their pre-history, seem to have overlapped more with each other, been confined mostly to a geographically smaller area, and migrated over larger distances than was typical in the Americas. The interaction may have created large communities in which having a common language would be useful, making linguistic drift for different languages less independent.

But, Australia was a highly fractured society for all of its pre-European contact history (there was Austronesian contact for the last millennium or two at most, but this was very slight), which was much longer than that of the Americas. Again, there isn't much to drive phylogenetic tree style deep structure linking the proto-language of Australia with its successor languages. At the time scales involved there are so many cycles of random turnover that commonalities that do exist may have more to do with loan words that arose randomly and then were exchanged between Australian languages than with a common source in a proto-language.

18 February 2010
