A New Tool for the World Language Process:
Accelerated Co-evolution of a Universal Auxiliary Language
via Corpus Linguistic Analysis of World Englishes.

Jonathan B. Britten
Nakamura University
Japan Chancellor, World Language Process

5-7-1 Befu, Jonan-ku
Fukuoka, 814-0104

Corpus linguistic analysis of World Englishes (WE) may provide a powerful tool for accelerating the goals of the World Language Process (WLP). The WLP is an international non-governmental organization dedicated to the development of a Universal Auxiliary Language (UAL) (Britten 2002).

Over several decades the WLP has made steady progress, but it has not yet adopted the tools of corpus linguistics in pursuit of a UAL. Using the increasingly powerful method of corpus linguistics (an academic tool greatly aided by the development of the Internet) may dramatically accelerate the co-evolution of a true UAL shared equally by all persons.

In this proposed approach, corpus linguistic analysis would work in two ways: first, in a "top-down" manner, corpus linguistic analysts would evaluate existing and emerging WE in an effort to identify patterns and trends that would contribute to a UAL; next, in a "bottom-up" manner, WLP volunteers would introduce changes in the existing UAL educational curriculum, and then exploit corpus linguistic analysis of WE to see what happens. This methodology could provide a kind of living language laboratory to test the success or failure of WLP-directed efforts to influence an emerging UAL.

Ideally, this dual mechanism would create a beneficial and self-reinforcing co-evolutionary feedback loop that would greatly accelerate the emergence of a UAL. To the best of my knowledge, corpus linguistics has never been used in such a fashion, so an attempt to do so could create considerable interest among experts in that field.

Some linguists consider the emergence of a UAL a remote possibility; one linguist has argued that "all such attempts are now considered linguistic esoterica, mere symbols of the desire for universalist thinkers for a code of communication that would cut across cultures" (Kachru, 1992, p. 2). However, this negative view is based on the failure of constructed languages, rather than on an explicit criticism of the concept of a co-evolved UAL based on an existing and spreading natural lingua franca, and such an approach is the basis of the WLP's simplified English core. Indeed, Kachru unabashedly believes that US President John Adams (1735-1826) was a "visionary" for his prediction of English's global spread. Yet even supporters of the WLP's general goals speculate that full global fluency in a UAL might emerge only after seven centuries of gradual development (Alexander 2004).

In contrast to those who see a UAL as a distant possibility at best, I believe that the expanding "global village" gives modern volunteers a unique opportunity to achieve a goal that has proven elusive for at least several centuries. If this speculation is correct -- that technological changes (among other factors) can greatly accelerate the UAL process -- then corpus linguistic analysis of World Englishes could contribute substantial depth and range to the WLP approach, creating a worldwide foundation for a UAL within a century and greatly accelerating the goal of global fluency in a UAL.

The basis for this hypothesis is easy to appreciate. Advances in the science of corpus linguistics during the past few decades are well known to language teachers and researchers, textbook and dictionary writers, and others concerned with language. (Teachers and their students arguably have been among the primary beneficiaries of this pragmatic approach to real-world language use.) At the same time, I would argue, corpus linguistics has only just begun to fulfill its vast potential, which currently benefits far too narrow a segment of the population. I believe that the entire global population could benefit if corpus linguistics aids in the rapid co-evolution of a true UAL.

To explain the reasoning behind this proposal, it is useful to quote Marshall Childs, an educator and linguist who writes a regular column for the Daily Yomiuri newspaper. The quotation below concerns corpus linguistics apropos of "standard" English instruction, but the points are entirely relevant to analysis of WE in pursuit of a UAL.

When we want to talk about patterns of language - about canonical forms - modern technology permits us to escape from our prior dependence on grammatical rules. Rules are oversimplifications anyway. Corpus linguists, those experts who analyze large volumes of language in use, are poised to take the place of grammarians. They gather irrefutable and sometimes surprising data about actual usage.

Corpus linguists always win over grammarians. In a submission for this column a couple of years ago, I wrote, "I shall discuss..." and the editor substituted "I will discuss...," saying The Daily Yomiuri stylebook specifies "I will" in the first-person singular. I argued that Mr. Johnston, my sixth-grade teacher, said "I shall" is correct. The answer was "Look in a dictionary." Sure enough. In the years since Mr. Johnson ruled, acceptable usage has changed. I had to surrender to the superior authority of the corpus linguists who wrote the dictionary.

I know it is a shock for textbook writers to have to change, but textbooks based on pragmatic principles must take their canonical forms not from grammar books but from corpus studies. That change has a positive side. Corpus studies tell us the frequency of usage of various forms. Using frequency information, textbook writers can organize their material in sequences that are natural and of greatest use to learners.

It should be a lesson to us that winning curriculums are those that are organized in the easiest ways, along paths of least resistance. If pragmaticists have a superior principle of organization, the burden is on them to prove it. They must show not only that the new organization is more effective than the old, but also that it is easier to write, teach and administer. If the new curriculum appeals to people who are busy, it will win. If it demands extra time and effort, it will lose, no matter how highly principled it is.

In summary, pragmatics, supported by theory. . . is more likely than previous approaches to fulfill the promises we make to beginning language students, that they will be able to use a new language to communicate with new people.

Substitute the word "UAL" for "textbook(s)" and "curriculum" (boldface and underline added above) in Childs's article, and the potential merits of corpus linguistic analysis of WE become clearer. Naturally, the success of this approach depends on the continued global spread of WE, which WLP volunteers believe is a very likely trend.

The increasing global use of "standard" English as a lingua franca in business, science, diplomacy and education is already well documented. Likewise, the continued spread of WE studies suggests that an emerging UAL can plausibly be founded on an emergent blend of "lingua franca English" (my term) and WE. For this reason, the key to accelerated co-evolution of a UAL is likely to be an ongoing corpus linguistic analysis of evolving World Englishes.

If the above scenario is valid, WLP volunteers could successfully exploit corpus linguistic data to help direct the evolution of a UAL. This concept extends considerably beyond the well-received work of Jennifer Jenkins regarding pronunciation of English as an international language (Jenkins 2000). (I am basing my comments here on the publisher's synopsis and editorial reviews, as I learned of the text only recently.) Jenkins's primary concern is with phonology and the teaching of pronunciation, and reviews of the book indicate that this concept has been well received and is considered innovative. This reception is encouraging, though the effort to co-evolve a UAL based on WE would necessarily go much further than phonological adjustments; we can reasonably expect that an emergent UAL based on English would also involve major lexical, syntactic, and discoursal evolutionary changes.

Such an evolution would touch on a great many topics of pressing interest to students of WE. Comparing the novel WLP co-evolution concept with a recent call for papers by the International Association for World Englishes (IAWE), we can recognize many shared concerns. The following topics are listed in the IAWE 2005 Call for Papers:

    Studies of world Englishes, such as anthropological, critical, 'features-based', and sociolinguistic studies

    The history of world Englishes

    Issues in the linguistic description and analysis of Englishes

    Corpus linguistics and world Englishes

    Contact linguistics - the interface between Englishes and other languages

    English-based pidgins and creoles

    Code-switching, code-mixing and linguistic borrowing

    Discourse analysis, genre analysis, and discourse strategies

    English in media and advertising

    Language planning and politics

    Power, ideology and identity

    Evaluation, testing, and intelligibility

    Second and foreign language acquisition and pedagogy

    Bilingual creativity in English literatures

    World Englishes in the classroom

Space limitations preclude a detailed discussion of the relevance of these topics, but it is fair to say that all of these topics have relevance to this essay, and that the proposal to co-evolve a UAL based on WE provides an interesting organizing principle for considering all of these topics from a perspective that is perhaps new to many WE experts.

Of very special interest regarding this proposal is the "contact linguistics" mentioned (and underlined) above, for this study would provide a plausible means by which elements of languages other than English (LOTE) might be incorporated in the co-evolving UAL more rapidly than we might expect. In other words, WE could serve as a kind of filter by which those LOTE elements most readily acceptable worldwide - as measured by corpus linguistic analysis - could gain a foothold. If this proves to be true, the emerging UAL would be far more inclusive of various world languages and cultures than if WLP volunteers and other corpus linguists were to ignore this opportunity.

Again, the concept is easy to appreciate: if we want to help people worldwide to have a globally acceptable UAL, and we wish to make that UAL as inclusive as is possible, we can proceed by pointing out and perhaps encouraging (rather than imposing, as in constructed languages) certain emerging and plausibly acceptable/survivable changes to the emergent UAL. These suggestions -- essentially the kind of guidance people worldwide regularly seek from their dictionaries and grammar books -- would emerge from careful and cooperative analysis of WE, in order to identify evolving trends.

This approach neither demands nor excludes LOTE, but rather assumes that the most probable and successful LOTE inclusions would naturally derive from the process of LOTE populations developing their own varieties of WE. In other words, WE might serve as a kind of semi-permeable linguistic membrane, allowing entrance to certain LOTE words, concepts, and syntactical structures, but excluding others through a natural process of linguistic filtration. Indeed, "standard" English, very much a hybrid language, already provides countless examples of the process by which this kind of variation occurs.

It might be very difficult to initiate this co-evolutionary proposal if it were not for the fact that at least some of the needed corpus linguistic tools and databases already exist. Of particular importance are the International Corpus of English (ICE) and the University of Birmingham's Global English Monitor Corpus (GEMC). Both are highly relevant to the proposal.

The ICE was established in 1990, with the primary goal of "collecting materials for comparative studies of English worldwide." Currently, twenty research teams worldwide "are preparing electronic corpora of their own national or regional variety of English. Each ICE corpus consists of one million words of spoken and written English produced after 1989. . . . " (ICE 2004, web site). The ICE would clearly be an extremely useful resource in the effort to co-evolve a UAL through corpus-linguistic analysis of global varieties of English.

The GEMC, established in 2001, is described as "an electronic archive of the world's leading English newspapers," and seeks to analyze changes in English through a study of discourse printed in them. The goal is to monitor "changes of attitudes and beliefs," and the Corpus aims to become "a prime information source for everyone with an interest in social affairs, both in social studies and in governance." This relatively new corpus may prove to be a valuable resource for the WLP, particularly apropos of more subtle linguistic changes to globalized English.

There are many other corpora that might be useful. Researchers in Japan may be particularly interested in the ongoing Japanese EFL Learner Corpus (JEFLL) (Tonio (2004), as cited in Morrow, 2004). There are other similar learner corpora projects ongoing, notably at Cambridge University, which maintains a global learner corpus based on Cambridge ESOL exams given around the world.

Ideally, the WLP itself might eventually find expertise and resources to build a meta-corpus, comprised of many other corpora, for use in co-evolving an emergent UAL. But the salient point is that WLP volunteers who want to exploit the power of corpus linguistics need not reinvent the wheel: a great deal of first-rate work has already been completed, albeit absent the WLP's organizing purpose - co-evolution of a UAL. How then would WLP volunteers use corpus linguistic analysis of World Englishes to accelerate their goal?

We can begin with a hypothetical scenario. Let's suppose that corpus linguistic analysis of WE -- perhaps a meta-analysis of many related databases -- makes clear that the "standard" English articles "a", "an", and "the" are a never-ending source of difficulty even for advanced "lingua-franca English" students, all around the world. (My personal observations as a university teacher in Japan confirm the well-known difficulty of teaching English articles; this is verified by Snape (2003).) The details are of less importance here than the observation that articles pose a demonstrable difficulty for EFL/ESL learners, and could therefore plausibly be considered a grammatical feature of English that might well diminish or disappear in a WE-based UAL. We might first see evidence of this in corpus-linguistic analyses of WE.

If corpus linguistic analysis of WE demonstrated that difficulties with articles exist globally, and are reflected in altered or diminished use of such articles, whether accidental or deliberate, then we might expect such analysis to provide at least some additional clues about the manner in which WE taken as a whole are evolving to deal with articles. These clues would be crucial tools for WLP volunteers seeking to co-evolve a UAL.
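To make the idea of measuring "altered or diminished use" of articles concrete, the following is a minimal sketch in Python of how an article rate per 1,000 words might be computed for samples of different varieties. The function name article_rate, the variety labels, and the two toy sentences are hypothetical and invented purely for illustration; they are not drawn from the ICE, the GEMC, or any other real corpus. The sketch also assumes very crude tokenization and does not distinguish the article "a" from other uses of the same letter.

```python
import re

# The three "standard" English articles discussed in the essay.
ARTICLES = {"a", "an", "the"}

def article_rate(text: str) -> float:
    """Return the number of articles per 1,000 word tokens (0.0 for empty text)."""
    # Crude tokenization: runs of letters and apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in ARTICLES)
    return 1000.0 * hits / len(tokens)

# Hypothetical toy samples; a real study would read corpus files instead.
samples = {
    "variety_x": "The student read a book in the library.",
    "variety_y": "Student read book in library yesterday.",
}

rates = {name: article_rate(text) for name, text in samples.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} articles per 1,000 words")
```

Normalizing to a rate per 1,000 words, rather than reporting raw counts, is what would allow corpora of different sizes to be compared with one another.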

Let us imagine that over several years WLP volunteers and others observe a statistically significant reduction in the use of articles in WE. In such a case, WLP members might analyze the benefits and detriments of this change. Though WLP members would not necessarily prescribe or proscribe any particular variation, it is reasonable to suppose that some sort of advisory function would emerge. In cooperation with other experts, WLP volunteers might even establish an online meta-corpus, from which dictionaries of the emerging UAL might emerge.
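What "statistically significant" might mean here can be illustrated with a small Python sketch. It applies a two-proportion z-test to article counts from an earlier and a later corpus sample. The function name two_proportion_z and all of the counts below are hypothetical, chosen only to show the arithmetic. The test also assumes that word tokens are independent of one another, which is not strictly true of running text, so the result should be read as a rough indication; corpus linguists often prefer other measures, such as the log-likelihood statistic.

```python
import math

def two_proportion_z(hits_1: int, n_1: int, hits_2: int, n_2: int):
    """Two-sided two-proportion z-test.

    hits_1 / n_1: article count and total tokens in the earlier sample.
    hits_2 / n_2: article count and total tokens in the later sample.
    Returns (z, p_value); z > 0 means the rate fell in the later sample.
    """
    p_1 = hits_1 / n_1
    p_2 = hits_2 / n_2
    # Pooled proportion under the null hypothesis of no change.
    pooled = (hits_1 + hits_2) / (n_1 + n_2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if std_err == 0:
        raise ValueError("pooled proportion is 0 or 1; test is undefined")
    z = (p_1 - p_2) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 9,000 articles per 100,000 tokens earlier,
# 8,500 articles per 100,000 tokens several years later.
z, p_value = two_proportion_z(9000, 100_000, 8500, 100_000)
print(f"z = {z:.2f}, p = {p_value:.5f}")
```

With samples of this size, even a drop of half a percentage point in the article rate would register as significant, which is one reason a finding of this kind would still need the qualitative weighing of benefits and detriments described above.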

Let us further assume that at this time WLP volunteers are carrying out extensive global language education based on the existing model (a rationalized orthography known as ANJeL Tun (pronounced "Angel Tongue") and a simplified English grammar based on Ogden's BASIC) (Britten 2002). In a coordinated change of curriculum, WLP volunteers could simply eliminate the use of articles in WLP educational materials worldwide. I am calling this a "bottom up" co-evolutionary influence on the UAL.

Next, volunteers would need to analyze the influence of this "bottom up" aspect of the proposal; corpus linguistic analysis would make clear whether this educational change is having any influence on WE, and if so, precisely what that change is. If the bottom-up co-evolutionary "interference" in the "natural" course of evolution were shown to accelerate the pre-existing trend, WLP volunteers would have a clearer idea about the potential for other possible means to influence the emergent UAL.

We can only speculate about the feasibility of introducing "bottom up" co-evolutionary change to a UAL, but there are several reasons to believe the WLP has a strong potential to do so. The first is the aim of the WLP to introduce very rapid, low-cost video-based teaching methods using volunteer "each one teach one" methods, and an innovative method known as Auxiliary Closed Captioned English with Simplified Spelling (ACCESS). If this methodology, developed to be rapid and inexpensive, is proven successful in further global testing, the potential to rapidly educate people around the world would provide a strong foundation for the bottom-up influence. A supporting factor is demographic: many of those who need and would receive ACCESS instruction in ANJeL Tun will be among the world's largest demographic group. As one scholar with the Carnegie Endowment for International Peace recently wrote, "Demography is, in fact, destiny. Half the people in the world today are under 24. Of these, nearly nine out of 10 live in the developing world. A billion of them will need jobs in the next decade - 60 percent of them are in Asia, 15 percent are in Africa. For them, the choices are simple: dignity or desperation, a job or starvation." (Rothkoph, 2004). United Nations data also makes clear that much of the world's growing population is unable to access even basic education. (UN Population Fund 2002)

The WLP puts a special emphasis on reaching such persons, who are often illiterate even in their native languages. Thus, WLP volunteers, based on sheer demographics, have a great potential to introduce "bottom-up" co-evolutionary change into a UAL. Again, though, it is only through proper corpus-linguistic analysis that we can evaluate these changes, see what is likely to survive in the marketplace of global communication and employment needs, and accelerate the process of co-evolution.

Bringing the initial hypothetical scenario to a conclusion: through a top-down/bottom-up co-evolutionary process, we might see either a greatly diminished role for articles in a UAL as it evolves from World Englishes, or their complete disappearance (making the emerging UAL more like Japanese in at least one respect).

If this kind of dual-level co-evolutionary process emerges, it would be natural for corpus linguistic studies of the evolving UAL to form the basis for establishing and accelerating pragmatic guidelines ensuring international consistency, coherence, and intelligibility. In this way, a reasonably standardized UAL might naturally appear more rapidly than most linguists believe possible, a successful living language rather than yet another failed or marginal constructed language. Again, changes would result primarily from corpus linguistic analysis of living WE, aided by interested persons cooperating to co-evolve a UAL, rather than from an imposed construction such as those of Esperanto and the many other lesser-known constructed languages.

An important point about corpus linguistic analysis of WE is that the changes from "standard" English have already been shown to be quite different from the expectations of linguistic scholars (Nelson 2004). The point here again is that we may be able to predict much about and impose very little on a UAL. We may be able to succeed in a bottom-up introduction of trends, as discussed above, but unless we start with a consideration of corpus-linguistically determined reality, and subject any introductions to a similar analysis, we are most likely to fail or at best to succeed only marginally.

The discussion above emphasizes objective corpus linguistic analysis and co-evolutionary development for a simple reason: all attempts to impose linguistic change through highly constructed languages have been unsuccessful or at best marginally successful; even the most successful constructed language, Esperanto, claims no more than two million speakers, putting it on a level with Hebrew and Lithuanian. (Esperanto Net 2004) This is in marked contrast with the hundreds of millions of "inner circle" English speakers and rapidly growing "outer circle" and "expanding circle" populations, which include hundreds of millions more. Although the number of English speakers is a controversial topic (Wallraff), the main point is that no constructed language comes even remotely close to the number of persons having at least some familiarity with English/WE.

Very recent events help us to understand why even the most successful constructed language has met with only limited success; we need only consider the recent, largely failed attempt to reform an established national language, German. The failures are illustrative of the difficulties certain to face those who seek to impose even the best-researched, planned, and supervised linguistic changes. The details are too involved to give here, but the key point is that a recent high-level government-sponsored program, led by experts in education, linguistics, and other fields, has evidently failed, despite six years of careful planning and full support. According to the head of the German Publishing Institute, "We have a total mess. It's anarchy, an untenable situation." (AP, 2004) According to the AP report, major German newspapers are abandoning the effort and returning to the old way of writing. Critics of the effort had insisted that the project would fail, and that change should be left to evolution. This evident failure should be a strong warning to all persons involved in the goal of achieving a UAL through excessively planned constructs. At the same time, the German experience might also provide a way to test the ideas in this essay: perhaps German scholars who hope to salvage something of their extensive and expensive reform program can use corpus linguistics as a means to do so.

The key point is that WLP volunteers seeking an accelerated co-evolution of a UAL may be most successful if they seek mainly to direct and channel the flow of linguistic change. History also suggests that the WLP volunteers made a wise decision not to impose an artificial order beyond the well-devised orthographic and grammatical changes to English that are currently manifest in ANJeL Tun.

Humility and objectivity will certainly be crucial to success, and success is very much to be desired: the world is changing very rapidly, and the utility of a UAL is increasingly clear to many observers. The cost of failure in our race between education and calamity is too awful to contemplate. Humility we must provide for ourselves, but corpus linguistic methods can provide the objectivity, and they may well contribute greatly to the long-sought goal of a successful Universal Auxiliary Language.


