Languages are like species. They evolve in mostly predictable ways, splitting into new species or dying out over time. Now, a group of linguists and computer scientists in the US and Canada have created a piece of software that can analyze enormous groups of languages to reconstruct what the earliest human languages might have sounded like.
It sounds like a subplot from Neal Stephenson's novel Snow Crash, but it's quite real. By using this program and others like it, linguists may one day know how people sounded when they talked 20,000 years ago, long before there was writing.
University of British Columbia statistician Alexandre Bouchard-Côté began working on the program when he was a graduate student at UC Berkeley. He used common algorithms to compare sounds and cognates, words in different languages that descend from a shared ancestral word, across hundreds of different modern languages.
By doing this, he could predict which language groups were most related to each other, and which kinds of sounds would be preserved most often. A sound that remained the same across distantly related languages was probably a sound that existed early in our linguistic evolutionary tree.
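To make that intuition concrete, here is a minimal Python sketch. It is not the researchers' method, which the article does not spell out. It is a toy majority vote over cognates that are assumed to be already aligned sound by sound. The word forms, the language labels and the function names (majority_reconstruction, preservation_rate) are all invented for illustration.

```python
from collections import Counter

# Hypothetical, pre-aligned cognate forms, one character per sound.
# These are invented for illustration and are not real language data.
COGNATES = {
    "lang_a": "kata",
    "lang_b": "kada",
    "lang_c": "hata",
    "lang_d": "kata",
}


def majority_reconstruction(forms):
    """Guess the ancestral form by taking the most common sound at each position."""
    columns = zip(*forms.values())
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


def preservation_rate(forms, proto):
    """For each sound in the guessed ancestor, the share of languages that kept it."""
    n = len(forms)
    return [
        sum(form[i] == sound for form in forms.values()) / n
        for i, sound in enumerate(proto)
    ]


proto = majority_reconstruction(COGNATES)
rates = preservation_rate(COGNATES, proto)

print(proto)
print(rates)
```

In this toy data, the sounds that every language kept are the ones the vote is most sure about. A real system would also have to work out the alignment and the family tree, which this sketch simply assumes.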
By putting these sounds together, Bouchard-Côté's program was able to reconstruct the sounds and words that were most likely to have been used in languages from pre-history. Linguists hypothesize that the languages that led to today's include Proto-Indo-European, Proto-Afroasiatic and Proto-Austronesian. Bouchard-Côté and his colleagues focused on Proto-Austronesian, which led to today's Polynesian languages, as well as languages in Southeast Asia and parts of continental Asia. Working from a database of more than 600 modern Austronesian languages, they were able to reconstruct words from the ancient languages those descended from.
In their paper, published this week in Proceedings of the National Academy of Sciences, the researchers write:
"Functional load" is a mid-twentieth-century theory that suggests some sounds are more important than others in a language because they're used to distinguish between words that would otherwise sound the same. For example, in the words "dog" and "tog," there's one important feature used to distinguish between them: the voicing of the "d". Your tongue is in the same place to make both sounds. The only difference is that "d" requires you to use your voice, and "t" is just expelling air. An important distinction like that voicing is probably going to be preserved over time, because it's used in a lot of places to distinguish between words.
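One rough way to picture functional load is to count "minimal pairs," words that differ in exactly one sound. The Python sketch below does that for a tiny word list. It rests on several assumptions: spelling stands in for pronunciation, the word list is hand-picked, and a raw pair count is only a crude proxy for how linguists actually measure functional load. The function name minimal_pairs is hypothetical.

```python
from itertools import combinations

# Tiny hand-picked word list; letters stand in for sounds (an assumption
# that only works here because these spellings are nearly phonetic).
LEXICON = ["dog", "tog", "den", "ten", "dip", "tip", "bat", "pat", "dot"]


def minimal_pairs(words, sound_a, sound_b):
    """Return word pairs that differ in exactly one position,
    where that one difference is sound_a versus sound_b."""
    pairs = []
    for w1, w2 in combinations(words, 2):
        if len(w1) != len(w2):
            continue
        diffs = [(a, b) for a, b in zip(w1, w2) if a != b]
        if len(diffs) == 1 and set(diffs[0]) == {sound_a, sound_b}:
            pairs.append((w1, w2))
    return pairs


dt_pairs = minimal_pairs(LEXICON, "d", "t")
bp_pairs = minimal_pairs(LEXICON, "b", "p")

print(len(dt_pairs), dt_pairs)
print(len(bp_pairs), bp_pairs)
```

In this toy list the d/t contrast separates more word pairs than b/p does, so by this crude count it carries more of the load. That is only true of this sample and says nothing about English as a whole.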
Ultimately, this program could allow linguists to hear languages that haven't been spoken in millennia, reconstructing a lost era when those languages spread across the world, evolving as they went.
Over time, this program could be used for linguistic futurism, too. In a release, UC Berkeley cognitive scientist Tom Griffiths said:
Perfect for time travelers. Read the full research paper in PNAS.
This article originally appeared on io9. All rights reserved.