Ancient Languages Reconstructable Thanks To New Computer Algorithm

Researchers at the University of British Columbia and the University of California, Berkeley have created a computer system that reconstructs, to a degree, protolanguages: the ancient root languages from which modern languages descend.


Compared against the manual reconstructions done by linguists, the system’s output is roughly 85 percent accurate.

“We’re hopeful our tool will revolutionize historical linguistics much the same way that statistical analysis and computer power revolutionized the study of evolutionary biology,” says UBC Assistant Prof. of Statistics Alexandre Bouchard-Côté, lead author of the study.

“And while our system won’t replace the nuanced work of skilled linguists, it could prove valuable by enabling them to increase the number of modern languages they use as the basis for their reconstructions.”

Human linguists work on recreating protolanguages “by grouping words with common meanings from related modern languages, analyzing common features, and then applying sound-change rules and other criteria to derive the common parent.”

In comparison, the new system “analyzes sound changes at the level of basic phonetic units, and can operate at much greater scale than previous computerized tools.”
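The idea of checking a candidate parent form against sound-change rules, as in the manual approach described above, can be illustrated with a toy sketch. This is not the researchers' algorithm: the word forms, the rule sets `RULES_A` and `RULES_B`, and the helper functions are all hypothetical, invented purely for illustration.

```python
# Toy illustration only: the rules and word forms below are hypothetical,
# not data or rules from the study.

def apply_sound_changes(proto_form, rules):
    """Apply ordered (old, new) sound-change rules to a proto-form."""
    form = proto_form
    for old, new in rules:
        form = form.replace(old, new)
    return form

# Hypothetical ordered rules for two invented daughter languages.
RULES_A = [("p", "f"), ("t", "s")]
RULES_B = [("k", "h")]

def explains_cognates(proto_form, observed):
    """Check whether a candidate proto-form derives every observed word."""
    return all(
        apply_sound_changes(proto_form, rules) == word
        for rules, word in observed
    )

# Each entry pairs a daughter language's rules with its attested word.
observed = [(RULES_A, "fasak"), (RULES_B, "patah")]
candidates = ["patak", "fasak", "patah"]

# Keep only the candidates that account for every attested word.
best = [c for c in candidates if explains_cognates(c, observed)]
print(best)
```

Only the candidate that derives both invented daughter words survives the filter. A real system would have to infer the rules and the parent forms jointly, and at far larger scale, rather than test a hand-picked list.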

“The researchers reconstructed a set of protolanguages from a database of more than 142,000 word forms from 637 Austronesian languages, spoken in Southeast Asia, the Pacific and parts of continental Asia.”

“Most protolanguages do not leave written records, but in some instances reconstructions can be partially verified against ancient texts or literary histories. A notable exception is well-documented Latin, the protolanguage of the Romance languages, which include modern French, Italian, Portuguese, Romanian, Catalan and Spanish.”


The new tool could prove very useful in advancing the scientific understanding of how languages develop and evolve. The method is limited, of course, by its need for modern languages that have descended from the protolanguage. And the reconstructed ‘languages’ will always be incomplete: languages change considerably over time, shedding much of their earlier vocabulary and character as they adapt to the cultures that use them.

The new research will be published in the Proceedings of the National Academy of Sciences.

Source: University of British Columbia

Image Credits: Table courtesy of University of British Columbia; Cypriot via Wikimedia Commons
