CAMBRIDGE (IFLScience)—A multidisciplinary research team from the Massachusetts Institute of Technology has examined the languages of the world and categorized them on how widely certain forms of media are translated into other languages, IFLScience reports.
The researchers began to form their Global Language Network by identifying sources of media that had been translated into multiple languages. These included books, Wikipedia, and Twitter. The book data set included 2.2 million volumes representing more than 1,000 languages. Two languages were connected in the data map when a book had been translated from one into the other. Wikipedia articles that had been edited by humans, not bots, were analyzed to see whether editors were writing in multiple languages. The Twitter data consisted of tweets sent by 17 million users, spanning 73 languages. If a Twitter user sent out tweets in multiple languages, those languages were connected.
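As a rough illustration of the Twitter linking rule described above, the following Python sketch counts how many users wrote in each pair of languages. The user names, language codes, and the function name build_language_links are hypothetical, and the researchers' actual pipeline is not described in this report.

```python
from collections import Counter
from itertools import combinations


def build_language_links(user_languages):
    """Count, for each pair of languages, how many users wrote in both."""
    links = Counter()
    for languages in user_languages.values():
        # Deduplicate and sort so each pair has one canonical ordering,
        # e.g. ("en", "es") rather than ("es", "en").
        for pair in combinations(sorted(set(languages)), 2):
            links[pair] += 1
    return links


# Hypothetical toy data: the languages each user tweeted in.
users = {
    "user_a": ["en", "es"],
    "user_b": ["en", "ru", "en"],
    "user_c": ["hy"],
    "user_d": ["en", "es", "ru"],
}

links = build_language_links(users)
for pair, count in sorted(links.items()):
    print(pair, count)
```

A user who writes in only one language, like user_c here, contributes no links at all.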
Armenian ranked 48th in influence for book translations, 59th for Wikipedia, and 57th for Twitter.
English was the largest hub for information to be translated from one language into another in all three data sets. Russian, German, and Spanish also served as hubs to other languages, but to a lesser extent than English.
“Of the many languages that have ever been spoken, only a few of them have been able to achieve global prominence; they have been important enough to become a global language,” Cesar Hidalgo, who led the research, said.
In the visualization, each language is represented by a circle sized according to its number of native speakers, the GDP per capita of its speakers, or its eigenvector centrality, a measure of influence within networks. The circles are color-coded by language family (English is an Indo-European language, for instance, while Arabic is Afro-Asiatic).
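Eigenvector centrality gives a language a high score when it is linked to other high-scoring languages, not merely to many languages. A minimal Python sketch, assuming a small undirected network with made-up link weights, might compute it by repeated multiplication (power iteration). The function name, the toy network, and the choice to add each node's own score at every step (a common trick that keeps the iteration from oscillating) are illustrative assumptions, not the researchers' method.

```python
def eigenvector_centrality(adjacency, iterations=200, tol=1e-9):
    """Approximate eigenvector centrality by power iteration.

    adjacency maps each node to a dict of {neighbor: weight}.
    Every node must appear as a key, and links are assumed symmetric.
    """
    nodes = sorted(adjacency)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbors' weighted
        # scores; this is power iteration on (A + I), which has the same
        # leading eigenvector as A.
        new = {
            n: score[n] + sum(w * score[m] for m, w in adjacency[n].items())
            for n in nodes
        }
        # Rescale so the most central node has score 1.0.
        norm = max(new.values())
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# Hypothetical toy network: English is linked to every other language.
network = {
    "en": {"es": 1.0, "ru": 1.0, "hy": 1.0},
    "es": {"en": 1.0, "ru": 1.0},
    "ru": {"en": 1.0, "es": 1.0},
    "hy": {"en": 1.0},
}

scores = eigenvector_centrality(network)
for language, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(language, round(value, 3))
```

In this toy network the hub language scores highest, and a language linked only to the hub scores lowest even though it touches the most influential node.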