How AI is powering the preservation of African languages

“Google Translate breaks down language barriers to help people connect and better understand the world around them.” Photo | net

What you need to know:

  • Young African scholars are also rising to the challenge with similar initiatives, leveraging the power of AI. Nearly a quarter of all the recently added languages on the platform are African, and Africa now has more than 50 languages on the translation service.

More African languages are finding their way onto Google’s online translation service, as the search giant uses Artificial Intelligence to learn closely related languages.
This year, the company has made its largest-ever venture into the translation of African languages, and has added more new languages to the service than ever before.
“We are using AI to expand the variety of languages we support. Thanks to our PaLM 2 large language model, we’re rolling out 110 new languages to Google Translate, our largest expansion ever,” said Google Translate Senior Software Engineer, Isaac Caswell.
This development marks a pivotal moment: it not only promises to popularise indigenous languages but also supports the development of a comprehensive local linguistic resource.
Nearly a quarter of all the recently added languages on the platform are African, and Africa now has more than 50 languages on the translation service.
The new additions include Dholuo, spoken by Kenya’s fourth-largest ethnic group, the Luo, with more than 4.2 million speakers across several Nilotic ethnic groups found in Egypt, Sudan, South Sudan, Ethiopia, Northern Uganda, eastern DRC, and a part of Tanzania.
Another is Afar, a tonal language spoken by 2.3 million people in Djibouti, Eritrea, and Ethiopia. Google noted that of all the languages in this launch, Afar had the most volunteer community contributions.
Another addition is N’Ko, a standardized form of the West African Manding languages which unifies many dialects into a common language. Its unique alphabet was invented in 1949, and it has an active research community that develops resources and technology for it today.
Tamazight (Amazigh), a Berber language spoken across North Africa, is another important new addition. Although there are many dialects, the written form is generally mutually understandable. It is written in both the Latin and Tifinagh scripts, and Google Translate supports both.
“Google Translate breaks down language barriers to help people connect and better understand the world around them. We are always applying the latest technologies so more people can access this tool,” Caswell explained.
Other African languages added this year include Fon, Kikongo, Ga, Swati, Venda and Wolof.
In 2022, Google added 24 new languages across the world using Zero-Shot Machine Translation, where a machine learning model learns to translate into another language without ever seeing an example.
Google said languages vary immensely, across regional varieties, dialects and spelling standards, making it almost impossible to pick a “right” variety. Its approach prioritized the most commonly used variety of each language.
“PaLM 2 was a key piece to the puzzle, helping Translate more efficiently learn languages that are closely related to each other. As technology advances, and as we continue to partner with expert linguists and native speakers, we’ll support even more language varieties and spelling conventions over time,” explained Caswell.
According to Google, these new languages represent more than 614 million speakers, opening up translations for around 8 per cent of the world’s population. Some of these languages are major world languages with more than 100 million speakers, while others are spoken by small Indigenous communities.
A few of the languages have almost no native speakers but are undergoing active revitalization efforts.
Swahili is the most widely spoken African language, with the United Nations placing the number of speakers at more than 200 million. In 2021, the UN designated July 7 as World Kiswahili Language Day.
This year’s event is hosted by Kenya under the theme “Kiswahili, Multilingual Education and the Enhancement of Peace.”
The organisers of the event, the East African Community and the Kenyan government, said the annual event offers a platform for Kiswahili stakeholders to share knowledge, research-based evidence, best practices, experiences, and worldviews on the role of Kiswahili education in promoting a culture of peace.
The East African Community Deputy Secretary General (DSG) in charge of Infrastructure, Productive, Social and Political Sectors, Andrea Aguer Ariik, emphasised the significance of language diversity and unity in the EAC.
“Kiswahili, as a widely spoken language in East Africa, not only bridges communication gaps but also represents a common identity among the member states of the EAC,” said Ariik in a statement.
Google is not alone in this field. Young African scholars studying abroad are also rising to the challenge with similar initiatives that leverage the power of AI.
Ife Adebara, a programmer and scholar at the University of British Columbia’s linguistics department, is among those leading initiatives to deploy AI in preserving local languages, with a focus on African languages.
Her project, Afrocentric Natural Language Processing, aims to raise awareness and develop tools and programs that are accessible to speakers of African languages such as Swahili and Zulu.