One of the most widely spoken Indigenous languages in this country is now available through Google’s translation service, the first time the tech giant has included a First Nations, Métis or Inuit language spoken in Canada on its platform.
Inuktut, a broad term encompassing different dialects spoken by Inuit in Canada, Greenland and Alaska, has been added to Google Translate, which translates text, documents and websites from one language into another.
The latest addition is part of a Google initiative to develop a single artificial intelligence language model to support 1,000 of the most spoken languages in the world.
There are roughly 40,000 Inuktut speakers in Canada, data from Statistics Canada suggests.
The number of speakers alone is not enough to determine whether a language can be included in Google Translate, said Isaac Caswell, a senior software engineer with the platform.
There also has to be enough online text data to pull from to create a language model.
Other Indigenous languages in Canada have “had simply too little data to have any usable machine translation model,” said Caswell.
For example, engineers looked at adding Cree, which is spoken by more than 86,000 people in Canada, but there were fewer websites in the language to pull from.
“We don’t want to put anything on the product which just produces broken text or nonsense,” said Caswell.
“Inuktut really stands out in that it has a lot of clean and a lot of well written data, because, I think, the community is increasingly online.”
When adding a language to Google Translate, the tech company looks at two main things: whether there’s a desire or need from the community and how technically feasible it is.
After Google determined its model could recognize Inuktut, it began to consult with language speakers and organizations.
The company reached out to Inuit Tapiriit Kanatami, the national organization that represents about 70,000 Inuit in Canada, to ensure development of the model was true to the Inuktut language, including the ability to translate both of the language’s writing systems.
Inuktut uses qaniujaaqpait, or syllabics, and qaliujaaqpait, based on the Roman alphabet.
Inuit Tapiriit Kanatami has developed its own data set of common characters that can be used to write in any dialect of Inuktut to help ease written communication among the different Inuit regions.
“If we hadn’t had their help, we would have just been able to launch in syllabics, which undermines some of their current work,” said Caswell.
The organization welcomed Google’s work to include Inuktut, citing the need to revitalize, protect and promote Inuit languages.
“This is another way in which to make our language relevant, easily accessible and for those who don’t know it at all, to be able to interact with it,” Natan Obed, president of Inuit Tapiriit Kanatami, said in an interview.
“This is reconciliation in action and I really appreciate those who’ve taken the time to work with us to keep our language strong and to celebrate our language.”
With the introduction of Inuktut, Google aims to be more representative of a group often overlooked by the tech sector.
“I hope, maybe if anything, it will make them feel a little bit more seen by a big tech (company). Because, in general, Indigenous communities have had a lot of experiences being overlooked by technology,” said Caswell.
Users will have the ability to translate written Inuktut to English and vice versa through Google Translate. Other options, including the verbal translation tool, may come at a later time, said Caswell.
The use of AI in promoting Indigenous languages is not without its limitations, said Caswell, but he suspects this will change as more and more languages are unlocked with improved technology.
This report by The Canadian Press was first published Oct. 17, 2024.