When Google announced the $10 Bn Google For India Digitization fund in July this year, one of its focus areas was investing in the Indian language internet ecosystem to improve penetration of digital services and products beyond the metros and Tier 1/2 cities of India. And in line with that, the company has now unveiled a new AI-powered language model called Multilingual Representations for Indian Languages (MuRIL) to improve interoperability of web services in 16 Indian languages.
MuRIL supports 16 Indian languages (Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Tamil, Telugu, and Urdu) as well as English.
Besides improving the translation of internet content across many languages, MuRIL would allow search users to easily switch results from English to Tamil, Telugu, Bangla, and Marathi, in addition to Hindi. Google would also show content in five Indian languages (Hindi, Bangla, Marathi, Tamil, and Telugu) even when the query is in English, while Google Maps would let users choose any one of nine Indian languages regardless of the device's system language.
Leveraging the language model, Google’s image recognition and intelligence app Google Lens would help users get solutions to math problems in Hindi and English. Google Lens will show step-by-step guides and videos to help explain the problem.
Will Google’s Focus Drive Indian Regional Language Adoption?
MuRIL is said to help computer systems handle the complexities of Indian languages, including transliteration and understanding the context and sentiment behind a statement: whether it is negative or positive, whether it refers to a person or a place, and so on, across multiple languages.
Developers can use the language model in third-party applications, as it has been open sourced through TensorFlow Hub.
While Google is making efforts to improve the penetration of regional language internet content, it has also made some recent policy changes that could hamper the growth of such content. In October, it added a new clause to its publisher policy to stop monetisation of new content, web pages and applications in regional languages that are not among the company's supported languages.
Google’s support services cover Indian languages like Hindi, Bengali, Malayalam, Marathi, Gujarati, Telugu, Urdu and Kannada along with English. The publisher policy will put content creators and entrepreneurs who cater to audiences in other local languages at a disadvantage.
Google said that from March 2021, it would not allow new sites, web pages and apps to be monetised using AdSense, Ad Manager and AdMob if they are not primarily in languages for which the company offers customer support.