Govt Unveils State-Backed Multimodal LLM For Indian Languages

SUMMARY

Minister Jitendra Singh described BharatGen as a “national mission to create AI that is ethical, inclusive, multilingual, and deeply rooted in Indian values and ethos”

The platform integrates inputs such as text, speech, and image and offers AI solutions in 22 Indian languages

The open-source BharatGen platform is multilingual and multimodal and has been trained on indigenously built datasets

Union minister Jitendra Singh has launched BharatGen, a state-backed multimodal large language model (LLM) for Indian languages.

Unlike traditional LLMs that predominantly take text as input, multimodal AI models can process various types of data, including text, images, audio and video. 

Speaking at the BharatGen Summit, Singh described BharatGen as a “national mission to create AI that is ethical, inclusive, multilingual, and deeply rooted in Indian values and ethos.” The platform integrates inputs such as text, speech, and image and offers AI solutions in 22 Indian languages.

“This initiative will empower critical sectors such as healthcare, education, agriculture and governance, delivering region-specific AI solutions that understand and serve every Indian,” said Singh.

The AI model was developed under the aegis of the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS) and implemented via IIT Bombay’s technology innovation hub (TIH) foundation for internet of things (IoT) and internet of everything (IoE).

The project is also backed by the department of science and technology (DST) as well as a consortium of academic institutions, experts and innovators.

What Is BharatGen?

In October 2024, the Union government announced the BharatGen project, touted as the world’s first government-funded multimodal LLM project. At the time, the Centre said that the platform would focus on creating “efficient and inclusive AIs” in Indian languages and would be able to generate high-quality text and “multimodal content” in various Indian languages once completed.

The BharatGen platform has four key “distinguishing features”: 

  • Multilingual and multimodal nature of foundation models
  • Indigenously built datasets, which will be leveraged to train the LLMs
  • Open-source architecture
  • Development of an ecosystem of GenAI research in India

To “deeply capture” the nuances of Indian languages and address the paucity of datasets in Indic languages, the government had said at the time that BharatGen would develop processes for collecting and curating India-centric data that represents the country’s diverse languages, dialects, and cultural contexts.

As per the government, the project will also focus on data-efficient learning, particularly for Indian languages with limited digital presence. In addition, the platform’s open architecture will enable smaller businesses to build products on top of its tech stack and linguistic datasets.

The project is part of the Centre’s larger push for the AI ecosystem. Earlier this year, the government selected SarvamAI to build the country’s first indigenous foundational AI model. Last week, three more AI startups, including Soket AI Labs and Gan.ai, were also selected to build homegrown LLMs.

The projects are part of the government’s broader INR 10,372 Cr IndiaAI Mission, which was approved by the Union cabinet last year. The Mission aims to facilitate funding for emerging AI startups and spur innovation in the sector.

At the heart of all this is the growing Indian AI landscape, which, as per Inc42 data, is projected to be a $17 Bn market opportunity by 2030.
