Language models can be for Southeast Asia too, lah, as AI Singapore launches new advanced version

Google DeepMind and AI Singapore have unveiled SEA-LION v3, described as a groundbreaking language model designed specifically to serve Southeast Asia's vast linguistic landscape.

The announcement was made via LinkedIn on Saturday by AI Singapore Senior Director for AI Products Leslie Teo.

Built on Google’s Gemma 2 architecture and trained on 200 billion tokens of regional data, the model represents a significant advancement in making AI more accessible and culturally relevant for Southeast Asian users.

The new model marks a considerable leap forward in multilingual AI capabilities, supporting 11 Southeast Asian languages along with English, and notably adding support for Javanese and Sundanese, which are among the most widely spoken languages in the world.

According to early benchmarks, SEA-LION v3 has achieved unprecedented performance levels, surpassing both open-source alternatives and several larger language models in regional language tasks.

“The launch of the new Gemma-based SEA-LION v3 with AI Singapore represents a major step forward for inclusive AI,” said Manish Gupta, Google DeepMind’s Research Director. 

Gupta emphasized the model’s improved understanding of the region’s diverse languages and cultures, attributing its success to the Project SEALD collaboration with AI Singapore and other regional partners.

The model has been optimized for both instruction-following and multi-turn dialogue, making it particularly suitable for practical applications across the region. To ensure widespread accessibility, the team has made SEA-LION v3 available through multiple platforms, including Hugging Face, Kaggle, and Ollama, with both base and instruction-tuned versions ready for deployment.
