Sarvam AI, an emerging leader in India’s generative AI industry, has launched a new language model named Sarvam-1. Specifically designed for Indian languages, this open-source model supports ten Indian languages, including Bengali, Hindi, and Tamil, along with English. Introduced in October 2024, Sarvam-1 succeeds the company’s previous model, Sarvam 2B, which was released in August 2024.
Overview of Sarvam-1
Sarvam-1 has 2 billion parameters, the internal values that determine an AI model’s complexity and capability. For reference, Microsoft’s Phi-3 Mini has 3.8 billion parameters. Sarvam-1 is classified as a small language model (SLM), having fewer than ten billion parameters, unlike large language models (LLMs) such as OpenAI’s GPT-4, which is reported to have over a trillion parameters.
Technical Specifications
Powered by 1,024 Graphics Processing Units (GPUs) from Yotta and trained using NVIDIA’s NeMo framework, Sarvam-1 addresses the challenge of limited high-quality training data for Indian languages. To resolve this, Sarvam AI developed its own training corpus, Sarvam-2T.
Training Data
Sarvam-2T comprises approximately 2 trillion tokens spread across the ten supported languages, with Hindi accounting for around 20% of the data and significant portions in English and programming languages. Synthetic data generation techniques were used to improve the dataset’s quality. This diversity enables the model to perform both monolingual and multilingual tasks effectively.
Performance Metrics
Sarvam-1 is reported to handle Indic language scripts more efficiently than earlier LLMs, using fewer tokens per word. The model has outperformed larger AI models such as Meta’s Llama 3 and Google’s Gemma 2 on various benchmarks, including MMLU and ARC-Challenge.
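The “fewer tokens per word” claim refers to what is often called tokenizer fertility: the average number of tokens a tokenizer produces per word. Below is a minimal sketch of how that metric is computed; the tokenizers here are toy stand-ins for illustration, not Sarvam-1’s actual tokenizer.

```python
def fertility(texts, tokenize):
    """Average number of tokens produced per whitespace-separated word."""
    total_tokens = sum(len(tokenize(t)) for t in texts)
    total_words = sum(len(t.split()) for t in texts)
    return total_tokens / total_words

# Toy comparison: a character-level tokenizer (high fertility, like a
# vocabulary lacking Indic subwords) vs. a word-level one (fertility 1.0).
char_level = lambda text: [c for c in text if not c.isspace()]
word_level = lambda text: text.split()

sample = ["namaste duniya", "sarvam model"]
print(fertility(sample, char_level))  # 6.0 tokens per word
print(fertility(sample, word_level))  # 1.0 token per word
```

Lower fertility on Indic scripts means fewer tokens to process per sentence, which translates directly into faster and cheaper inference.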
Benchmark Achievements
In the TriviaQA benchmark, Sarvam-1 achieved an accuracy score of 86.11 on Indic languages, surpassing Meta’s Llama-3.1-8B, which scored 61.47. Sarvam-1 is also computationally efficient, with inference speeds 4-6 times faster than larger models such as Gemma-2-9B and Llama-3.1-8B.
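Speed comparisons like the 4-6x figure above are typically quantified as generation throughput, i.e. tokens produced per second. A minimal sketch of such a measurement, using an artificial stand-in for a model’s generate call rather than any real model:

```python
import time

def tokens_per_second(generate, prompt, n_tokens):
    """Time one generation call and return tokens produced per second."""
    start = time.perf_counter()
    output_tokens = generate(prompt, n_tokens)
    elapsed = time.perf_counter() - start
    return len(output_tokens) / elapsed

# Stand-in "model": emits one dummy token per step with artificial latency.
def slow_generate(prompt, n_tokens, delay):
    tokens = []
    for _ in range(n_tokens):
        time.sleep(delay)  # simulate per-token compute
        tokens.append("tok")
    return tokens

fast = tokens_per_second(lambda p, n: slow_generate(p, n, 0.001), "hello", 50)
slow = tokens_per_second(lambda p, n: slow_generate(p, n, 0.005), "hello", 50)
print(f"speedup: {fast / slow:.1f}x")  # roughly 5x for these artificial delays
```

In practice, published comparisons fix the hardware, batch size, and output length so that the ratio reflects the models themselves rather than the harness.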
Practical Applications
The strong performance and high inference efficiency of Sarvam-1 make it suitable for practical applications, including deployment on edge devices, which is particularly important in real-world scenarios where computational resources are limited.
Accessibility
Sarvam-1 is available for download on Hugging Face, an online platform for open-source AI models, allowing developers and researchers to utilize the model for various applications involving Indian languages.
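For developers, obtaining the model typically means loading it with the Hugging Face `transformers` library. A minimal sketch follows; the repository id `sarvamai/sarvam-1` is an assumption to be confirmed on the model card, and the first call requires network access to download the weights.

```python
# Sketch: loading Sarvam-1 from Hugging Face for text generation.
# Assumes the `transformers` and `torch` packages are installed; the
# repository id below is an assumption -- confirm it on the model card.

def load_sarvam(model_id="sarvamai/sarvam-1"):
    """Download (or load from local cache) the tokenizer and model weights."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model

# Usage (requires network access on the first run):
# tokenizer, model = load_sarvam()
# inputs = tokenizer("भारत की राजधानी", return_tensors="pt")
# outputs = model.generate(**inputs, max_new_tokens=30)
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is a base (non-instruction-tuned) SLM, it is best suited to completion-style prompting or as a starting point for fine-tuning on downstream Indic-language tasks.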