Aya 101 LLM (Cohere For AI) — AI Glossary

Aya, developed by Cohere For AI, is a state-of-the-art multilingual large language model (LLM) that emerged from a global open science initiative to advance machine learning across diverse languages. The project's core objective is to reduce the linguistic bias in current natural language processing (NLP) technologies, where a disproportionate focus on English has left many of the world's languages underrepresented. The ultimate goal is to build a series of state-of-the-art multilingual generative language models that draw on contributions from people around the globe.

Aya 101 is the first model released in the series and is available under the Apache 2.0 license. It is a 13-billion-parameter model. Accompanying Aya is a large multilingual instruction dataset with 513 million data points across 114 languages, aimed at addressing the needs of underserved languages. With contributions from over 3,000 researchers in 119 countries, Aya seeks to democratize AI technology globally, particularly for languages that have been largely ignored by existing models.

Key Features

Here are some key features of the Aya model:

Multilingual: Supports 101 languages, significantly exceeding most open-source LLMs. This can help with research in languages often neglected by commercial models.
Open-Source Model and Dataset: Both the model and its training data are freely available for researchers to analyze, improve, and build upon.
Collaborative: Over 3,000 independent researchers from 119 countries contributed to Aya.
Technical Breakthroughs: Achieves benchmark performance that is competitive with closed-source models.

Technical Specifications

The technical specifications of the model are as follows:

Model Architecture: Based on the Transformer architecture, a standard for many LLMs, but with adjustments optimized for multilingual tasks.
Training Data: Massive dataset of text and code in 101 languages, curated through the collaborative effort.
Training Process: Employs techniques like fine-tuning and multi-task learning to adapt the model for various multilingual tasks.

The Aya model supports a broad range of languages, classified into higher-, mid-, and lower-resourced categories. This includes widely spoken languages like English, Chinese, and Spanish, and less commonly represented languages such as Afrikaans, Amharic, and Azerbaijani. The model also covers languages with various scripts and families, highlighting its comprehensive multilingual capabilities.
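
As a rough illustration of how a model like this is typically used, the sketch below loads it with the Hugging Face `transformers` library and generates a reply to a prompt. Several things here are assumptions rather than details from this page: that the checkpoint is published on the Hugging Face Hub as `CohereForAI/aya-101`, that it loads through the sequence-to-sequence classes, and that `transformers` and PyTorch are installed. The `generate` helper is a hypothetical name used only for this example.

```python
# Assumption: the weights are published on the Hugging Face Hub under this ID.
CHECKPOINT = "CohereForAI/aya-101"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Return the model's completion for a prompt in any supported language."""
    # Imported lazily because the dependencies are heavy; calling this
    # function downloads and loads a 13-billion-parameter model.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    # Assumption: the checkpoint is an encoder-decoder (seq2seq) model.
    model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)

    # Tokenize the prompt into PyTorch tensors and generate new tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) generated sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (Turkish for "Write a short poem about the sea."):
# print(generate("Deniz hakkında kısa bir şiir yaz."))
```

The example call is left commented out because running it needs a large download and substantial memory.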

Multilingual Support in Aya

Aya supports 101 languages, with varying levels of 'resourcedness'. Some of the higher-resourced languages are:

Arabic
Catalan
Czech
German
English
Basque
Finnish
French
Hindi
Hungarian
Italian
Japanese
Korean
Dutch
Persian
Polish
Portuguese
Russian
Spanish
Serbian
Swedish
Turkish
Vietnamese
Chinese
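
A list like this can be turned into a simple lookup, for example to route evaluation or data-collection effort by resource level. The following is a minimal sketch: the set is copied from the list above, and the `is_higher_resourced` helper is a hypothetical name introduced for illustration.

```python
# Higher-resourced languages, copied from the list above.
HIGHER_RESOURCED = frozenset({
    "Arabic", "Catalan", "Czech", "German", "English", "Basque",
    "Finnish", "French", "Hindi", "Hungarian", "Italian", "Japanese",
    "Korean", "Dutch", "Persian", "Polish", "Portuguese", "Russian",
    "Spanish", "Serbian", "Swedish", "Turkish", "Vietnamese", "Chinese",
})


def is_higher_resourced(language: str) -> bool:
    """Return True if the language appears in the higher-resourced list."""
    # Normalize whitespace and capitalization before the lookup.
    return language.strip().title() in HIGHER_RESOURCED


print(is_higher_resourced("english"))
print(is_higher_resourced("Amharic"))
```

Because the list above names only some of the higher-resourced languages, a `False` result means only that the language is not on this particular list.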

Full list of languages Aya 101 supports:

Afrikaans · Albanian · Amharic · Arabic · Armenian · Azerbaijani · Basque · Belarusian · Bengali · Bulgarian · Burmese · Catalan · Cebuano · Chichewa · Chinese · Corsican · Czech · Danish · Dutch · English · Esperanto · Estonian · Filipino · Finnish · French · Galician · Georgian · German · Greek · Gujarati · Haitian Creole · Hausa · Hawaiian · Hebrew · Hindi · Hmong · Hungarian · Icelandic · Igbo · Indonesian · Irish · Italian · Japanese · Javanese · Kannada · Kazakh · Khmer · Korean · Kurdish · Kyrgyz · Lao · Latin · Latvian · Lithuanian · Luxembourgish · Macedonian · Malagasy · Malay · Malayalam · Maltese · Maori · Marathi · Mongolian · Nepali · Norwegian · Pashto · Persian · Polish · Portuguese · Punjabi · Romanian · Russian · Samoan · Scottish Gaelic · Serbian · Shona · Sindhi · Sinhala · Slovak · Slovenian · Somali · Sotho · Spanish · Sundanese · Swahili · Swedish · Tajik · Tamil · Telugu · Thai · Turkish · Ukrainian · Urdu · Uzbek · Vietnamese · Welsh · West Frisian · Xhosa · Yiddish · Yoruba · Zulu
