Meta Reveals Its ChatGPT Competitor: AI LLaMA Is Coming

6 min read · Mar 19

The Fundamental AI Research (FAIR) team at Meta, Facebook’s parent company, has released Large Language Model Meta AI (LLaMA), a new “state-of-the-art” artificial intelligence (AI) language model.

According to CEO Mark Zuckerberg, the model will be made available to academics and will aid scientists and engineers as they explore new applications for artificial intelligence.

“We’re launching LLaMA, a new cutting-edge AI large language model meant to help researchers advance their work,” Zuckerberg said in a Facebook post.

“LLMs have shown great potential in text production, dialogues, summarizing textual information, and more difficult tasks like solving mathematical theorems or predicting protein structure.”

Advances in artificial intelligence have become a priority for large tech corporations and startups alike, with large language models powering apps such as Microsoft’s Bing AI, OpenAI’s ChatGPT, and Google’s announced but not yet released Bard AI.

However, according to Meta, its LLM differs from other models in several respects, including its scale and its availability to researchers.

AI LLaMA Will Include Between 7 Billion and 65 Billion Parameters

According to Meta, LLaMA is a family of language models available in four sizes: 7 billion, 13 billion, 33 billion, and 65 billion parameters.

While bigger language models have proved helpful in improving AI technology’s capabilities, they can be costly to deploy at the inference stage.

OpenAI’s GPT-3, for example, has 175 billion parameters and is more costly to run at inference time than smaller models.
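To put the inference-cost point in rough numbers, here is a back-of-the-envelope sketch: the memory needed just to hold a model’s weights is approximately the parameter count multiplied by the bytes used per parameter. The 2 bytes per parameter (16-bit weights) is an assumption made for illustration, not a figure from Meta or OpenAI, and the helper name `weight_memory_gb` is hypothetical. Real deployments also need memory for activations and caches, which this ignores.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# ASSUMPTION: 16-bit weights (2 bytes each); activations and caches are ignored.
BYTES_PER_PARAM = 2


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the article.
models = {
    "LLaMA 7B": 7e9,
    "LLaMA 65B": 65e9,
    "GPT-3 (175B)": 175e9,
}

for name, params in models.items():
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")
```

Under that assumption, the 175-billion-parameter model needs 25 times the weight memory of the 7-billion-parameter one, which is one reason smaller models are cheaper to serve.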

Meta has stressed the advantages of smaller models trained on more tokens (tokens are pieces of words), since such models are easier to retrain and fine-tune for specific use cases.

The two largest models, LLaMA 65B and LLaMA 33B, were trained on a whopping 1.4 trillion tokens. LLaMA 7B, the smallest LLaMA model, was trained on one trillion tokens.
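One way to read these figures is as training tokens per parameter, a simple ratio showing how much data each model saw relative to its size. The sketch below is illustrative only: it assumes the 1.4 trillion token figure applies to the 65B model, and the helper name `tokens_per_param` is hypothetical.

```python
# Training tokens seen per model parameter.
# ASSUMPTION: the 1.4 trillion token figure applies to the 65B model.


def tokens_per_param(train_tokens: float, num_params: float) -> float:
    """Return how many training tokens were seen per parameter."""
    return train_tokens / num_params


# (training tokens, parameters) as quoted in the article.
runs = {
    "LLaMA 7B": (1.0e12, 7e9),
    "LLaMA 65B": (1.4e12, 65e9),
}

for name, (tokens, params) in runs.items():
    print(f"{name}: ~{tokens_per_param(tokens, params):.0f} tokens per parameter")
```

By this measure the smallest model saw far more data per parameter than the largest, consistent with Meta’s emphasis on training small models on many tokens.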

“Unlike Chinchilla, PaLM, or GPT-3, we only use publicly available datasets, making our work open source compatible and reproducible, whereas most existing models rely on data that is either not publicly available or is undocumented,”…

