Meta announced LLaMA-13B on Friday, a new AI-powered large language model (LLM) that it says can outperform OpenAI’s GPT-3 while being “10x smaller.” Smaller AI models could allow ChatGPT-style language assistants to run locally on devices like PCs and smartphones. LLaMA-13B is part of a new family of language models called “Large Language Model Meta AI,” or LLaMA for short.
The language models in the LLaMA collection range in size from 7 billion to 65 billion parameters. In comparison, OpenAI’s GPT-3 model, which serves as the backbone for ChatGPT, contains 175 billion parameters.
Meta trained its LLaMA models on publicly available datasets, including Common Crawl, Wikipedia, and C4, which suggests the company may open-source the models and their weights. That would be a significant move in an industry where the Big Tech companies in the AI race have typically kept their most powerful AI technology to themselves.
“Unlike Chinchilla, PaLM, or GPT-3, we only use publicly available datasets, making our work open source and reproducible, while most existing models rely on data that is either not publicly available or undocumented,” project member Guillaume Lample tweeted.
Meta calls its LLaMA models “foundational models,” meaning the company intends them to serve as the basis for future, more refined AI models built on the technology, much as OpenAI built ChatGPT on a foundation of GPT-3. The company says LLaMA will be useful in natural language research and could potentially power applications such as “question answering, natural language understanding or reading comprehension, understanding capabilities and limitations of current language models.”
While the top-tier LLaMA model competes with similar offerings from rival AI labs DeepMind, Google, and OpenAI, the LLaMA-13B model, as noted above, can reportedly outperform GPT-3 while running on a single GPU when measured across eight standard “common sense reasoning” benchmarks such as PIQA, SIQA, WinoGrande, HellaSwag, and ARC. Unlike the data center requirements of the GPT-3 variants, LLaMA-13B paves the way for ChatGPT-like performance on consumer devices in the near future.
In AI, parameter count matters a great deal. A parameter is a variable that a machine-learning model learns during training and uses to make predictions or classifications from input data. The number of parameters in a language model is a key factor in its performance, with larger models generally handling more complex tasks and producing more coherent output. More parameters, however, take up more space and demand more computing resources to run. So if one model can achieve the same results as another with fewer parameters, it represents a significant gain in efficiency.
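To see where numbers like “13 billion” and “175 billion” come from, here is a back-of-the-envelope sketch of how a transformer language model’s parameter count grows with its depth and width. The formula is a standard approximation (embeddings plus attention and feed-forward weights per layer), and the specific layer counts and dimensions below are hypothetical illustrations, not Meta’s or OpenAI’s published configurations.

```python
def transformer_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Approximate parameter count for a decoder-only transformer:
    token embeddings plus, per layer, attention projections (~4 * d_model^2)
    and a 4x-wide feed-forward block (~8 * d_model^2). Biases and
    normalization weights are small and ignored here."""
    embeddings = vocab_size * d_model
    per_layer = 4 * d_model**2 + 8 * d_model**2  # attention + feed-forward
    return embeddings + n_layers * per_layer

# Hypothetical "mid-size" and "GPT-3-scale" configurations:
small = transformer_params(n_layers=40, d_model=5120, vocab_size=32000)
large = transformer_params(n_layers=96, d_model=12288, vocab_size=50000)
print(f"small: {small / 1e9:.1f}B parameters")  # roughly 13B-class
print(f"large: {large / 1e9:.1f}B parameters")  # roughly 175B-class
```

The takeaway is that parameter count scales with depth times the square of the model width, which is why a ~13x smaller model needs dramatically less memory and compute per token, and why matching a larger model’s quality at that size is such a notable efficiency win.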
A stripped-down edition of LLaMA is now available on GitHub, and Meta provides a form where interested researchers can request access to the full code and weights. At this time, Meta has not announced plans for a broader release of the models and weights.