## LLaMA: Open and Efficient Foundation Language Models

This section is under heavy development.

import {Screenshot} from 'components/screenshot'
import { Callout, FileTree } from 'nextra-theme-docs'
import LLAMA1 from '../../img/llama-1.png'

## What's new?

This paper introduces a collection of foundation language models ranging from 7B to 65B parameters.

The models are trained on trillions of tokens using only publicly available datasets.

The work by [(Hoffmann et al., 2022)](https://arxiv.org/abs/2203.15556) shows that, given a fixed compute budget, smaller models trained on much more data can achieve better performance than their larger counterparts. Hoffmann et al. recommend training a 10B model on 200B tokens. However, the LLaMA paper finds that the performance of a 7B model continues to improve even after 1T tokens.

The LLaMA work therefore focuses on training models that achieve the best possible performance at various inference budgets by training on more tokens.

## Capabilities & Key Results

Overall, LLaMA-13B outperforms GPT-3 (175B) on many benchmarks despite being 10x smaller and able to run on a single GPU. LLaMA-65B is competitive with models like Chinchilla-70B and PaLM-540B.

*Paper:* [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)

*Code:* https://github.com/facebookresearch/llama

## References

- [GPT4All](https://github.com/nomic-ai/gpt4all) (March 2023)
- [ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge](https://arxiv.org/abs/2303.14070) (March 2023)
- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) (March 2023)
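
To make the token budgets discussed under "What's new?" more concrete, the following is a minimal sketch that compares the two training regimes by their tokens-per-parameter ratio. It uses only the figures quoted on this page (10B parameters with 200B tokens, and 7B parameters with 1T tokens). The helper name `tokens_per_parameter` is hypothetical and introduced purely for illustration; it is not part of the LLaMA codebase or either paper.

```python
def tokens_per_parameter(n_tokens: float, n_params: float) -> float:
    """Return the number of training tokens seen per model parameter.

    Hypothetical helper for illustration only.
    """
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return n_tokens / n_params


# Figures quoted on this page.
chinchilla_ratio = tokens_per_parameter(200e9, 10e9)  # 10B model, 200B tokens
llama_7b_ratio = tokens_per_parameter(1e12, 7e9)      # 7B model, 1T tokens

print(f"Hoffmann et al. recommendation: {chinchilla_ratio:.0f} tokens per parameter")
print(f"LLaMA-7B at 1T tokens: {llama_7b_ratio:.0f} tokens per parameter")
print(f"LLaMA-7B sees {llama_7b_ratio / chinchilla_ratio:.1f}x more tokens per parameter")
```

Under these figures, the 7B model at 1T tokens is trained on several times more tokens per parameter than the compute-optimal recommendation, which is the trade-off the LLaMA paper makes in favor of cheaper inference.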