Merge pull request #76 from guspan-tanadi/main

Model Collection GPT model description
2023-04-03 09:34:04 -06:00 · 24be38287f
parent ab3f07e281 68cfbe710c
commit 24be38287f
1 changed files with 2 additions and 2 deletions
--- a/pages/models/collection.en.mdx
+++ b/pages/models/collection.en.mdx
@@ -18,10 +18,10 @@ This section consists of a collection and summary of notable and foundational LL
 | [RoBERTa](https://arxiv.org/abs/1907.11692) | A Robustly Optimized BERT Pretraining Approach | 
 | [ALBERT](https://arxiv.org/abs/1909.11942) | A Lite BERT for Self-supervised Learning of Language Representations | 
 | [XLNet](https://arxiv.org/abs/1906.08237) | Generalized Autoregressive Pretraining for Language Understanding and Generation |
-| [GPT](https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf) | Language Models are Unsupervised Multitask Learners | 
+| [GPT](https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf) | Improving Language Understanding by Generative Pre-Training | 
 | [GPT-2](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) | Language Models are Unsupervised Multitask Learners | 
 | [GPT-3](https://arxiv.org/abs/2005.14165) | Language Models are Few-Shot Learners |
 | [T5](https://arxiv.org/abs/1910.10683) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 
 | [CTRL](https://arxiv.org/abs/1909.05858) | CTRL: A Conditional Transformer Language Model for Controllable Generation | 
 | [BART](https://arxiv.org/abs/1910.13461) | Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension |
-| [Chinchilla](https://arxiv.org/abs/2203.15556)(Hoffman et al. 2022) | Shows that for a compute budget, the best performances are not achieved by the largest models but by smaller models trained on more data. |
+| [Chinchilla](https://arxiv.org/abs/2203.15556)(Hoffman et al. 2022) | Shows that for a compute budget, the best performances are not achieved by the largest models but by smaller models trained on more data. |