From 988aa59e4d8fefa526d06f3b453ad116258398d4 Mon Sep 17 00:00:00 2001
From: Andrej Karpathy
Date: Sun, 20 Nov 2022 18:18:02 +0900
Subject: [PATCH] tune description of the repo wrt references

---
 README.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 26fc347..603b941 100644
--- a/README.md
+++ b/README.md
@@ -5,12 +5,13 @@ makemore takes one text file as input, where each line is assumed to be one trai
 This is not meant to be too heavyweight library with a billion switches and knobs. It is one hackable file, and is mostly intended for educational purposes. [PyTorch](https://pytorch.org) is the only requirement.
 
-Current language model neural nets implemented:
+Current implementation follows a few key papers:
 
-- Bigram (one character simply predicts a next one with a lookup table of counts)
-- Bag of Words
-- MLP, along the lines of [Bengio et al. 2003](https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
-- RNN, along the lines of [Sutskever et al. 2011](https://icml.cc/2011/papers/524_icmlpaper.pdf)
+- Bigram (one character predicts the next one with a lookup table of counts)
+- MLP, following [Bengio et al. 2003](https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
+- CNN, following [DeepMind WaveNet 2016](https://arxiv.org/abs/1609.03499) (in progress...)
+- RNN, following [Mikolov et al. 2010](https://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf)
+- LSTM, following [Graves et al. 2014](https://arxiv.org/abs/1308.0850)
 - GRU, following [Kyunghyun Cho et al. 2014](https://arxiv.org/abs/1409.1259)
 - Transformer, following [Vaswani et al. 2017](https://arxiv.org/abs/1706.03762)
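
The Bigram bullet in the patched README describes the simplest model in the repo: a lookup table of next-character counts. A minimal PyTorch sketch of that idea is below; this is not makemore's actual code, and the tiny word list and seed are made up for illustration.

```python
# Rough sketch of a bigram character model as a lookup table of counts,
# per the README description. Hypothetical example, not the repo's code.
import torch

words = ["emma", "olivia", "ava"]  # tiny stand-in dataset

chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0  # '.' marks both the start and the end of a name
itos = {i: ch for ch, i in stoi.items()}

# Count how often each character follows each other character.
N = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for w in words:
    seq = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(seq, seq[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Normalize each row into next-character probabilities (add-one smoothing),
# then sample one name character by character until the end token.
P = (N.float() + 1) / (N.float() + 1).sum(dim=1, keepdim=True)
g = torch.Generator().manual_seed(42)
ix = 0
out = []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```

The count matrix `N` is the entire "model": row `i` records how often each character follows character `i`, so sampling reduces to repeated multinomial draws from normalized rows.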