Commit Graph

21 Commits (master)

Author SHA1 Message Date
Andrej Karpathy 988aa59e4d tune description of the repo wrt references 2022-11-20 18:19:40 +09:00
Andrej f61811b994 Merge pull request #3 from normanyu/fix-rnn-prev-state: Fix bug in RNN where hprev always referred to start. 2022-09-15 08:26:41 -07:00
Norman Yu bf38625014 Fix bug in RNN where hprev always referred to start. Change so that hprev refers to output of previous cell 2022-09-15 18:10:19 +08:00
Andrej Karpathy 2f5e8d746e change readme 2022-09-02 20:56:16 +00:00
Andrej Karpathy c079e1ce76 add a bag of words model that looks suspiciously similar to a transformer ;) 2022-08-21 20:18:20 -07:00
Andrej Karpathy b697f434bc add an RNN and a GRU language model 2022-08-21 18:54:44 -07:00
Andrej Karpathy 6694b67d37 generalize makemore into other types of language models, and add bigram LM and an MLP LM 2022-08-21 17:53:52 -07:00
Andrej Karpathy 50617fa75d fix comment 2022-08-20 01:29:27 +00:00
Andrej Karpathy d4ede45208 implementation of InfiniteDataLoader sad 2022-08-20 01:24:44 +00:00
Andrej Karpathy a7c52cd4d0 remove some guardrails for this simple of a use case 2022-08-20 00:37:49 +00:00
Andrej Karpathy 4e0137ddf6 remove gradient clipping i dont think its needed at this small scale 2022-08-20 00:33:28 +00:00
Andrej Karpathy 35435ec087 simplify optimizer init and delete code 2022-08-20 00:32:40 +00:00
Andrej Karpathy d26d9750ee remove weight init, not needed at this scale 2022-08-20 00:30:09 +00:00
Andrej Karpathy 0a19a59564 add max steps 2022-08-20 00:29:54 +00:00
Andrej Karpathy 055e7ee48a respect multigpu envs, e.g. cuda:2 designation should work 2022-08-19 22:31:36 +00:00
Andrej Karpathy 013af92770 big refactor to make easier and api agree with mingpt more 2022-08-19 22:29:58 +00:00
Andrej 054568ec24 add some generated examples of names for fun 2022-06-09 20:55:39 +00:00
Andrej c3aaadcb16 split out train,test,new separately when reporting on sampling word identity 2022-06-09 20:55:27 +00:00
Andrej Karpathy e0a08f234c small tweaks to support the Apple Silicon M1 chip device 'mps'. But this is not yet faster because a lot of ops are still being implemented https://github.com/pytorch/pytorch/issues/77764 , in particular for us the layernorm backward as of today 2022-06-09 12:59:39 -07:00
Andrej Karpathy 8f79bd0126 first commit 2022-06-09 12:46:25 -07:00
Andrej 180c4f7260 Initial commit 2022-06-09 12:29:36 -07:00