Commit Graph

16 Commits (master)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Norman Yu | bf38625014 | Fix bug in RNN where hprev always referred to start. Change so that hprev refers to output of previous cell | 2022-09-15 18:10:19 +08:00 |
| Andrej Karpathy | c079e1ce76 | add a bag of words model that looks suspiciously similar to a transformer ;) | 2022-08-21 20:18:20 -07:00 |
| Andrej Karpathy | b697f434bc | add an RNN and a GRU language model | 2022-08-21 18:54:44 -07:00 |
| Andrej Karpathy | 6694b67d37 | generalize makemore into other types of language models, and add bigram LM and an MLP LM | 2022-08-21 17:53:52 -07:00 |
| Andrej Karpathy | 50617fa75d | fix comment | 2022-08-20 01:29:27 +00:00 |
| Andrej Karpathy | d4ede45208 | implementation of InfiniteDataLoader sad | 2022-08-20 01:24:44 +00:00 |
| Andrej Karpathy | a7c52cd4d0 | remove some guardrails for this simple of a use case | 2022-08-20 00:37:49 +00:00 |
| Andrej Karpathy | 4e0137ddf6 | remove gradient clipping i dont think its needed at this small scale | 2022-08-20 00:33:28 +00:00 |
| Andrej Karpathy | 35435ec087 | simplify optimizer init and delete code | 2022-08-20 00:32:40 +00:00 |
| Andrej Karpathy | d26d9750ee | remove weight init, not needed at this scale | 2022-08-20 00:30:09 +00:00 |
| Andrej Karpathy | 0a19a59564 | add max steps | 2022-08-20 00:29:54 +00:00 |
| Andrej Karpathy | 055e7ee48a | respect multigpu envs, e.g. cuda:2 designation should work | 2022-08-19 22:31:36 +00:00 |
| Andrej Karpathy | 013af92770 | big refactor to make easier and api agree with mingpt more | 2022-08-19 22:29:58 +00:00 |
| Andrej | c3aaadcb16 | split out train,test,new separately when reporting on sampling word identity | 2022-06-09 20:55:27 +00:00 |
| Andrej Karpathy | e0a08f234c | small tweaks to support the Apple Silicon M1 chip device 'mps'. But this is not yet faster because a lot of ops are still being implemented https://github.com/pytorch/pytorch/issues/77764 , in particular for us the layernorm backward as of today | 2022-06-09 12:59:39 -07:00 |
| Andrej Karpathy | 8f79bd0126 | first commit | 2022-06-09 12:46:25 -07:00 |