Andrej Karpathy | 988aa59e4d | tune description of the repo wrt references | 2022-11-20 18:19:40 +09:00
Andrej | f61811b994 | Merge pull request #3 from normanyu/fix-rnn-prev-state: Fix bug in RNN where hprev always referred to start. | 2022-09-15 08:26:41 -07:00
Norman Yu | bf38625014 | Fix bug in RNN where hprev always referred to start. Change so that hprev refers to output of previous cell | 2022-09-15 18:10:19 +08:00
Andrej Karpathy | 2f5e8d746e | change readme | 2022-09-02 20:56:16 +00:00
Andrej Karpathy | c079e1ce76 | add a bag of words model that looks suspiciously similar to a transformer ;) | 2022-08-21 20:18:20 -07:00
Andrej Karpathy | b697f434bc | add an RNN and a GRU language model | 2022-08-21 18:54:44 -07:00
Andrej Karpathy | 6694b67d37 | generalize makemore into other types of language models, and add bigram LM and an MLP LM | 2022-08-21 17:53:52 -07:00
Andrej Karpathy | 50617fa75d | fix comment | 2022-08-20 01:29:27 +00:00
Andrej Karpathy | d4ede45208 | implementation of InfiniteDataLoader sad | 2022-08-20 01:24:44 +00:00
Andrej Karpathy | a7c52cd4d0 | remove some guardrails for this simple of a use case | 2022-08-20 00:37:49 +00:00
Andrej Karpathy | 4e0137ddf6 | remove gradient clipping i dont think its needed at this small scale | 2022-08-20 00:33:28 +00:00
Andrej Karpathy | 35435ec087 | simplify optimizer init and delete code | 2022-08-20 00:32:40 +00:00
Andrej Karpathy | d26d9750ee | remove weight init, not needed at this scale | 2022-08-20 00:30:09 +00:00
Andrej Karpathy | 0a19a59564 | add max steps | 2022-08-20 00:29:54 +00:00
Andrej Karpathy | 055e7ee48a | respect multigpu envs, e.g. cuda:2 designation should work | 2022-08-19 22:31:36 +00:00
Andrej Karpathy | 013af92770 | big refactor to make easier and api agree with mingpt more | 2022-08-19 22:29:58 +00:00
Andrej | 054568ec24 | add some generated examples of names for fun | 2022-06-09 20:55:39 +00:00
Andrej | c3aaadcb16 | split out train,test,new separately when reporting on sampling word identity | 2022-06-09 20:55:27 +00:00
Andrej Karpathy | e0a08f234c | small tweaks to support the Apple Silicon M1 chip device 'mps'. But this is not yet faster because a lot of ops are still being implemented https://github.com/pytorch/pytorch/issues/77764 , in particular for us the layernorm backward as of today | 2022-06-09 12:59:39 -07:00
Andrej Karpathy | 8f79bd0126 | first commit | 2022-06-09 12:46:25 -07:00
Andrej | 180c4f7260 | Initial commit | 2022-06-09 12:29:36 -07:00