makemore

Commit Graph

Author	SHA1	Message	Date
Norman Yu	bf38625014	Fix bug in RNN where hprev always referred to start. Change so that hprev refers to output of previous cell	2022-09-15 18:10:19 +08:00
Andrej Karpathy	c079e1ce76	add a bag of words model that looks suspiciously similar to a transformer ;)	2022-08-21 20:18:20 -07:00
Andrej Karpathy	b697f434bc	add an RNN and a GRU language model	2022-08-21 18:54:44 -07:00
Andrej Karpathy	6694b67d37	generalize makemore into other types of language models, and add bigram LM and an MLP LM	2022-08-21 17:53:52 -07:00
Andrej Karpathy	50617fa75d	fix comment	2022-08-20 01:29:27 +00:00
Andrej Karpathy	d4ede45208	implementation of InfiniteDataLoader sad	2022-08-20 01:24:44 +00:00
Andrej Karpathy	a7c52cd4d0	remove some guardrails for this simple of a use case	2022-08-20 00:37:49 +00:00
Andrej Karpathy	4e0137ddf6	remove gradient clipping i dont think its needed at this small scale	2022-08-20 00:33:28 +00:00
Andrej Karpathy	35435ec087	simplify optimizer init and delete code	2022-08-20 00:32:40 +00:00
Andrej Karpathy	d26d9750ee	remove weight init, not needed at this scale	2022-08-20 00:30:09 +00:00
Andrej Karpathy	0a19a59564	add max steps	2022-08-20 00:29:54 +00:00
Andrej Karpathy	055e7ee48a	respect multigpu envs, e.g. cuda:2 designation should work	2022-08-19 22:31:36 +00:00
Andrej Karpathy	013af92770	big refactor to make easier and api agree with mingpt more	2022-08-19 22:29:58 +00:00
Andrej	c3aaadcb16	split out train,test,new separately when reporting on sampling word identity	2022-06-09 20:55:27 +00:00
Andrej Karpathy	e0a08f234c	small tweaks to support the Apple Silicon M1 chip device 'mps'. But this is not yet faster because a lot of ops are still being implemented https://github.com/pytorch/pytorch/issues/77764 , in particular for us the layernorm backward as of today	2022-06-09 12:59:39 -07:00
Andrej Karpathy	8f79bd0126	first commit	2022-06-09 12:46:25 -07:00

16 Commits (master)
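
The most recent commit (bf38625014) describes a recurrence bug: hprev always referred to the start state instead of the output of the previous cell. The sketch below illustrates that kind of bug and its fix. It is not the repository's code: the scalar cell, the weights `w_xh` and `w_hh`, and the function names `rnn_cell`, `run_buggy`, and `run_fixed` are all hypothetical, chosen only to keep the example self-contained.

```python
import math


def rnn_cell(x, h, w_xh=0.5, w_hh=0.8):
    """One step of a scalar Elman-style cell (hypothetical weights)."""
    return math.tanh(w_xh * x + w_hh * h)


def run_buggy(xs, h_start=0.0):
    # Bug pattern: every step receives the start state,
    # so nothing is carried from one time step to the next.
    hiddens = []
    for x in xs:
        hiddens.append(rnn_cell(x, h_start))
    return hiddens


def run_fixed(xs, h_start=0.0):
    # Fix pattern: hprev is updated to the output of the previous cell.
    hprev = h_start
    hiddens = []
    for x in xs:
        hprev = rnn_cell(x, hprev)
        hiddens.append(hprev)
    return hiddens


xs = [1.0, 1.0, 1.0]
buggy = run_buggy(xs)
fixed = run_fixed(xs)
print("buggy:", buggy)
print("fixed:", fixed)
```

With identical inputs, the buggy loop produces the same hidden state at every step, while the fixed loop's hidden state changes as earlier outputs feed into later steps.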