2024-05-04-doodles-gpt - Kruzenshtern Lab

# Doodles GPT

I am continuing the work that I published in [[2023-06-23-neural-network-draws-cats]]. Access to GPU resources and other helpful tooling is maturing: it is getting easier to spin up a cloud instance, run an experiment, and train a model. There are also a number of optimizations that I can apply to make training faster and more efficient.

I reimplemented the GPT script to include the latest known optimizations and trained it on a dataset with three categories: cats, cars, and parrots. Training is ~10x faster than before, which is a significant improvement. The loss after ~1 hr on a single A10 node is 1.4454. The model has approximately 100M parameters.

The dataset is not a problem yet, but it will become increasingly important if the model is to be practically useful, and that is the next challenge to tackle. Right now, it is an amusing toy to play with. This baby GPT model is showing promising results and, judging by a comparison of validation and training loss, an eagerness to keep learning.

A few images generated by the model are shown below:

Cat: ![[cat.png]]

Car: ![[car.png]]

Another car: ![[car2.png]]

Parrot: ![[parrot.png]]
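
For a sense of where a figure like ~100M parameters comes from, here is a minimal sketch that counts the parameters of a GPT-2 style decoder from its hyperparameters. The function name `gpt_param_count` and the `doodle_guess` configuration are hypothetical: the post does not state the model's layer count, width, vocabulary size, or context length, so those values are illustrative assumptions only. The sketch also assumes biases on every linear layer and an output head tied to the token embedding.

```python
def gpt_param_count(n_layer, n_embd, vocab_size, block_size, tied_head=True):
    """Count parameters of a GPT-2 style decoder (biases everywhere, learned positions)."""
    d = n_embd
    attn = (3 * d * d + 3 * d) + (d * d + d)     # fused q/k/v projection + output projection
    mlp = (4 * d * d + 4 * d) + (4 * d * d + d)  # 4x up-projection + down-projection
    norms = 2 * (2 * d)                          # two LayerNorms per block, weight + bias each
    per_block = attn + mlp + norms
    embeddings = vocab_size * d + block_size * d  # token + learned position embeddings
    final_norm = 2 * d
    head = 0 if tied_head else vocab_size * d     # untied head adds another vocab x d matrix
    return n_layer * per_block + embeddings + final_norm + head


# Sanity check against the published GPT-2 small shape (124,439,808 parameters).
gpt2_small = gpt_param_count(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024)

# Hypothetical doodle-model shape: same depth and width, much smaller vocabulary.
# These numbers are NOT from the post; they only show how the total moves.
doodle_guess = gpt_param_count(n_layer=12, n_embd=768, vocab_size=4096, block_size=1024)

print(f"GPT-2 small:  {gpt2_small / 1e6:.1f}M")
print(f"Doodle guess: {doodle_guess / 1e6:.1f}M")
```

With a small vocabulary most of the parameters sit in the transformer blocks (roughly `12 * n_layer * n_embd**2`), so depth and width set the total. For a real PyTorch model, `sum(p.numel() for p in model.parameters())` gives the exact number.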