Improving GPT-2 with Self-Distillation, Sparse Attention, and PEFT
A reimplementation of GPT-2 that explores self-distillation, sparse attention, and parameter-efficient fine-tuning (LoRA) for paraphrase detection and sonnet generation.
Some of these projects come from my work, and others I built on my own time.