GPT-OSS from Scratch - Inference with the Huggingface Model
Testing out the Huggingface version of gpt-oss-20b locally on consumer hardware
Regularization with a duplication-penalty term in the loss
How Triton Compiler Works Under the Hood!
Enough MLIR to be dangerous - how Triton uses MLIR passes to progressively lower IR
Improving the model to reduce duplicates and speed up training
Exploring a simple transformer model for sequence modelling in recommender systems
What happens when triton.compile is called in the frontend?
The missing tutorial on how a Triton program gets converted to CUDA kernels under the hood
Benchmarking our own GPT-2 model against the Huggingface GPT-2 model
Writing GPT-2 from scratch and loading weights from the pre-trained Huggingface model