In the future we will have custom kernels for new models within hours of their release. That should make inference much more efficient.

https://developer.nvidia.com/blog/automating-gpu-kernel-generation-with-deepseek-r1-and-inference-time-scaling/
