I'd been waiting for a WebGPU LLM inference engine.

https://github.com/mlc-ai/web-llm
