I bet something like WebLLM becomes a standardized web API soonish.

Chips are already getting really good at local inference.

https://github.com/mlc-ai/web-llm #ai
