DeepSeek released a whole family of inference-scaling / "reasoning" models today, including distilled variants based on Llama and Qwen.
Here are my notes on the new models, plus how I ran DeepSeek-R1-Distill-Llama-8B on my Mac using Ollama and LLM.
https://simonwillison.net/2025/Jan/20/deepseek-r1/
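For reference, a sketch of the commands involved (the `deepseek-r1:8b` Ollama tag is an assumption about the registry name; the prompt is just an example):

```shell
# Pull the 8B distilled model from the Ollama registry
# (assumed tag name: deepseek-r1:8b)
ollama pull deepseek-r1:8b

# Chat with it directly through Ollama
ollama run deepseek-r1:8b

# Or use it from the LLM CLI via the llm-ollama plugin
llm install llm-ollama
llm -m deepseek-r1:8b 'a joke about a pelican and a walrus who run a tea room together'
```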
Comments
I haven't tried any non-stupid experiments yet though!
I've tested DeepSeek V3 and am very excited by the new releases. Really appreciated your thoughts.
Reading the
But jokes can be the next level for a reasoning model :)
The SVG prompt is interesting for showing how the model organizes its visual space.
ollama run hf.co/unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF:Q3_K_M
Using this model: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF
I think I can organize a donation for a few months...
It's a 1.28GB page load: https://huggingface.co/spaces/webml-community/deepseek-r1-webgpu
https://gist.github.com/sandipb/c9646ac4c2cb7407705f597771d3c227
Any idea why DeepSeek tends to summarize the answer compared to the rest of the models? o1 seems to give a more detailed answer, but other models have interesting variations.
Could relate to licenses too: the Qwen license is Apache 2.0, while the Llama "community license" is janky.