Some questions triggered by the release of DeepSeek R1 on January 20. These are formulated as questions, because I do not know the answers and it may well be that most of these answers are only things we can find out over time. - ThreadSky

dacemoglumit.bsky.social • 22 days ago

Some questions triggered by the release of DeepSeek R1 on January 20. These are formulated as questions, because I do not know the answers and it may well be that most of these answers are only things we can find out over time.

Comments

mybookstand.bsky.social•16 days ago

https://www.alternet.org/trump-institutions/?utm_source=Iterable&utm_medium=email&utm_campaign=Feb.9.2025_2.47pm

dacemoglumit.bsky.social•22 days ago

First, perhaps the most important question is this: does DeepSeek’s success mean that the US tech industry was approaching the problem the wrong way?
US AI investment is massive. Goldman Sachs estimates that the tech sector is set to spend $1 trillion: https://goldmansachs.com/insights/articles/will-the-1-trillion-of-generative-ai-investment-pay-off

dacemoglumit.bsky.social•22 days ago

For a long time, a number of commentators (including myself) have questioned the direction of AI investment and development in the US tech industry.

dacemoglumit.bsky.social•22 days ago

To the best of my understanding, all of the leading companies are following essentially the same playbook (with the small difference that Meta is partially open source).

dacemoglumit.bsky.social•22 days ago

These companies are unwilling to consider approaches other than foundation models pre-trained as next-word predictors on massive data sets and, for the most part, anything other than diffusion models and chatbots aimed at performing human tasks.

dacemoglumit.bsky.social•22 days ago

While DeepSeek is not reinventing the wheel and is broadly within the same agenda, it appears to have relied much more heavily on reinforcement learning and mixture-of-experts methods, and to have refined chain-of-thought reasoning very effectively.

dacemoglumit.bsky.social•22 days ago

As widely reported, it has also done so at a fraction of the cost of the models of leading companies, about $5.5 million, as compared to sums running into hundreds of millions of dollars for the leading models.

daves1412.bsky.social•17 days ago

I agree that solely using this approach is unlikely to result in “AGI”, whatever that is. We think in images, symbols, all sorts of representations of the world, and we use abstract concepts to try to understand it better. These models do appear to have a small amount of emergent understanding, though.

daves1412.bsky.social•17 days ago

There is a bit of a bubble, yes. Something important is IMO happening, but its value is still unclear.

daves1412.bsky.social•17 days ago

Not necessarily, but excess capital/tech can make you lazy
