I'm not deep into the subject matter, but the premise of DeepSeek (so far unverifiable) is that it was done faster and on a smaller budget than OpenAI and other competitors. That sounds as if they have a few (non-open-source?) tricks up their sleeves.
Comments
As far as I understand, instead of human-reinforced learning (RLHF), they used existing LLMs to validate the training of their own. Definitely sounds like a trick, definitely nothing to do with OSS.
And building on the shoulders of giants. It might be commoditized, but there's a lot going on in this advancement that isn't so easily portrayed as "Team B humiliates Team A."
DeepSeek's claims are unverified and not indicative of the total system cost. Whether training costs are really coming down is debatable: digital technology has a habit of shifting the goalposts and negating efficiency gains. o1/R1 will look quaint in a few months' time, with new systems requiring more, not fewer, resources.
Much the same happened in the cryptocurrency/blockchain world (which the DeepSeek team came from; a lot of the LLM world is crypto-adjacent). Much was promised, breakthroughs were trumpeted, and not much changed from the initial offering.
Also, the DeepSeek team programmed 20 of the 132 streaming multiprocessors (SMs) on their interconnect-constrained H800 GPUs to manage cross-chip communication, writing PTX (NVIDIA's assembly-like GPU intermediate language) rather than plain CUDA to do it.
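For anyone curious what that looks like in shape: it's block/SM specialization, where one slice of the GPU's thread blocks runs a communication role while the rest do the math, all inside a single kernel. A minimal CUDA sketch of the pattern (not DeepSeek's actual code; the block counts, the toy flag, and the single inline-PTX store are purely illustrative):

```
// Block/SM specialization sketch: reserve a fixed subset of thread blocks
// for "communication" and leave the rest for compute, inside one kernel.
// Compile with: nvcc -arch=sm_90 sm_split.cu   (relaxed.sys needs sm_70+)
#include <cstdio>
#include <cuda_runtime.h>

#define TOTAL_BLOCKS 132  // one block per SM on an H100/H800-class GPU
#define COMM_BLOCKS   20  // blocks playing the communication role

__device__ int comm_flag = 0;  // toy stand-in for a cross-GPU signal

__global__ void fused_kernel(float* data, int n) {
    if (blockIdx.x < COMM_BLOCKS) {
        // Communication role: in the real system these SMs run hand-tuned
        // PTX that shuttles data between GPUs; here, one representative
        // inline-PTX primitive: a relaxed, system-scope 32-bit store.
        if (threadIdx.x == 0) {
            asm volatile("st.relaxed.sys.global.s32 [%0], %1;"
                         :: "l"(&comm_flag), "r"(1) : "memory");
        }
    } else {
        // Compute role: the remaining blocks do the actual math
        // (grid-stride loop over the non-communication blocks).
        int i = (blockIdx.x - COMM_BLOCKS) * blockDim.x + threadIdx.x;
        int stride = (TOTAL_BLOCKS - COMM_BLOCKS) * blockDim.x;
        for (; i < n; i += stride) data[i] *= 2.0f;
    }
}

int main() {
    const int n = 1 << 20;
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));
    fused_kernel<<<TOTAL_BLOCKS, 256>>>(d, n);
    cudaDeviceSynchronize();
    cudaFree(d);
    printf("done\n");
    return 0;
}
```

The real implementation is far more involved, of course; the sketch just shows the core idea of a compute/communication split living inside one kernel, so the SMs doing communication never touch the math.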
A truly insane level of optimization