It's an open-source model, so you can at least check the results. On the inputs, it's hard to say whether they used only what they said they did, but see the other comment about Stanford getting similar results. China also probably doesn't mind training on US copyrighted data.
Incorrect. They have to pretend and hide through obfuscation. They have to make it _seem_ like they didn't use that information. I'm not being facetious or pedantic; it really is harder to do that.
No truly intelligent person will tell you the truth, especially if they understand technology, politics, business, or anything else. The future will be cyber wars fought to position countries and regions and to obtain high-level information. You must study and reason on your own, and you will discover the truth.
Not long ago, Stanford trained a similar model called Alpaca on only 52k examples created by GPT. They proved you can use the larger models to train a smaller model with comparable performance, which is likely what they did.
https://crfm.stanford.edu/2023/03/13/alpaca.html
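For anyone curious what that recipe looks like in practice, here's a minimal sketch of Alpaca-style distillation: use a big model to generate instruction/response pairs, then fine-tune a smaller model on them with ordinary supervised learning. The `query_teacher` stub and file name are hypothetical placeholders, not Stanford's or DeepSeek's actual pipeline.

```python
# Sketch of Alpaca-style distillation: harvest (instruction, response)
# pairs from a larger "teacher" model, then fine-tune a small model on
# them. `query_teacher` is a hypothetical stand-in for a real API call.
import json

def query_teacher(prompt: str) -> str:
    # Replace with a call to the teacher model's API.
    return f"<teacher's answer to: {prompt}>"

seed_instructions = [
    "Explain the difference between a list and a tuple in Python.",
    "Summarize the causes of the French Revolution in three sentences.",
]

dataset = [
    {"instruction": ins, "output": query_teacher(ins)}
    for ins in seed_instructions
]

# The resulting JSONL is then used for plain supervised fine-tuning of
# the smaller "student" model (e.g. with LoRA) -- no RL required.
with open("distilled.jsonl", "w") as f:
    for row in dataset:
        f.write(json.dumps(row) + "\n")
```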
Based on the DeepSeek paper, this is roughly what they did. Their big innovation was having the model teach itself by repeatedly trying to answer the same question and evaluating its own responses.
I suspect we'll see more models built this way soon!
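To make that concrete, here's a toy sketch of the loop: sample several answers to the same question, score them with a verifiable reward, and reinforce the above-average ones (the group-relative idea behind GRPO in the R1 paper). `sample_answer` and `reward` are toy stand-ins, not DeepSeek's code.

```python
# Toy version of the R1-style self-improvement loop: sample a group of
# answers to one question, score each with a checkable reward, and
# compute a group-relative advantage (the quantity that would weight a
# real policy-gradient update). All stubs are hypothetical.
import random

def sample_answer(question: str) -> str:
    # Stand-in for sampling from the policy model at temperature > 0.
    return str(random.randint(2, 6))

def reward(answer: str) -> float:
    # Verifiable reward: 1 if the final answer is correct, else 0.
    return 1.0 if answer == "4" else 0.0

question = "What is 2 + 2?"
group = [sample_answer(question) for _ in range(8)]
rewards = [reward(a) for a in group]
baseline = sum(rewards) / len(rewards)

# Better-than-average samples get reinforced; worse ones pushed down.
for answer, r in zip(group, rewards):
    print(f"answer={answer!r}  reward={r}  advantage={r - baseline:+.2f}")
```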
Apple has also been working on similar things, allowing models to run directly off storage for a pretty substantial efficiency increase. LLMs from MS, OpenAI, and Google simply haven't been optimized to the same degree.
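To illustrate the basic primitive behind "running models off storage" (not Apple's actual, much more sophisticated technique, e.g. their "LLM in a flash" work): memory-mapping the weight file means only the pages you actually touch get pulled from disk.

```python
# Crude illustration of weights-off-storage: memory-map a weight file so
# pages are faulted in from disk on demand instead of loading the whole
# model into RAM. Real systems (e.g. Apple's research) go much further.
import numpy as np

rows, cols = 8_192, 1_024
np.save("weights.npy", np.random.rand(rows, cols).astype(np.float32))

w = np.load("weights.npy", mmap_mode="r")   # no bulk read into RAM yet
x = np.random.rand(cols).astype(np.float32)

# Only the pages backing these 100 rows are actually read from disk.
y = w[:100] @ x
print(y.shape)  # (100,)
```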
Here's a good overview of the current AI market which also, towards the end of the video, outlines potential reasons to be skeptical of Deepseek's claims:
https://youtu.be/GqcCvvFZsi4?si=pblcYt_Ih4LsnuXD
The claims that are easily verified have been; the rest are in the process of being verified and are fairly credible. There's always a chance of something fishy, but it's not the most likely explanation. This is a field in which people often make untrustworthy claims, so evaluation is part of the process.
But perhaps the more important piece: which part are you most concerned with verifying? The performance of the model, or the cost and the way they describe having gotten there? Most verification so far has indeed focused on the former. The latter is harder to verify.
Yes, and while it's hard to verify concretely right away, I think people are treating this part as fairly credible partly because:
* It’s a new organization and the budget is about right
* It's based in an environment with hardware restrictions that seem likely to spawn exactly this kind of efficiency work
You can literally go download the model and compare it to other models. https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero
If DeepSeek were not close to what the reports claim (i.e., if it were propaganda), the community would have been able to refute it immediately.
The stock market did what it did because there was no refutation.
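On "go download the model and compare it": here's a sketch using Hugging Face transformers. The full R1 needs serious hardware, so this assumes one of the small distilled checkpoints; the model ID is the one published on the hub at the time of writing, so verify it yourself.

```python
# Sketch: pull a distilled R1 checkpoint and query it locally.
# Requires `pip install transformers accelerate torch`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # check on the hub
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "How many prime numbers are there below 30?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(out[0], skip_special_tokens=True))
```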
they tell you in their technical reports what they did. the compute figures add up. the performance of the model is indisputable; you can download it and run it yourself. there are _many_ replications in progress; the RL process described works for learning CoT/self-reflection even on small models
here's an r1 replication being done by HuggingFace, for example.
you can be sure most tech companies and lots of individuals are doing this also, the kind of post-training they claim to have done is cheap (& a much bigger deal than the pretrain cost)
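On "the compute figures add up": here's the standard back-of-envelope check using the ~6·N·D training-FLOPs rule of thumb and the numbers from DeepSeek's own report (37B activated params, 14.8T tokens, ~2.788M H800 GPU-hours). These are their claimed figures, not independently verified; the point is only that the implied utilization lands in a plausible range rather than requiring magic.

```python
# Back-of-envelope check of "the compute figures add up", using the
# standard ~6*N*D training-FLOPs rule of thumb and the figures from
# DeepSeek's own report (claims, not independently verified):
# ~37B activated params, ~14.8T tokens, ~2.788M H800 GPU-hours.
activated_params = 37e9
tokens = 14.8e12
gpu_hours = 2.788e6

train_flops = 6 * activated_params * tokens      # ~3.3e24 FLOPs
sustained = train_flops / (gpu_hours * 3600)     # FLOP/s per GPU
h800_peak_bf16 = 9.9e14                          # ~990 TFLOPS dense, rough

print(f"total training FLOPs ~ {train_flops:.2e}")
print(f"implied per-GPU throughput ~ {sustained / 1e12:.0f} TFLOP/s")
print(f"implied utilization ~ {sustained / h800_peak_bf16:.0%}")
# Lands around one-third of peak -- an unremarkable utilization, i.e.
# the claimed budget doesn't require any magic.
```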
It's open source... meaning you can dissect its operations down to its foundations. Now, of course they built on already pre-existing hardware, but everyone does... it's what ChatGPT was supposed to be: open source. I'll caveat that with: it might be open source, but the CCP still controls the owner!
This, 100%. How can we reason with an electorate and public that is all ethos and pathos and no logos? We should be very concerned if/when the prices really do go down, because it will probably signal a recession more than anything else.
Given the ongoing AI competition between the U.S. and China, there may be incentives to exaggerate capabilities or downplay costs for strategic reasons.
I am not an expert but DeepSeek is open source and their distilled models can be downloaded and run locally; I've seen people claiming to have run it on their own Mac Pros
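For reference, the usual local route people mean is something like Ollama. Here's a sketch with its Python client; the `deepseek-r1:7b` tag is the one listed in Ollama's library at the time of writing, and the client API may have shifted, so double-check both.

```python
# Sketch: query a locally running distilled R1 via Ollama's Python
# client. Assumes Ollama is installed and `ollama pull deepseek-r1:7b`
# has been run. Requires `pip install ollama`.
import ollama

response = ollama.chat(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```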
If I can convince you with my techno babble, then I'd also encourage you to dump your entire retirement into a once-in-a-lifetime opportunity called $MANDERCOIN. In fact, just wire me the money directly, it'll be easier.
Perhaps you should pitch a "reality" show to PBS entitled "The Dueling AIs" and ask them all to put up or shut up in public and with a relatively disengaged if skeptical "jury"...
It’s open source. People can look at the code. Also people can install the model locally on a variety of devices and are getting good performance given the hardware limitations. Keeping data local and focused is appealing.
They may have polished some corners of how they got to the end model, but combine i) the transparency of open source, ii) the intense global interest from other experts, and iii) the consequences for a Chinese team (in particular) if debunked...
Partly because you can download the model yourself and play with it. Training costs are somewhat correlated with runtime costs, and people are showing great performance on very small hardware. A decent number of people played with it over the weekend and the results are very good. https://github.com/deepseek-ai/DeepSeek-R1
https://epoch.ai/gradient-updates/how-has-deepseek-improved-the-transformer-architecture
DeepSeek was tested against other models and matched or beat them on most tasks.
beware, there will be lots of FUD either way
by chance, I can actually answer this for you with what appears to be an objective test:
https://www.nytimes.com/2025/01/23/technology/ai-test-humanitys-last-exam.html?searchResultPosition=1
https://www.theguardian.com/business/2025/jan/27/what-is-deepseek-and-why-did-us-tech-stocks-fall
There's no reason I can come up with to believe it didn't.
The claims are plausible in light of other performance.
Oh wait nvm
Help me better understand the pros and cons of A vs B. When is one or the other best used? What about each system allows for better use?
Deepseek was faster. OpenAI offered more information.
You connect the dots.
Spoiler: it’s a glowingly positive review.
https://www.wired.com/story/deepseek-china-model-ai/
- model is open source
- paper describes their methodology
- limits on GPUs
- they charge less
The price they paid for the research is the only thing we have to take their word on, from what I can tell.
https://github.com/deepseek-ai/DeepSeek-R1