I think one of the reasons that I like Rust, and why I've been so reluctant to really use LLMs at all comes from a core tenet in building safety critical systems:
When it really matters, it is better for a tool to say "I don't know", give up, and to make it clear it is doing so, than to guess wrong
Comments
1) Common questions about popular tools. If google couldn't find it right away, taking 5 seconds to ask the LLM before going to Discord or SO can't hurt, so long as the answer is verifiable.
They have limited context. It's better to know when you can't rely on your tool, than to think it is good when it isn't.
It's not easy to push that in either direction.
I'd rather my compiler say "ambiguous, rejected", or my avionics say "Fault, inoperative", when they aren't sure.
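That "ambiguous, rejected" behavior is easy to see in Rust: when type inference can't pin down a single answer, rustc refuses to guess and makes you say what you meant. A minimal sketch (the values here are just illustrative):

```rust
fn main() {
    // Ambiguous: `parse` can produce many types, so rustc rejects this
    // outright rather than silently picking one:
    // let n = "42".parse();  // error[E0284]: type annotations needed

    // An explicit annotation resolves the ambiguity:
    let n: i32 = "42".parse().unwrap();
    println!("{}", n); // prints "42"
}
```

The compiler giving up loudly here is exactly the point: a wrong guess about the intended type would be far worse than an error message.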
Software development is, relatively speaking, much more relaxed about timing.
I know folks who work on diagnostics work VERY hard to avoid misleading ones.
It has generated working code, but when it doesn't, you have to fight it, and be very clear in your instructions.
Maybe LLMs work for other people, but I can't shake the creeping doubt in the back of my mind about the quality of what they're doing, and that doubt would drain my productivity.
That might just be me, but 🤷
I think a lot of the efficiency issues will be addressed in time.
Provenance of training data might get fixed at some point, I doubt any time soon though.
I think treating statistical tools the same as deterministic tools is always a mistake.
2) A library you’re not familiar with. I need to play around with running some locally. If I had one I could feed an example directory and docs to, and it just used those as its reference, that would be perfect.
I just worry about how they are marketed, and how people expect to use them in their workflows.
This limit was hit during the incident, and operators thought it had leveled off.
If you have some time to spare, I'd highly recommend watching this 3 1/2 hour explainer from Andrej Karpathy (filled with many details probably few know about modern LLMs and how they get built) ->