New blog post: why are we using LLMs as calculators? Mostly because we want to use all the software we write as calculators, and also because the end-goal is not calculators but "AGI". https://vickiboykis.com/2024/11/09/why-are-we-using-llms-as-calculators/
Comments
One person who isn't directly involved in ML/AI asked why an LLM can't do math well when computers are good at calculation. We went into tokenization, how models are trained, and how they work by sampling possible outputs
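The tokenization point is easy to show concretely. Here's a minimal sketch with a made-up toy vocabulary (not any real model's), using greedy longest-match splitting: numbers get chopped into uneven chunks, so the model never sees digits aligned by place value the way a grade-school algorithm needs.

```python
# Toy illustration: a hypothetical subword vocabulary and a greedy
# longest-match tokenizer. Real tokenizers are learned, but the effect
# on numbers is similar: uneven, arithmetic-unfriendly chunks.
TOY_VOCAB = {"123", "45", "12", "1", "2", "3", "4", "5", "+"}

def greedy_tokenize(text: str) -> list[str]:
    """Split text by repeatedly taking the longest prefix in the vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in TOY_VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return tokens

print(greedy_tokenize("12345"))   # ['123', '45']
print(greedy_tokenize("123+45"))  # ['123', '+', '45']
```

So "12345" becomes two opaque symbols rather than five digits, which is part of why next-token sampling struggles with carries and place value.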
Before I started programming, I inadvertently applied for a programming-heavy engineering role. They asked me to count the 0s and 1s in an arbitrary number, e.g. 10110. I was unfamiliar with binary or string conversion tricks at the time...
I've seen various LLM plugins that let you access a Python interpreter. If we just gave these programs access to a calculator app and prompted them to use it...
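A harness like that can be sketched in a few lines. Everything here is invented for illustration: the `CALC(...)` convention is not any real plugin's protocol, just one way a wrapper could spot a tool call in model output, evaluate it deterministically, and splice the result back in.

```python
import ast
import operator
import re

# Hypothetical tool-use loop: the "model" emits CALC(expr) markers,
# and the harness does the arithmetic instead of the model.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate +, -, *, / arithmetic via the AST, without eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer_with_calculator(model_output: str) -> str:
    # Replace each CALC(...) marker with the computed value.
    return re.sub(r"CALC\(([^)]*)\)",
                  lambda m: str(safe_eval(m.group(1))), model_output)

print(answer_with_calculator("The total is CALC(37 * 41)."))
# → "The total is 1517."
```

The model only has to decide *when* to reach for the tool; the exactness comes from the calculator, not from sampling.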
Is this cheating? Will it be indistinguishable from reasoning if we trivialize its ability to do basic math?
That sounds to me like a high bar to clear for a "lookup"-oriented process. Can't we just give the LLM access to basic calculation?
But LLMs don't distinguish between the two. "Reasoning" is honestly too broad a term to describe the problem.