folks, how would you fix this? - ThreadSky | a Reddit-style client for Bluesky

neuralnoise.com • 41 days ago

folks, how would you fix this?

Comments

thatsjustlikeyouropinionman.com•41 days ago

A bunch of supervised fine-tuning. But truly generalizing requires a different approach than just a language model.

neuralnoise.com•41 days ago

Like what? Serious question

thatsjustlikeyouropinionman.com•41 days ago

Some kind of actual concept model capable of reasoning through the abstract visual features, i.e. “the minute hand is three ticks past 2 so that must be thirteen past the hour.” Basically an AGI level task TBH.
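To make the arithmetic in that example concrete, here is a minimal sketch of the deduction step on its own, assuming the hand position has already been read as "N ticks past numeral K". The function name `minutes_from_ticks` is hypothetical and only illustrates the reasoning; it says nothing about how a model would extract the position from pixels.

```python
def minutes_from_ticks(numeral: int, ticks_past: int) -> int:
    """Minute value for a minute hand sitting `ticks_past` minute ticks
    beyond the hour numeral `numeral` on a standard 12-hour dial."""
    if not 1 <= numeral <= 12:
        raise ValueError("numeral must be between 1 and 12")
    if not 0 <= ticks_past < 5:
        raise ValueError("ticks_past must be between 0 and 4")
    # Each numeral marks 5 minutes; the 12 marks minute 0.
    return (numeral % 12) * 5 + ticks_past


# "three ticks past 2" -> 2 * 5 + 3 = 13 minutes past the hour
print(minutes_from_ticks(2, 3))
```

The hard part discussed here is the perception step that produces `numeral` and `ticks_past`, not this final bit of arithmetic.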

thatsjustlikeyouropinionman.com•41 days ago

you could cheat by showing it a bunch of examples of every possible time, but it’s not really understanding the concepts in that case, just pattern matching
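For a sense of scale, "every possible time" at minute resolution is only 12 × 60 = 720 dial positions. A rough sketch of enumerating them with their hand angles (the labels you would need before rendering any images) might look like this. The helper names are hypothetical, and angles are assumed to be measured clockwise from the 12 o'clock position.

```python
def hand_angles(hour: int, minute: int) -> tuple[float, float]:
    """Return (hour_hand_angle, minute_hand_angle) in degrees,
    measured clockwise from 12 o'clock."""
    minute_angle = 6.0 * minute                      # 360 degrees / 60 minutes
    hour_angle = 30.0 * (hour % 12) + 0.5 * minute   # 360 / 12 hours, plus drift
    return hour_angle, minute_angle


def all_clock_labels() -> list[tuple[tuple[int, int], tuple[float, float]]]:
    """Enumerate every (hour, minute) pair on a 12-hour dial with its angles."""
    return [((h, m), hand_angles(h, m))
            for h in range(1, 13)
            for m in range(60)]


labels = all_clock_labels()
print(len(labels))         # number of distinct hour:minute positions
print(hand_angles(2, 13))  # angles for 2:13
```

Covering all 720 positions for one dial style is cheap; the open question in this thread is whether that coverage transfers to unseen clock faces.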

thatsjustlikeyouropinionman.com•41 days ago

also, I think the current way vision adapters are hooked up to LLMs is inadequate. representing high-density visual features in a linear embedding is problematic.

neuralnoise.com•41 days ago

I’m wondering if the model can CoT its way out of this

thatsjustlikeyouropinionman.com•41 days ago

I don’t think the vision adapter is good enough, spatially.

ekazakos.bsky.social•40 days ago

People have worked on this before: https://arxiv.org/abs/2111.09162

obinnaokpolu.bsky.social•41 days ago

It's amazing how the second one gives its answer as a series of deductions where nothing in the chain follows from what comes before.
