aurelium.me
she/her
I have been to the future, and I don't want to scare you...
168 posts
303 followers
301 following
Getting Started
Active Commenter
comment in response to
post
the primary argument seems to be "my standard for 'generalizable reasoning' is that it should reliably execute any program given enough time to think about it" which seems like a definition that excludes humans from the "capable of reasoning" camp
comment in response to
post
i think it's a pretty strong claim to make from pretty weak data. it's literally just "we took a bunch of problems with super unfavorable time complexity and tasked reasoning models with executing pseudocode to solve them, and they dropped off to 0% accuracy around the thousand-step mark"
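for a sense of scale, here's a quick sketch of why "execute this pseudocode" tasks hit a step-count wall so fast. i'm assuming a Tower-of-Hanoi-style puzzle here (my example, not necessarily the paper's exact task), where the minimal number of moves grows exponentially with problem size:

```python
# minimal moves for an n-disk Tower of Hanoi: 2^n - 1
# (illustrative only -- the benchmark's actual tasks may differ)
def hanoi_moves(n: int) -> int:
    return 2**n - 1

for n in (5, 10, 15, 20):
    print(n, hanoi_moves(n))
```

note that 10 disks already lands right around the thousand-step mark where accuracy reportedly collapsed, and 20 disks is over a million steps.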
comment in response to
post
on some level i am sympathetic to her position, in the sense that i understand the impulse, but i find it Odd that she chooses to focus her ire so strongly on immigration specifically. surely if you wanted "tolerance for gay people" to be a prerequisite of being american we should start at home
comment in response to
post
imo, with current methods you could 100% create an LLM that Actually Understands that there is an outside world it can source information from and not just an uninterrupted stream of text it is roleplaying in. but doing that is way harder than "make model smarter at coding" so nobody does it
comment in response to
post
i think this is legitimately a big problem for LLMs but not necessarily an inevitable one. imo the problem is that all modern post-trained LLMs are sort of a hodgepodge of 2023-era "I'm ChatGPT, a friendly assistant" and modern RL stuff that lends itself poorly to grounding
comment in response to
post
when the primary policy thrust of the left w/r/t AI is "massively expand copyright in the vain hope of banning it forever" it is massively unprepared for the actual political ramifications of synthetic media and knowledge-work mechanization
comment in response to
post
imo the biggest failures of the modern left are when they fail to do Material Analysis in favor of some underbaked deontological argument. the AI stuff is a good microcosm of this, with material harms from things like deepfakes eschewed in favor of reactionary bullshit about "soul"
comment in response to
post
i guess the way i use it isn't totally consistent with its use in philosophy, where "idealism" is usually thought of as a philosophy focused on human minds as the most basal thing, while materialists see the outside world as the most basal thing, with human minds existing within it
comment in response to
post
tbh i think the reason it hits so many roadblocks is that conceptually the project of leftism is a materialist one but practically most people approach it as an idealist one
comment in response to
post
i feel like the anti-AI people are really refusing to think of the effects of requiring licensing for the use of text in this manner. because like, obviously this wouldn't actually stop commercial LLMs from being developed. they would license 20T tokens and stop. it'd just ban open research forever
comment in response to
post
yeah. while i am largely unhappy with the reactionary anti-tech zeitgeist on here, it's not as if it's impossible to find people who aren't like that. there are plenty of people.
every social media website sucks when you first sign up, you have to Follow People and Curate Stuff
comment in response to
post
yeah, they're not really "mobile" so much as "portable". it'd have to be pocketable, which I don't think is really practical for a controller-based game console. the 3DS (before they made them huge) was the best anyone did imo
comment in response to
post
i don't have numbers to prove this but i would imagine the average AI datacenter gets more of its energy from zero-emissions sources than the average home or commercial building, purely bc if you want a place with cheap, abundant energy it's gonna be solar or nuclear
comment in response to
post
at this point i get the impression that this is not true. model sizes haven't really moved much since 2024. "Orion"/GPT-4.5 was the final nail in the coffin for increasing parameter count. outside of that frontier models have mostly been getting smaller
comment in response to
post
i mean i guess it's coherent, in the sense that if you value LLMs at 0 then all LLM inference is using more power than it's worth, but it's not actually A Problem in the sense that banning LLMs globally would have a basically immeasurable effect on CO2 output
comment in response to
post
the claim is not "inevitably, AI will use more power" but "soon our product will be so useful that we will commit all of our resources to using it". if you don't believe that's going to happen obviously the power usage will not 20x
comment in response to
post
a lot of it is also founded on uncritically believing hype from AI companies. OpenAI people sometimes say stuff like "AI electricity use will 20x in the next 10 years" and people will be like "we need to stop this now" not remembering that the same person believes AI will be like a god in 10 years
comment in response to
post
they're gonna lose it when they hear about Prime Intellect
comment in response to
post
so maybe OpenAI is fucked if improvements stall out, but whoever buys out ChatGPT after that could cut down hard on R&D and invest in efficiency to make a pretty tidy profit by most estimates
comment in response to
post
hard to say exactly but all of these labs are deep in the red if you include R&D. they tell investors openly that they only expect a payoff if capabilities spike to the stratosphere and make them infinite money
that being said, you could replicate DeepSeek V3 today for <$20M total probably
comment in response to
post
this may have been true a while ago but mostly isn't today. the best report we have on efficiency for LLM inference is the DeepSeek V3 infrastructure report, where they turn a >400% profit selling their frontier-level model at around $3/million words of output
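back-of-envelope version of that margin claim. the ~$3/million-words price is from the report as i remember it; the serving-cost number below is purely hypothetical, just to show how the percentage works out:

```python
# selling price per million words of output (approx, from the DeepSeek report)
price_per_million_words = 3.00
# hypothetical serving cost per million words -- made up for illustration
cost_per_million_words = 0.50

# profit margin relative to cost
margin = (price_per_million_words - cost_per_million_words) / cost_per_million_words
print(f"{margin:.0%}")  # 500% with these assumed numbers
```

the point being: at frontier-lab prices, even a fairly pessimistic cost estimate leaves inference itself deep in the black.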
comment in response to
post
the highest-utility thing I can do is convince people to have a consistent deontological code of ethics
comment in response to
post
"what is the deep fates program" brother you're living it
comment in response to
post
my only guess is ridiculous top-down bugbears like "no LiDAR" extending really far into "robotaxi" as a product. entirely possible they're forgoing the detailed live-updating 3D scans that make Waymo and the like possible
comment in response to
post
even with a base level of incompetence it's kind of shocking to me that Tesla can't even manage third place in the "driverless taxi" race
comment in response to
post
anything could happen over the course of 12-36 months, during which time it slowly becomes apparent that a new technology is important and generalizes well as increasingly ambitious experiments repeatedly come back successful and its exact strengths and weaknesses are worked out
comment in response to
post
i would be more sympathetic to this if the number of macro-scale innovations in the LLM space contributing to current capabilities was "several" rather than "like, two".
1. parameter scaling goes up to about 600B-1000B before it stops making sense
2. reinforcement learning works pretty well
comment in response to
post
the sample efficiency of modern RL is such that a motivated person could train a classifier on their personal preferences in an AI and have their own for well under $100. if small models get good enough or fast RAM gets cheap enough this might even be practical
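toy illustration of how cheap the mechanics of a personal preference classifier are. this is a bag-of-words perceptron on made-up data, not RL on an LLM -- the real version would sit on top of model embeddings -- but it shows the shape of the thing:

```python
# tiny preference classifier sketch: all data and vocab here are made up
def featurize(text: str, vocab: list[str]) -> list[int]:
    words = text.lower().split()
    return [words.count(w) for w in vocab]

VOCAB = ["thoughtful", "spam", "hype", "interesting"]

# (text, label) pairs: 1 = "i want to see this", 0 = "i don't"
DATA = [
    ("thoughtful interesting post", 1),
    ("pure hype spam", 0),
    ("interesting thoughtful thread", 1),
    ("spam hype spam", 0),
]

weights = [0.0] * len(VOCAB)
bias = 0.0
for _ in range(10):  # a few perceptron passes over the toy data
    for text, label in DATA:
        x = featurize(text, VOCAB)
        pred = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
        err = label - pred
        weights = [w + err * xi for w, xi in zip(weights, x)]
        bias += err

def score(text: str) -> int:
    x = featurize(text, VOCAB)
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
```

obviously the expensive part in practice is the model underneath, not this layer -- which is the whole point of the cost claim.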
comment in response to
post
i wish i had finished formally writing out my thoughts on this 4 months ago bc i'd look kinda prescient now - big labs are releasing increasingly task-tuned models as RL generalization proves somewhat domain-specific
i just hope my other prediction, "let a million micro-labs bloom", also comes true
comment in response to
post
my take is that Reinforcement Learning Works, in that you can make current-day LLMs really good at basically any task if you can find enough problems and create a robust enough reward signal
but the dream of a generalist model being Smart Enough to do these things emergently is far off
comment in response to
post
at least part of this is because stable diffusion fell off. i don't really follow image models that much but I think the good stuff is to be found outside of Stability's lineup
comment in response to
post
i think an article titled "why are people still on twitter" should at least try to form a descriptive answer to that question rather than attempt to develop it from first principles
comment in response to
post
idk how you spend any time on bluesky and think "the reason there aren't as many people here is because there's no culture war"
comment in response to
post
i hope "good". i accidentally turned my friend into a chaser. she had a Thing for a trans woman and i teased her by saying "just imagine urself doing her e shot"
she reacted like "it is immoral for me to be into that" and then the next week she said "this permanently changed my brain chemistry"
comment in response to
post
the Spamton one tells you how to get the right choices (don't give him your money)
comment in response to
post
even Nintendo, who released probably their most composable game yet in Tears of the Kingdom, tried really hard to cap the amount of creativity possible in the game and make the whole world feel really ephemeral. felt like a missed opportunity
comment in response to
post
Minecraft is the best-selling game ever mostly on the back of having endless composability. factorio is pretty similar, although it's a lot more tuned for instrumental play compared to minecraft. game publishers seem terrified to try to replicate it for some reason! i wish they weren't
comment in response to
post
most importantly it looks like it could be a really good teacher model for distills. i'm of the opinion that a good, full-pretrain-level (5T+ token) distill of this model down to a 70b dense model would beat just about everything else in its size class
comment in response to
post
in general it'd be cool if you could Do Things within your webview but that runs the risk of becoming the new HTML/CSS/JS, a standard designed for basic documents and styling that metastasized into a standard for developing full applications with all of the initial compromises weighing it down
comment in response to
post
it's an extremely half-baked concept but maybe you could have some kind of schema for necessary data to complete an app-subscribe action that gets added as metadata? (I'm not sure if you can even add metadata to that). And I guess webviews could parse this with basic data validation in a form
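to make the half-baked concept slightly more concrete: something like a tiny declared schema for the data the action needs, plus the basic validation a webview could run before rendering a form. every field name here is hypothetical, this is just the shape of the idea:

```python
# hypothetical schema for an app-subscribe action's required metadata
SUBSCRIBE_SCHEMA = {
    "plan_id": str,
    "price_usd": float,
    "billing_period": str,
}

def validate_subscribe(payload: dict) -> list[str]:
    """Return a list of validation errors (empty list means the payload is ok)."""
    errors = []
    for field, typ in SUBSCRIBE_SCHEMA.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], typ):
            errors.append(f"{field} should be {typ.__name__}")
    return errors

print(validate_subscribe({"plan_id": "pro", "price_usd": 5.0}))
# -> ['missing field: billing_period']
```

a real version would presumably live in a lexicon-style definition rather than ad-hoc python, but the validation step is the same idea.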
comment in response to
post
if you follow this rabbit hole too far you basically end up implementing OAuth-via-atproto which sounds overcomplicated but also like something i wish existed