sir-deenicus.bsky.social
tinkering on intelligence amplification. there are only memories of stories; formed into the right shape, the stories can talk back.
140 posts 57 followers 62 following
Active Commenter
comment in response to post
Seems to me "downstream of" communicates an intent for a stage between "correlated with" and "caused by".
comment in response to post
Poker is a zero-sum 2-player game--no cooperation. This makes it easy in comparison, to the point that using a lookup table to play is the basic approach for fixed-limit. Generally, poker can't be analogized from, because the unique constraints of poker are what make CFR a viable approach.
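For a concrete feel of why poker's structure helps, here's a minimal regret-matching sketch (the update at the heart of CFR), with a made-up payoff vector standing in for the counterfactual values a real game-tree traversal would supply:

```python
import numpy as np

def regret_matching_strategy(cum_regret):
    # Strategy proportional to positive cumulative regret -- the core of CFR.
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))  # uniform fallback

# Toy loop for one information set with 3 actions. In real CFR the action
# values come from traversing the game tree (counterfactual values); here a
# fixed, invented payoff vector stands in for them.
cum_regret = np.zeros(3)
action_values = np.array([1.0, 0.2, -0.5])
for _ in range(1000):
    strategy = regret_matching_strategy(cum_regret)
    node_value = strategy @ action_values
    cum_regret += action_values - node_value  # accumulate per-action regret
print(regret_matching_strategy(cum_regret))   # concentrates on the best action
```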
comment in response to post
It's extremely difficult to do correctly. Many inference-under-uncertainty problems are NP-hard; even approximation is NP-hard when there are lots of complex dependencies being inferred/reasoned over.
comment in response to post
Is it within the realm of possibility that this might someday find its way into Cities: Skylines as a mod?
comment in response to post
Speaking of tangent bundles and walking, some games implicitly leverage/generate them in their approach to ensuring walking on non-flat surfaces is less janky/bouncy/jerky.
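A hedged sketch of the idea (one common character-controller trick; games vary): project the desired movement vector onto the tangent plane at the contact point, i.e. drop the component along the surface normal:

```python
import numpy as np

def slide_along_surface(velocity, surface_normal):
    # Project velocity onto the surface's tangent plane by removing its
    # component along the normal, so the character hugs slopes instead of
    # bouncing off them.
    n = surface_normal / np.linalg.norm(surface_normal)
    return velocity - np.dot(velocity, n) * n

# Walking "forward" into a 45-degree slope: motion gets tilted to follow it.
v = np.array([1.0, 0.0, 0.0])
n = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2)   # slope normal
print(slide_along_surface(v, n))               # -> [0.5, 0.5, 0.0]
```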
comment in response to post
------------ 221506678275824895551478330339758539061189773737774336433975313280763953351631546560459649255038791485329587902144485
comment in response to post
To me, the fact that 13B had the highest faithfulness means this is (perhaps in large part) an artifact of how easy the tasks were. At some loss threshold the model properly gains CoT ability and then relies on it according to task difficulty. Hence smaller and larger models being unreliable and unfaithful, respectively.
comment in response to post
Contra: it'd be in some human language, because those are the vectors that have the richest interactions/relations. While CoT is not completely faithful, it's strongly correlated--kinda like shorthand and key notes. It has to be 100% faithful for computations rendered in context and for tool use, however.
comment in response to post
∙⟨λφ.φ∙⟨τ⟩∙⟨τ⟩⟩⟩⟩⟩⟩⟩⟩⟩⟩⟩⟩⟩⟩⟩
comment in response to post
Waterfalls can indeed freeze. This is probably real but with forced perspective (camera really close to the ice formation while the person is quite far resulting in an optical illusion that the frozen waterfall is colossally large). Searching suggests it's the Goriuda Waterfall.
comment in response to post
I daresay even the cadence of .NET libraries themselves is slower. I think it's because LLMs occupy such a huge chunk of the field's attention. (One upside of LLMs is that they help smaller communities, because each individual is enhanced to do more, more easily.)
comment in response to post
Right, if you look at Scala, similar things are being said there too. And if F# is dead then what is OCaml? Even Haskell, Racket, Elm, Clojure--none of them hold the mindshare they used to. Which is fine IMO. People are still building interesting things in them, just fewer corporates.
comment in response to post
And for emphasis: A human mind being computable means that in principle, it too could run as a computer program on a computer. An authentic such simulation would also be bad at calculation. Similarly in LLMs, running on a computer doesn't mean the mind-like thing will find all computer things easy.
comment in response to post
The handicapping of Claude in Copilot is no doubt a side-effect of the hamfisted, crude and clumsy prompt instructions given to it to discourage and dissuade users from having random conversations with it. Naturally, there's a way around that though.
comment in response to post
A neural net is complexity-bound; code, data, and execution environment rolled into one. If we wish to modify the model or otherwise adapt it, having the weights is what is important. The simple fact is that the concepts of open source do not transfer cleanly, no matter how much people try to twist them to do so.
comment in response to post
That person does not know what they're talking about, alas, but it's a common misconception. The neural net is in fact the code and data; its bulk is exactly a program written as a set of arithmetic expressions joined by some if-then comparisons to zero. That source was never human-comprehensible.
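To make that claim concrete, a toy illustration (weights invented here; real models are just billions of lines of this):

```python
def relu(x):          # "if x > 0 then x else 0" -- the comparison to zero
    return x if x > 0 else 0.0

def tiny_net(x1, x2):
    # Layer 1: two neurons, each a weighted sum followed by the zero test.
    h1 = relu(0.7 * x1 - 1.2 * x2 + 0.1)
    h2 = relu(-0.3 * x1 + 0.8 * x2 - 0.5)
    # Output: another weighted sum. The whole "program" is arithmetic
    # expressions joined by if-then comparisons to zero.
    return 1.5 * h1 - 0.9 * h2 + 0.2

print(tiny_net(1.0, 2.0))
```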
comment in response to post
No, not quite on Claude. There's a 90% (or some other high) chance that free Claude users are told that Anthropic is currently under high load, so please use Haiku 3.5 in concise mode. Which is pretty bad.
comment in response to post
It most certainly is not. My experience is that it is decidedly worse. Note that I am not saying that I am right, but that your objective tone is wrong.
comment in response to post
The tokens themselves are like temporary auxiliary states, anchor points, like notes on scratch paper to constrain and control subsequent generation but not the complete state for the "thoughts" themselves.
comment in response to post
A theoretical possibility is that the model also does not count output reasoning tokens as part of its internal processing. If we think about it, compared to the cached attentional hidden states (non-lingual), the notes we see make up but a small fraction of the information processed per token.
comment in response to post
I think I know what's happening. The most mundane aspect is with each of your turns, the previous turn's thinking tokens are not passed to the model, for space saving reasons. So it is correctly reporting a lack of observable thinking tokens for it *at that point*.
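A hedged sketch of that context-assembly behavior (names invented; not any provider's actual code): earlier turns keep their visible replies but drop their thinking blocks, so only the current turn's reasoning is ever in context.

```python
def build_context(turns):
    # Assemble the prompt: strip thinking blocks from all but the last turn,
    # a common space-saving convention in multi-turn reasoning chat.
    context = []
    for i, turn in enumerate(turns):
        msg = {"role": turn["role"], "content": turn["content"]}
        if turn.get("thinking") and i == len(turns) - 1:
            msg["thinking"] = turn["thinking"]  # only the latest turn keeps it
        context.append(msg)
    return context

history = [
    {"role": "user", "content": "Q1"},
    {"role": "assistant", "content": "A1", "thinking": "step by step..."},
    {"role": "user", "content": "Q2: can you see your earlier thinking?"},
]
print(build_context(history))  # A1's thinking is gone, as the model reports
```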
comment in response to post
I don't like that framing. Doesn't it seem patronizing? It's painting those that are neither terrible nor exceptional as weak. We should like to live in a world that makes it so as many as possible are comfortable enough to readily do the right thing. Seems anger is turned in the wrong direction?
comment in response to post
It's not necessarily that the model is not smart enough--curious what you get when you try with QwQ. Evidence so far is that distillation does not result in authentic reasoning capability.
comment in response to post
Interesting. I read the webcomic years ago, it was fun but a typical power(growth) fantasy. I haven't seen either season--what are you enjoying about the anime?
comment in response to post
The oddest thing to me is the idea that dynamical and computational systems are somehow separate. But all computational systems (esp ones that are not deterministic) are also subsets of dynamical systems--namely those that always work with finite information and precision (time steps, divisibility)
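As a trivial hedged illustration of the inclusion: any program over finite state is literally a discrete dynamical system, a state plus a transition map iterated in time steps.

```python
def step(state):
    # The transition function f: S -> S of a 3-state counter -- any
    # finite-state program can be recast as such a map.
    return (state + 1) % 3

trajectory = [0]
for _ in range(6):
    trajectory.append(step(trajectory[-1]))
print(trajectory)  # [0, 1, 2, 0, 1, 2, 0] -- a periodic orbit
```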
comment in response to post
The focus on such a sparse MoE, and the tweaks that differ from the typical MoE, show a clear hint of experimentation among the choices that led there.
comment in response to post
What does that mean exactly? These are clearly deliberate choices made for a compute-constrained environment. Proving out mixed-precision fp8 training in a production environment is by itself a huge deal. The load-balancing innovations and the approach to cheap attention. These are all precisely done and clever.
comment in response to post
There's a paper you can all check yourself if you wish to disabuse yourself of the misinformation you've just been fed by someone who should really have known better. Disappointing to see, really. I wonder how many will first think "but who are you anyway to say this?" instead of "are they right?"
comment in response to post
They also performed specializing optimizations to the SW environment. Together these are enough to make it an OOM more energy-efficient to run vs Llama 405B, as long as you have sufficient memory. As for aspects affecting training, you can ballpark the math; the costs seem on the lower end of reasonable.
comment in response to post
Reading the paper instead of guessing from a position of ignorance is better, no? Here are innovations that make it significantly more efficient:
- mixed-precision fp8 training
- MoE with a unique arch and high sparsity (only ~40B active params)
- load-balancing innovations
- low-rank decomposed attention (sketched below)
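A hedged sketch of the low-rank decomposition idea in that last item (shapes and names invented, not the paper's exact scheme): cache a small latent per token and re-expand K/V on demand, shrinking the KV cache.

```python
import numpy as np

d_model, d_latent = 1024, 128          # latent is ~8x smaller to cache
rng = np.random.default_rng(0)
W_down = rng.normal(size=(d_model, d_latent)) / np.sqrt(d_model)
W_up_k = rng.normal(size=(d_latent, d_model)) / np.sqrt(d_latent)
W_up_v = rng.normal(size=(d_latent, d_model)) / np.sqrt(d_latent)

x = rng.normal(size=(16, d_model))     # 16 cached token positions

latent_cache = x @ W_down              # all the KV cache needs to store
K = latent_cache @ W_up_k              # expanded on demand at attention time
V = latent_cache @ W_up_v
print(latent_cache.shape, K.shape, V.shape)  # (16,128) (16,1024) (16,1024)
```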
comment in response to post
Why should I trust it less than Microsoft or OpenAI? (Which, to be clear, is saying you should not trust any of them! Whatever you send to any of them should be something you don't mind them gaining access to.)
comment in response to post
The more such trivia it has fairly-to-perfectly accurate knowledge of, the more confident we can get about the expansiveness of its input data. Similarly, looking at its low-rank attention, the fact that they used mixed-precision fp8 and MoEs, and innovated on load balancing--the number's plausibility is not low.
comment in response to post
You can do better than that, FWIW. You can probe it with long-tail questions to get an idea of how diverse and extensive its training data was. For example, when I ask it questions about the Anama of the Isle of Bigail, from an obscure indie RPG, it knows about it!
comment in response to post
Hmm, but prompting skill will remain significant. See, it's not due only to a failing of the model but to a difference too. As LLM and human minds aren't identical, there'll remain a major need to know how to phrase things for LLMs specifically, even with question-asking skill overlapping non-trivially.
comment in response to post
I guess Rogue Trader is close, but WH40K is basically a fantasy setting--although, if Star Wars counts as sci-fi, then WH40K can too? Colony Ship comes closest maybe; the space aspect is background only (it does shape/frame the story, tbf). Games with space-heavy aspects and some kind of story are most often strategy games.
comment in response to post
A rare combo. Everything I can think of misses an important component. Outer *Wilds*, Rogue Trader, Children of a Dead Earth, Terra Invicta, Cyberpunk 2077, Deus Ex: Human Revolution. Each is either a hard sci-fi (or close) RPG but not space, a hard sci-fi space game but not an RPG, or a plain sci-fi RPG, etc.
comment in response to post
This is why I mentioned the half-life of papers being terrible overall. Being peer reviewed is almost no signal; for some fields, it's no signal at all. I like to think most humans are good. So the problem here is the incentive structure that leads to this kind of sickness and to such abuses.
comment in response to post
I wasn't talking about what is better or worse; I meant the kind of people who have always engaged in scummy tactics are adapting to new technology. Saying it's a few papers also massively understates the problem. Abuse of stats and failures of replicability are major issues. Not all of it has to be flagrant.
comment in response to post
The main error here is in not placing an upper bound *on how many humans*. Being more capable than any given human at most or all intellectual tasks does not equate to being better than elite teams and groups across all tasks. The second issue is the expertise needed to initiate, guide, and verify the jobs.
comment in response to post
Because there is more total creativity and skill outside a company, no matter its size, than inside it. No matter how elite.
comment in response to post
A very happy unbirthday to you, then!
comment in response to post
The review is by Cosma Shalizi. Genetic algorithms (for rule discovery) and the bucket brigade algorithm (for local credit assignment) were key in Holland's PI. --- I believe RTRL for RNNs should also be relevant.
comment in response to post
Decision transformers are an LLM-adjacent instance of control as inference. --- For other approaches, SOAR was mentioned; ACT-R is also relevant, and PI (adaptive, message-passing, inductive rule system) predates both. There's a really enjoyable book on it, Induction; review: bactra.org/reviews/hhnt...
comment in response to post
What I mean about your definition is that it is so open it'd be meaningful only to someone who already knew the actual definition. One can also talk about strategies as functions from information sets (i.e. "state") to actions. I don't know why RL experts like to forcefully subsume everything into RL.
comment in response to post
That includes so many things as to be a non-definition. Besides, information sets (non-identifiability as a matter of indistinguishability) and POMDP states (uncertainty from incomplete observations of state) are not 1-to-1.
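A hedged, type-level sketch of the contrast (names invented): a POMDP policy conditions on a belief over hidden states, while a behavioral strategy must assign the same action distribution to every history inside an information set.

```python
from typing import Callable, Dict, FrozenSet, Tuple

State = str
Action = str
History = Tuple[Action, ...]
Belief = Dict[State, float]              # probabilities over hidden states
InfoSet = FrozenSet[History]             # histories the player can't tell apart

Policy = Callable[[Belief], Action]                   # POMDP-style
Strategy = Callable[[InfoSet], Dict[Action, float]]   # game-theoretic

def example_strategy(info: InfoSet) -> Dict[Action, float]:
    # One randomization for the whole information set: by construction the
    # player cannot condition on which history they are actually at.
    return {"call": 0.6, "fold": 0.4}
```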
comment in response to post
Mmm, that's not quite what I'm referring to. I'm pushing back against the idea that policies (the technical term, not some vague, nebulous umbrella term) are the only way to characterize action selection at some decision point.
comment in response to post
The control-as-inference setting is arguably more natural than RL. One can comfortably situate RL, active inference, and control theory within this framework. And--I think--there is an argument to be made that the parts of RL and control that do not quite map are also unnatural.
comment in response to post
It depends on the action space, or on what is meant by agent. For certain actors in game-like settings, say, the concept of a strategy (profile) is more natural and is not quite the same as a policy.