datageek-pl.bsky.social
AI codegens / agents + applied AI / CS / tech insights 🎯 target shooter, sci-fi geek 🚀 techno-optimist
50 posts 130 followers 42 following

Concordia is a library for generative agent-based modeling that works like a table-top role-playing game. It's open source and model agnostic. Try it today! github.com/google-deepm...

Still cooking my aibricks project. Simple example of what I mean by "configuration driven":

The same people (lmarena.ai) who brought you Chatbot Arena are introducing WebDev Arena. Leaderboard: web.lmarena.ai/leaderboard

LEXICO defining the Pareto frontier of KV cache compression

Textual 1.0 has been released. 🥳 Three years in the making. A TUI framework that is bigger than the terminal. To celebrate, I want to give away some trade secrets. Because I am appalling at keeping secrets. Tell me what you think of the diagrams... textual.textualize.io/blog/2024/12...

1000x inference cost reduction by converting qwen/llama to rwkv architecture without retraining from scratch. Big if true/without-any-major-issues. Definitely worth checking. huggingface.co/recursal/QRW...

Q-RWKV-6 32B Instruct Preview substack.recursal.ai/p/q-rwkv-6-3...

developers.googleblog.com/en/the-next-...

In case you didn't know, Bluesky has built-in RSS support 👀 openrss.org/blog/bluesky...

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities arxiv.org/abs/2411.04986

entrapix: your LLM should raise a ConfusedAgentError when it doesn’t know. Why? Because as an application developer, I have a lot of things I can do to give the LLM more information. github.com/tkellogg/oll...
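A rough sketch of the idea, not entrapix's actual API: only the exception name ConfusedAgentError comes from the post, while `answer_with_fallback`, `ask`, and `fake_model` are hypothetical names made up for illustration. The application catches the error and retries with more context instead of accepting a guess.

```python
from typing import Callable, List


class ConfusedAgentError(Exception):
    """Raised when the model signals that it does not know how to proceed."""


def answer_with_fallback(
    ask: Callable[[str], str],   # hypothetical model call: prompt in, answer out
    question: str,
    extra_context: List[str],    # hints the application can add, one per retry
) -> str:
    context_so_far: List[str] = []
    for attempt in range(len(extra_context) + 1):
        prompt = "\n\n".join([*context_so_far, question])
        try:
            return ask(prompt)
        except ConfusedAgentError:
            # Out of hints: let the caller decide what to do next.
            if attempt == len(extra_context):
                raise
            # Otherwise add one more piece of context and try again.
            context_so_far.append(extra_context[attempt])
    raise AssertionError("unreachable")


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call; it is 'confused' without the policy text."""
    if "refund policy:" not in prompt:
        raise ConfusedAgentError("missing refund policy")
    return "Refunds are accepted within 30 days."


if __name__ == "__main__":
    print(answer_with_fallback(
        fake_model,
        "Can I get a refund?",
        ["refund policy: 30 days"],
    ))
```

The point of the design is that confusion becomes an ordinary exception, so the usual try/except and retry logic applies.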

Leaked Windsurf #codegen prompt: www.reddit.com/r/LocalLLaMA...

Please don’t mind any unfollow from my side - I’m experimenting with the atproto api.

This is what happiness looks like.

Now cooking the config layer and streaming. Middleware for streaming might require some trial and error but hey, that's the fun part! The results might be nice - imagine streaming line by line instead of by a random number of tokens - the latency is still low and you can easily act on the produced text!
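A minimal sketch of what line-by-line streaming could look like, assuming the upstream client yields arbitrary text fragments. `stream_lines` and the fake token list are hypothetical and not taken from the aibricks code.

```python
from typing import Iterable, Iterator


def stream_lines(chunks: Iterable[str]) -> Iterator[str]:
    """Re-chunk a stream of arbitrary text fragments into complete lines."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line that is currently in the buffer.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            yield line
    # Flush the tail once the stream ends (a last line without a newline).
    if buffer:
        yield buffer


if __name__ == "__main__":
    # Fragments split at random points, the way token streams usually arrive.
    fake_tokens = ["Hel", "lo wor", "ld\nsecond ", "line\nthi", "rd"]
    for line in stream_lines(fake_tokens):
        print(line)
```

Each line is yielded as soon as its newline arrives, so latency stays close to token streaming while the consumer always sees whole lines.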

This free 1h course is an excellent intro to MemGPT-like agent architecture. No frameworks - just doing fun things from scratch. I highly recommend this one even to people not interested in AI-powered games - agentic internals are very similar everywhere. www.deeplearning.ai/short-course...

This is crazy: run all generative AI workflows on a local GPU from one simple-to-use interface, with snap-ins for all the popular workflows. pinokio.computer #ai #aiTools #aiArtist

Twitter's algo has made us forget the value of links in short messages from people who share our interests. To say that I’m in tears would be an overstatement but I’m seriously touched.

New study shows LLMs outperform neuroscience experts at predicting experimental results in advance of experiments (86% vs 63% accuracy). They use a fine-tuned Mistral 7B but other models worked too. Suggests LLMs can integrate scientific knowledge at scale to support research.

My answer to AiSuite - focused even more on the GPU-poor. It's still in early development, but you can take a look here: github.com/mobarski/ai-... All feedback is more than welcome (comments, likes, github stars, reposts).

note to the gpu poor: you can still train

wow, fp8 training works well apparently pytorch.org/blog/trainin...

new Aider #codegen release

One of the best articles for understanding how open-source LLMs like Deepseek and Qwen are possible. "Deepseek: The Quiet Giant Leading China’s AI Race", an annotated translation of its CEO's deepest interview. @jordanschneider.bsky.social China AI context: www.chinatalk.media/p/deepseek-c...

The wave of reasoning models from the Chinese community has arrived 🌊 ✨ Marco-o1 by AIDC huggingface.co/AIDC-AI/Marc... ✨ QwQ by Alibaba huggingface.co/collections/... ✨ Skywork-o1 by Kunlun Tech huggingface.co/collections/...

smear campaign ideas against grammar-constrained LLM decoding: "structured generation is like aim assist for language models: only cheaters use it"

On-Chip Implementation of Backpropagation for Spiking Neural Networks on Neuromorphic Hardware #DL #AI #ML #DeepLearning #ArtificialIntelligence #MachineLearning #ComputerVision #AutonomousVehicles #Robotics #LLM #VLM #LVLM https://buff.ly/3COjDY4

This new LLM API adapter looks really nice == very similar to what I was doing in my project XD Less code to maintain == good news. github.com/andrewyng/ai...

Here’s a nice reddit post showing how quantization affects benchmarking results: www.reddit.com/r/LocalLLaMA...