narphorium.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

Imagine trying to explain to a non-technical person that this has nothing to do with GitHub Copilot 🫤

submitted 2 days ago • 0 comments

MUTAGREP improves code generation efficiency by navigating repositories, breaking tasks into grounded plans, achieving notable performance with under 5% of context. This enhances accuracy and lets smaller models rival GPT-4o, marking a leap in AI programming. https://arxiv.org/abs/2502.15872

submitted 3 days ago • 0 comments

100 Best Coffee Shops in the World and the only one I've been to is Stumptown? I need to be more discerning about where I get my bean juice. theworlds100bestcoffeeshops.com/top-100-coff...

submitted 3 days ago • 0 comments

Neat visualization that came up in the ARBOR project: this shows DeepSeek "thinking" about a question, and color is the probability that, if it exited thinking, it would give the right answer. (Here yellow means correct.)

submitted 3 days ago • 4 comments

Claude Sonnet 3.7 with Claude Code has been a great help pairing with me on MCP SDK work. So stoked to see what people will build with it. www.anthropic.com/news/claude-...

submitted 4 days ago • 1 comment

Needed to display some code in my canvas and ended up setting up a full Rich Text editor with Tiptap inside @tldraw.com. Shapes as React components is a very powerful paradigm.

submitted 7 days ago • 0 comments

I really like this idea from the SWE-Lancer paper where they give the agent a tool which simulates a user clicking through the app to verify that it works correctly. arxiv.org/abs/2502.12115

submitted 8 days ago • 0 comments

Colorpad lets you analyze text by highlighting passages, defining what each color means, and then exporting to JSON so that it can be easily fed into an LLM as a prompt. open.substack.com/pub/shawnfro...

submitted 10 days ago • 1 comment
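
The Colorpad export step described above can be sketched roughly as follows. This is a minimal illustration, not Colorpad's actual format: the function name `export_highlights` and the JSON field names (`legend`, `highlights`, `meaning`) are assumptions made up for this example.

```python
import json


def export_highlights(legend, highlights):
    """Serialize a colour legend and highlighted passages to a JSON string.

    `legend` maps a colour name to its meaning; `highlights` is a list of
    {"text": ..., "color": ...} dicts. The schema is hypothetical.
    """
    unknown = {h["color"] for h in highlights} - set(legend)
    if unknown:
        raise ValueError(f"colours missing from legend: {sorted(unknown)}")
    payload = {
        "legend": legend,
        "highlights": [
            # Attach the meaning to each passage so the prompt is self-describing.
            {"text": h["text"], "color": h["color"], "meaning": legend[h["color"]]}
            for h in highlights
        ],
    }
    return json.dumps(payload, indent=2, ensure_ascii=False)
```

The resulting string can be pasted into a prompt as-is; repeating the meaning next to each passage saves the model from cross-referencing the legend.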

Love this idea of embedding R1 CoT to visualize its thought process. github.com/dhealy05/fra...

submitted 11 days ago • 1 comment

We are still looking for AI's #durablemutation. My latest blog post at: ideaspaces.net/unifinishede... @hrheingold.bsky.social @narphorium.bsky.social @edufuturist.bsky.social @hegland.bsky.social

submitted 11 days ago • 0 comments

Graphista: Dynamic Graph-Based LLM-Powered Memory System
It integrates LLM-powered ingestion and querying tools to enable dynamic knowledge management.
- ingest() will use read tools to dedupe nodes/edges, and add/update edges and nodes
- ask() will use read tools to find answers in the graph

submitted 12 days ago • 5 comments
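
The ingest()/ask() split in the Graphista post above might look something like this toy, in-memory sketch. It is not Graphista's code: the class `GraphMemory`, the triple input format, and the case-insensitive dedupe rule are all assumptions, and plain lookups stand in for the LLM-driven read tools the post describes.

```python
from dataclasses import dataclass, field


@dataclass
class GraphMemory:
    """Toy graph store; a stand-in for illustration only."""
    nodes: dict = field(default_factory=dict)  # normalized name -> attributes
    edges: dict = field(default_factory=dict)  # (src, relation, dst) -> attributes

    @staticmethod
    def _key(name):
        # Hypothetical dedupe rule: ignore case and extra whitespace.
        return " ".join(name.lower().split())

    def find_node(self, name):
        """Read tool: return the stored node for `name`, or None."""
        return self.nodes.get(self._key(name))

    def ingest(self, triples):
        """Add (src, relation, dst) triples, reusing nodes and edges that exist."""
        for src, relation, dst in triples:
            for name in (src, dst):
                if self.find_node(name) is None:  # read before write to dedupe
                    self.nodes[self._key(name)] = {"label": name}
            edge = (self._key(src), relation, self._key(dst))
            attrs = self.edges.setdefault(edge, {"count": 0})
            attrs["count"] += 1  # update the existing edge instead of duplicating

    def ask(self, name, relation):
        """Return labels of nodes linked from `name` by `relation`."""
        key = self._key(name)
        return sorted(
            self.nodes[dst]["label"]
            for (src, rel, dst) in self.edges
            if src == key and rel == relation
        )
```

Ingesting the same fact twice with different casing leaves one edge with a higher count, which is the "dedupe, then add/update" behaviour the post outlines.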

Forget Microsoftian, AI UI was getting downright Gordian. It's great to hear (if not yet see) that it's seemingly on the verge of some major simplification from an end-user perspective.

submitted 13 days ago • 0 comments

"Value creation will come from opening up new avenues for thinking about what AI applications could look like, what kinds of functionality could be useful, how users could interact with AI applications, and what kinds of business models could make sense." open.substack.com/pub/uncertai...

submitted 13 days ago • 0 comments

What if instead of searching the entire web you could just search your preferred pocket of it?

submitted 24 days ago • 2 comments

sharing an old exploration that's been top of mind recently
the future is...
✨ a shared artifact between humans and AI
✨ content at different levels of abstraction
✨ multimodal inputs and NL controls

submitted 24 days ago • 6 comments

We give reasoning models credit for showing their work, just like on school tests. But what if they're just working backward from memorized answers—like we did?

submitted 25 days ago • 0 comments

I used the new citations feature in the Anthropic API to identify a set of supporting facts for each thought in an R1 CoT. I'm surprised at how well it works.

submitted 26 days ago • 1 comment

Finally got this prototype to work after 1000 years of debugging. Sweet release 🥲 The idea is I dump a bunch of unstructured notes on a topic in, feed that into gpt-4o-mini, ask it to label sections with a set of types – claim, evidence, assumption, etc. – and display them as coloured highlights

submitted 28 days ago • 25 comments
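
The display half of the prototype described above (labelled sections shown as coloured highlights) could be sketched like this. The model call is not reproduced; the span format `(start, end, label)`, the colour table, and the function name `render_highlights` are assumptions for illustration.

```python
import html

# Hypothetical colour per label type; the post names claim, evidence, assumption.
COLOURS = {"claim": "#ffd54f", "evidence": "#81c784", "assumption": "#64b5f6"}


def render_highlights(text, spans):
    """Wrap labelled (start, end, label) character spans in coloured <mark> tags.

    Spans are assumed not to overlap; in the prototype they would come from
    the model's labelling step.
    """
    parts, cursor = [], 0
    for start, end, label in sorted(spans):
        parts.append(html.escape(text[cursor:start]))  # unlabelled text before the span
        colour = COLOURS.get(label, "#e0e0e0")  # grey for unknown labels
        parts.append(
            f'<mark style="background:{colour}" title="{html.escape(label)}">'
            f"{html.escape(text[start:end])}</mark>"
        )
        cursor = end
    parts.append(html.escape(text[cursor:]))  # trailing unlabelled text
    return "".join(parts)
```

Asking the model for character offsets tends to be brittle, so a variant that has it return quoted passages and locates them with `str.find` may be more robust in practice.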

What if reasoning models structured their thoughts as nested bullet points; branching and backtracking through different ways of solving a problem?

submitted 27 days ago • 1 comment

We need to rethink interfaces for reasoning models like R1. Instead of shoehorning reasoning to chat, what if we designed the entire experience around reasoning from the start?

submitted 28 days ago • 0 comments

One of my favorite new tips pairing with an LLM is having it document "Lessons Learned" in a file in the codebase itself. Actually helps me solidify what I learned in the process as well.

submitted 29 days ago • 0 comments

looking at community .cursorrules is an interesting view into what characteristics people value in their code. there's a lot of "avoiding classes" in ts/js. cursor.directory

submitted 31 days ago • 0 comments

My flight was so cold tonight that I opened my laptop and ran an R1 eval to stay warm

submitted 32 days ago • 0 comments

Enjoying this charming collection of stories about Ted Nelson and how his ideas have changed our world

submitted 33 days ago • 0 comments

This taxonomy of failure modes for collaborative agents is super helpful when designing new types of AI-powered tools

submitted 40 days ago • 1 comment

LM agents today primarily aim to automate tasks. Can we turn them into collaborative teammates? 🤖➕👤 Introducing Collaborative Gym (Co-Gym), a framework for enabling & evaluating human-agent collaboration! I've gotten used to agents proactively seeking confirmations or my deep thinking. (🧵 with video)

submitted 42 days ago • 1 comment

glazkov.com/2025/01/15/r...

submitted 43 days ago • 0 comments

"I love what I do and I get to work on stuff I want to work on. I wish everybody had that opportunity."—David Lynch

submitted 43 days ago • 2 comments