hamel.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

hamel.bsky.social

Building something new 👉 http://nbdev.fast.ai Ex Github, Airbnb, DataRobot. ML / Data Tooling & OSS

102 posts 6,180 followers 654 following

Finally got around to trying Answer.ai's "nbsanity": Works beautifully on the first try! Even renders my interactive Plotly stuff! Just replace "github" in your notebook URL with "nbsanity", as in... nbsanity.com/static/3465a... More info from @hamel.bsky.social: www.answer.ai/posts/2024-1...

submitted 32 days ago • 0 comments
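
The URL swap described in the post above can be sketched in a few lines of Python. This is only an illustration: the helper name `to_nbsanity` and the example URL are hypothetical, and it assumes the swap means replacing the `github.com` host with `nbsanity.com` while leaving the rest of the URL untouched. Whether nbsanity accepts every GitHub path is not confirmed here.

```python
from urllib.parse import urlsplit


def to_nbsanity(url: str) -> str:
    """Swap the github.com host for nbsanity.com (hypothetical helper)."""
    parts = urlsplit(url)
    # Assumption: only plain github.com notebook URLs are handled here.
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {url}")
    # Keep scheme, path, query, and fragment; change only the host.
    return parts._replace(netloc="nbsanity.com").geturl()


# Hypothetical notebook URL, used only for illustration.
print(to_nbsanity("https://github.com/example-user/example-repo/blob/main/demo.ipynb"))
```

Replacing the host rather than doing a blind string substitution avoids rewriting a "github" that happens to appear in a user or repository name.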

In case you missed it, @hamel.bsky.social reviewed Devin. It succeeded on 3/20 assigned tasks.

submitted 51 days ago • 2 comments

Thoughts On A Month With Devin (the "AI software engineer") by @hamel.bsky.social "Out of 20 tasks we attempted, we saw 14 failures, 3 inconclusive results, and just 3 successes. More concerning was our inability to predict which tasks would succeed."

submitted 53 days ago • 0 comments

Enjoyed the systematic first hand reporting of their experience using Devin by @hamel.bsky.social and team. If you’ve worked with llm coding assistants, the results aren’t surprising, but it points to how far these models still need to go and should be worrying for how effective “agents” will be.

submitted 52 days ago • 1 comment

Thoughts On A Month With Devin by @hamel.bsky.social They decided to put it through its paces, testing it against a wide range of real-world tasks. This is their story - a thorough, real-world attempt to work with one of the most hyped AI products of 2024. www.answer.ai/posts/2025-0...

submitted 54 days ago • 3 comments

Four steps to use evals effectively in LLM applications (we haven't done the last one but are still getting great results): Eval Driven Development is the new TDD for LLM based applications. Without them, you're flying blind. #cto #llm #ai #tech #dev #genai

submitted 66 days ago • 1 comment

New LLM Eval Office Hours: I discuss the importance of doing error analysis before jumping into metrics and tests. Links to notes are in the YT description: youtu.be/ZEvXvyY17Ys?...

submitted 80 days ago • 1 comment

This is pretty damn nifty! @hamel.bsky.social @projectjupyter.bsky.social #datascience #jupyternotebooks www.answer.ai/posts/2024-1...

submitted 81 days ago • 2 comments

Our team at @specstory.com launched our very first product iteration today. What is it? An extension for @cursor_ai that allows you to save and share your composer and chat history. Give it a try at marketplace.visualstudio.com/items?itemNa... and let us know what you think!

submitted 85 days ago • 0 comments

Recorded my second office hours on LLM Evals. We talked about observability and how to prioritize writing tests in complex systems. Here are the notes: hamel.dev/notes/llm/of... Video: youtu.be/TZwmLXXFbh4?...

submitted 85 days ago • 2 comments

Running this notebook from @howard.fm hoping that it removes noise from my timeline nbsanity.com/static/0b3fd...

submitted 87 days ago • 3 comments

Make it easier to manually inspect your data! I built a small Shiny for Python web app as recommended by @hamel.bsky.social. I'm getting through my task much faster than previous iterations

submitted 89 days ago • 1 comment

I'm proud that we're going public with some positioning on what @honeycomb.io actually believes AI represents: A new, weird, and sometimes janky kind of virtual computer. Stay tuned for a lot more clear-headed posting on applied AI in the coming year. www.honeycomb.io/blog/observa...

submitted 90 days ago • 2 comments

Fantastic use of shot-scraper.datasette.io here to create social media cards for this new Jupyter Notebook rendering site: nbsanity.com

submitted 93 days ago • 1 comment

Super exciting update to nbsanity. I've incorporated @simonwillison.net's shot-scraper. Now, all new renders get a fancy social card! This makes nbsanity a nice microblogging utility. Examples of rendered notebooks: 1/3 nbsanity.com/static/6a987...

submitted 93 days ago • 1 comment

nbsanity now has a bookmarklet: nbsanity.com. It's a static server that renders public Jupyter notebooks with Quarto.

submitted 94 days ago • 2 comments

Are you frustrated by how GitHub renders Jupyter notebooks? I have a public service that renders GitHub notebooks with Quarto: nbsanity.com. It now works with gists!

submitted 94 days ago • 8 comments

I am holding open office hours on LLM Evals. I recorded the first one, which was about evaluating multi-turn chats. Notes and recording here: hamel.dev/notes/llm/of...

submitted 95 days ago • 3 comments