scottcondron.bsky.social
Working at wandb on Weave, helping teams ship AI applications
22 posts 284 followers 423 following

How do I get Bluesky to show me less politics and more AI/ML things? I have followed mostly people who work in AI/ML

Prompts within a complex system are brittle. I've seen some teams succeed by replacing prompts with smaller, more deterministic components and improving reliability with fine-tuning. Anyone else have success with this approach? Seems to help a lot with agents
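
For what it's worth, a toy sketch of what "smaller, more deterministic components" can look like in practice; the order-lookup route and the regex below are hypothetical, not from the original post:

```python
import re

# Hypothetical routing example: handle the well-defined cases with plain
# code and save the model call (ideally a small fine-tuned one) for the
# genuinely fuzzy remainder, instead of one catch-all prompt.
ORDER_ID = re.compile(r"order\s*#?(\d{6,})", re.IGNORECASE)

def route(message: str) -> dict:
    # Deterministic component: exact pattern match, no prompt involved.
    match = ORDER_ID.search(message)
    if match:
        return {"route": "order_lookup", "order_id": match.group(1)}
    # Everything else falls through to the model-backed intent classifier.
    return {"route": "intent_classifier", "input": message}

print(route("Where is order #123456?"))   # -> order_lookup, no LLM call
print(route("The box arrived damaged"))   # -> intent_classifier
```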

If you’re taking time to enjoy your family and not building with LLMs, you’re ngmi. America is cooked

LLM app dev broke our comparison tools because tiny diffs can cause large behaviour changes. At wandb, we've spent years thinking about experiment comparison, and we've added new tools for LLM app dev: code, prompts, models, configs, outputs, eval metrics, eval predictions, eval scores… wandb.me/weave

The art of referring to model behaviour with tasteful non-person metaphors. Say “stochastic” and you’re in one camp; say “emergent” and you’re in another. It’s a minefield out there, people

Being logged into wandb on your phone is a recipe for misery

Lessons from creating an llms.txt file: an llms.txt file is a way to tell an LLM about your website. In the .txt file, you include links to other files with info to learn more. The llms.txt file isn't the file you send to an LLM; you use it to generate an llms.md file
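
For context, a minimal illustrative llms.txt following the proposed format (a heading, a one-line summary, then sections of links to markdown files); the project name and URLs below are placeholders:

```markdown
# Example Project

> One-sentence summary of what the project does and who it's for.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): install and first run
- [API reference](https://example.com/docs/api.md): full API details

## Optional

- [Changelog](https://example.com/docs/changelog.md): release history
```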

Your human and LLM judges should follow the same criteria. Then, you can transition from manual to automated evaluation once you have inter-annotator agreement between LLM & human. You now have a faster iteration speed and the annotator can focus on finding edge cases!
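To make the inter-annotator agreement step concrete, here's a minimal sketch comparing human and LLM-judge labels on the same items with Cohen's kappa before handing routine scoring to the LLM; the labels and the threshold are made up for illustration:

```python
from collections import Counter

def cohens_kappa(human: list[str], llm: list[str]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    n = len(human)
    observed = sum(h == m for h, m in zip(human, llm)) / n
    human_freq, llm_freq = Counter(human), Counter(llm)
    expected = sum(
        (human_freq[label] / n) * (llm_freq[label] / n)
        for label in set(human) | set(llm)
    )
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

# Same rubric, same items: human labels vs. LLM-judge labels.
human_labels = ["pass", "fail", "pass", "pass", "fail"]
llm_labels   = ["pass", "fail", "pass", "fail", "fail"]

kappa = cohens_kappa(human_labels, llm_labels)
print(f"kappa = {kappa:.2f}")
if kappa >= 0.8:  # arbitrary example threshold
    print("Agreement looks good; switch to the LLM judge for routine runs.")
```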

Put glue on pizza

The most bizarre AI interview I've ever done was at wandb when as usual I asked a candidate to build an AI classifier in any language/framework of their choice.. And they nonchalantly said "I'll write it in Redstone", to which I almost let loose a chuckle until...

Claude defaults to concise responses when there's high demand; clever way to smooth peaks

We've been working on just that at @weightsbiases.bsky.social with Weave! Weave is a lightweight LLM tracing and evaluations toolkit that focuses on letting you iterate fast and make sure your production LLM-based application isn't degrading when you change prompts or models!
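
A minimal sketch of what that looks like with the Weave Python library; the project name, dataset, and scorer are placeholders, and the scorer signature may vary between Weave versions:

```python
import asyncio
import weave

weave.init("my-llm-app")  # placeholder project name; traces are logged here

@weave.op()  # calls to this function are traced: inputs, outputs, latency
def answer(question: str) -> str:
    # ...call your model / prompt of choice here...
    return "42"

# Tiny illustrative evaluation: rows are dicts, scorers grade each output.
examples = [{"question": "What is 6 x 7?", "expected": "42"}]

@weave.op()
def exact_match(expected: str, output: str) -> dict:
    return {"correct": expected == output}

evaluation = weave.Evaluation(dataset=examples, scorers=[exact_match])
asyncio.run(evaluation.evaluate(answer))
```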