rlesser.bsky.social
Engineer at Nomic AI | building tools for seeing in a high-dimensional world
15 posts 57 followers 76 following

What are some great datasets hosted on Hugging Face? We just added a way to quickly import, embed, and build interactive visualizations from them. Would love to get some hidden gems in. E.g. here are 650,000 newspaper articles from 1933 from LOC, extracted by Melissa Dell. atlas.nomic.ai/data/nomic/a...

every single time, congestion pricing becomes way more popular after it’s implemented. traffic sucks, but people refuse to believe it will go away until it actually does

This is a real loss, but my undisputed UWS GOAT Broadway Bagel on 101 remains very much alive. Long live the egg everything BEC

You know a side project is getting out of hand when you start making a settings page before anyone has actually used it

New blog post! Updated for 2024, my favorite example of why alphabetical ordering is bad for geographic features -- US presidential results since 1828. The left image shows regional patterns in a geographic ordering that the right (alphabetical) simply loses. benschmidt.org/post/2024-11...

What a beautiful win. The look on Ryan's face in that final moment made the whole season worth it!!!

Adventures in DuckDB + Hugging Face + ArrowJS - If you try to stream Arrow IPC out of HF with DuckDB, for some reason the batches come back in ~random order! ArrowJS explodes when this happens. Not even sure who's at fault here, but excited for this ecosystem to continue to mature #databs

Back in 2022 I published this post analyzing 15M tweets with Wordle results, with some fascinating results: observablehq.com/@rlesser/wor... The hardest part by far was gathering a huge dataset from a platform hostile to such analysis. Very excited that Bluesky encourages this type of exploration!

Great write-up on how Val Town built Townie, probably the best LLM coding experience I’ve used. A huge part of what makes it so nice is how little boilerplate/infrastructure the VT environment needs. Easier for people and easier for LLMs when everyone can focus on the business logic alone.

One of the most exciting things I’ve worked on at Nomic. There is huge untapped potential in taking people on a journey through a dataset, especially for text and image sets that are currently so hard to reason about

This is now a force-directed-graph-posting account, all other content posted is anomalous behavior