jsulz.com - Profile | ThreadSky | a Reddit-style client for Bluesky

jsulz.com

I like pretty things, functional things, funny things, food things, and computer things. Used to do devops/WordPress things @lexblog.bsky.social and devex/cloud infra things at @pantheon.io. Now helping make things go fast at 🤗 @hf.co

151 posts 217 followers 99 following

Posts 26 Comments 23

comment in response to post

Congratulations! Hope you're getting some sleep in here and there 😀

submitted 18 hours ago

comment in response to post

Love this line (and so true), "it is much easier to bring our compute to data than it is to bring our data to compute"

submitted 2 days ago

comment in response to post

Along with the rest of the crucial services of the internet, it is back up. But I'll never know how often she was chasing squirrels for those few hours.

submitted 8 days ago

comment in response to post

I'm not so interested in the other site for personal reasons. I'll continue to carve out a space here, but this is a reminder to find other spaces too. That said, if you're looking for a good way to stay in touch with the AI/ML community here, this post from @nsaphra.bsky.social is a good primer.

submitted 25 days ago

comment in response to post

To migrate your existing repos to Xet, sign up here: huggingface.co/join/xet
And we'll take care of the rest 🤗

submitted 25 days ago

comment in response to post

What are your criteria for a great slice of pizza?

submitted 27 days ago

comment in response to post

The mighty penguin knows no geographical boundaries.

submitted 36 days ago

comment in response to post

This sea seems warm for a penguin.

submitted 36 days ago

comment in response to post

And if you need them, simple installation instructions for hf-xet:

submitted 44 days ago

comment in response to post

And if you want those bytes to come faster and are on a high-performance machine, check out some of our latest releases at github.com/huggingface/...
TL;DR: export HF_XET_HIGH_PERFORMANCE=1

submitted 44 days ago
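
The TL;DR in the comment above is a shell `export`. A minimal sketch of doing the same thing from Python follows. The helper name `enable_xet_high_performance` is hypothetical, and the idea that the variable should be set before the download library is imported is an assumption, not something the post states.

```python
import os

def enable_xet_high_performance(env=None):
    """Set HF_XET_HIGH_PERFORMANCE=1 in the given mapping (defaults to os.environ)."""
    target = os.environ if env is None else env
    target["HF_XET_HIGH_PERFORMANCE"] = "1"
    return target

# Set the flag for the current process.
enable_xet_high_performance()
# Import and use the download library only after this point (assumed ordering).
```

Setting the variable in the process environment only affects this process and its children; the shell `export` from the post is the simpler choice for a whole terminal session.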

comment in response to post

And moving all these bytes is no joke. Our content-addressed-store (CAS) is doing a lot of hard work, hitting up to 150 Gb/s as we migrate repos from LFS to Xet.

submitted 52 days ago

comment in response to post

We've also updated our repo graph which shows how Xet-backed repos share bytes with each other. Here you can see how different versions of the Qwen, Llama, and Phi models are grouped together. Interactive graph here: huggingface.co/spaces/xet-t...

submitted 52 days ago

comment in response to post

You should have a few emails in your inbox 😀

submitted 57 days ago

comment in response to post

My wife sends this to anyone who loses a dog they were close with: margaretcho.com/2012/05/17/r... Dogs are our better halves. We are lucky to be in their presence. 🫂

submitted 63 days ago

comment in response to post

Will repost anything with doggos 🐶🐕❤️

submitted 72 days ago

comment in response to post

Come find BERT island. Or see how datasets relate in practice, and how model libraries or tasks can tie repos together. It's a byte-level map of the Hub. The result is a beautiful visualization from Saba Noorassa and @reverius42.bsky.social that I’ve already lost way too much time to.

submitted 73 days ago

comment in response to post

This means repos can share content, letting us draw a graph of which repos share data at the byte level where:
- Nodes = repositories
- Edges = shared chunks
- Edge thickness = how much they overlap

submitted 73 days ago
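
The graph described in the comment above (nodes are repositories, edges are shared chunks, thickness is overlap) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `build_repo_graph`, the toy repo names, and the use of a shared-chunk count as the edge weight are all hypothetical, and the post does not say how the real visualization measures overlap.

```python
from itertools import combinations

def build_repo_graph(repo_chunks):
    """Return {(repo_a, repo_b): shared_chunk_count} for every pair that shares chunks."""
    edges = {}
    # Compare every pair of repositories once, in a stable (sorted) order.
    for a, b in combinations(sorted(repo_chunks), 2):
        shared = repo_chunks[a] & repo_chunks[b]
        if shared:
            # Edge weight ("thickness") = number of chunks the two repos share.
            edges[(a, b)] = len(shared)
    return edges

# Hypothetical toy data: chunk hashes shortened to readable labels.
repo_chunks = {
    "model-base": {"c1", "c2", "c3", "c4"},
    "model-finetune": {"c1", "c2", "c3", "c9"},
    "unrelated-dataset": {"d1", "d2"},
}
edges = build_repo_graph(repo_chunks)
```

Here the two model repos end up connected by an edge of weight 3, while the dataset stays an isolated node. The pairwise comparison is quadratic in the number of repos, which is fine for a toy example but would need an inverted index (chunk to repos) at any real scale.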

comment in response to post

Not sure if you've read this article from @emollick.bsky.social but I think you would like it. "Experts can easily assess when an AI is useful for their work through trial and error, but an outsider often cannot."

submitted 73 days ago

comment in response to post

This graph shows requests per second (rps) to our content-addressed store (CAS) right as the release went live (h/t to @rajatarya.com for the screenshot).
yellow = GETs; dashed line = launch time.
I think it's pretty easy to spot when Xet started to send the first bytes to excited downloaders 👀

submitted 75 days ago

comment in response to post

If you go to any model in the collection, you'll see the Xet logo supporting the many TB of tensor files. Every request to download these files comes to our infrastructure.

submitted 75 days ago

comment in response to post

Thanks to the Meta team for launching on Xet!

submitted 76 days ago

comment in response to post

With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. Here are a few selected models and how they perform on Xet.

submitted 76 days ago
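
To make a figure like "~25% dedupe" concrete, here is a toy sketch of a content-addressed store where identical chunks are kept only once. It is not the method from the post, which does not describe how Xet chunks files or computes this number. The fixed 4-byte chunk size, the SHA-256 addressing, and the names `chunk` and `dedupe_ratio` are assumptions chosen for readability.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, hypothetical chunk size so the toy data below is readable

def chunk(data, size=CHUNK_SIZE):
    """Split bytes into fixed-size pieces (the last piece may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def dedupe_ratio(files):
    """Fraction of bytes that need not be stored because an identical chunk already exists."""
    store = {}  # content address (SHA-256 hex digest) -> chunk bytes
    total = 0
    for data in files:
        for piece in chunk(data):
            total += len(piece)
            # A chunk is stored only the first time its address is seen.
            store.setdefault(hashlib.sha256(piece).hexdigest(), piece)
    stored = sum(len(piece) for piece in store.values())
    return 0.0 if total == 0 else 1 - stored / total

# Two hypothetical "model versions" that differ only in their last chunk.
files = [b"AAAABBBBCCCCDDDD", b"AAAABBBBCCCCEEEE"]
ratio = dedupe_ratio(files)
```

With this toy input, 32 bytes are uploaded but only 20 unique bytes are stored, so the ratio is 0.375. Fixed-size chunks are the simplest choice for a sketch, but they lose all sharing as soon as bytes are inserted near the start of a file.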

comment in response to post

Streamline your efficiency with our proprietary municipal government machine learning model today. This feature evolves city services through predictive data analytics and is capable of doing things like reporting potholes before you even know they exist!

submitted 80 days ago