jsulz.com
I like pretty things, functional things, funny things, food things, and computer things.
Used to do devops/WordPress things @lexblog.bsky.social and devex/cloud infra things at @pantheon.io
Now helping make things go fast at @hf.co
151 posts
217 followers
99 following
Regular Contributor
Active Commenter
comment in response to
post
Congratulations!
Hope you're getting some sleep in here and there
comment in response to
post
Love this line (and so true), "it is much easier to bring our compute to data than it is to bring our data to compute"
comment in response to
post
Along with the rest of the crucial services of the internet, it is back up.
But I'll never know how often she was chasing squirrels for those few hours.
comment in response to
post
I'm not so interested in the other site for personal reasons.
I'll continue to carve out a space here, but this is a reminder to find other spaces too.
That said, if you're looking for a good way to stay in touch with the AI/ML community here, this post from @nsaphra.bsky.social is a good primer
comment in response to
post
To migrate your existing repos to Xet, sign up here huggingface.co/join/xet
And we'll take care of the rest
comment in response to
post
What are your criteria for a great slice of pizza?
comment in response to
post
The mighty penguin knows no geographical boundaries.
comment in response to
post
This sea seems warm for a penguin.
comment in response to
post
And if you need them, simple installation instructions for hf-xet:
comment in response to
post
And if you want those bytes to come faster and are on a high-performance machine, check out some of our latest releases at github.com/huggingface/...
TL;DR: export HF_XET_HIGH_PERFORMANCE=1
comment in response to
post
And moving all these bytes is no joke. Our content-addressed store (CAS) is doing a lot of hard work, hitting up to 150 Gb/s as we migrate repos from LFS to Xet.
comment in response to
post
We've also updated our repo graph which shows how Xet-backed repos share bytes with each other.
Here you can see how different versions of the Qwen, Llama, and Phi models are grouped together.
Interactive graph here: huggingface.co/spaces/xet-t...
comment in response to
post
You should have a few emails in your inbox
comment in response to
post
My wife sends this to anyone who loses a dog they were close with: margaretcho.com/2012/05/17/r...
Dogs are our better halves. We are lucky to be in their presence.
comment in response to
post
Will repost anything with doggos
comment in response to
post
Come find BERT island. Or see how datasets relate in practice, and how model libraries or tasks can tie repos together.
It's a byte-level map of the Hub.
The result is a beautiful visualization from Saba Noorassa and @reverius42.bsky.social that I've already lost way too much time to.
comment in response to
post
This means repos can share content, letting us draw a graph of which repos share data at the byte level where:
- Nodes = repositories
- Edges = shared chunks
- Edge thickness = how much they overlap
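A minimal sketch of how such a graph could be assembled, assuming each repo is represented as a set of chunk hashes. The repo names, chunk IDs, and the `build_share_graph` helper are hypothetical and for illustration only; this is not how the actual Xet visualization is built.

```python
from itertools import combinations


def build_share_graph(repo_chunks):
    """Map each pair of repos that share chunks to the number of shared chunks.

    Nodes are the repo names; an edge exists only when two repos have at
    least one chunk in common, and its weight (the "thickness") is the
    size of that overlap.
    """
    edges = {}
    for a, b in combinations(sorted(repo_chunks), 2):
        shared = len(repo_chunks[a] & repo_chunks[b])
        if shared:
            edges[(a, b)] = shared
    return edges


# Hypothetical repos and chunk hashes, not real Hub data.
repo_chunks = {
    "org/model-base": {"c1", "c2", "c3", "c4"},
    "org/model-finetune": {"c1", "c2", "c3", "c9"},
    "org/unrelated": {"c7", "c8"},
}

edges = build_share_graph(repo_chunks)
for (a, b), weight in edges.items():
    print(f"{a} <-> {b}: {weight} shared chunks")
```

In this toy data only the base model and its fine-tune are connected, which is the same effect that groups related model versions together in the real graph.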
comment in response to
post
Not sure if you've read this article from @emollick.bsky.social but I think you would like it.
"Experts can easily assess when an AI is useful for their work through trial and error, but an outsider often cannot."
comment in response to
post
This graph shows requests per second (rps) to our content-addressed store (CAS) right as the release went live (h/t to @rajatarya.com for the screenshot)
yellow = GETs; dashed line = launch time.
I think it's pretty easy to spot when Xet started to send the first bytes to excited downloaders
comment in response to
post
If you go to any model in the collection, you'll see the Xet logo supporting the many TB of tensor files.
Every request to download these files comes to our infrastructure.
comment in response to
post
Thanks to the Meta team for launching on Xet!
comment in response to
post
With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family.
On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. Here are a few selected models and how they perform on Xet.
comment in response to
post
Streamline your efficiency with our proprietary municipal government machine learning model today.
This feature evolves city services through predictive data analytics and is capable of doing things like reporting potholes before you even know they exist!