jaz.bsky.social
Jaz
Gender Nomad
IRC made me gay
Backend (Go) & Infra @ Bsky
Does musical things and computer things
27. they/them 🏳️⚧️
BSky Stats: https://bsky.jazco.dev/stats
https://github.com/ericvolp12
5,869 posts
54,491 followers
315 following
Regular Contributor
Active Commenter
comment in response to
post
For another stats site that _is_ run completely by a third party and not by someone who eventually joined Bluesky, check out bskycharts.edavis.dev/bluesky-day....
by @edavis.dev
comment in response to
post
The stats site leverages my full bsky index at the moment (which is ~3.6TiB on NVMe)
I could probably rewrite it to be lightweight, only track stats from the firehose, and then backfill it from my existing historical summary stats over a weekend sometime soon.
Could run that on a tiny VPS
comment in response to
post
Everything that powers the stats site is FOSS and can be found in github.com/ericvolp12/b...
I wouldn't recommend trying to run this whole repo yourself, but you can browse around to see the code that's used for consuming from the firehose and producing stats.
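For a rough idea of what "consuming from the firehose" looks like, here's a minimal Go sketch (not the repo's actual code) that connects to the public relay's com.atproto.sync.subscribeRepos endpoint and just counts frames per minute. The relay URL and the bare frame-counting are assumptions for illustration; a real consumer would decode each DAG-CBOR frame into typed repo events before computing stats.

```go
// Minimal sketch: count firehose frames per minute.
// Not the stats site's real code; URL and counting are illustrative only.
package main

import (
	"log"
	"time"

	"github.com/gorilla/websocket"
)

func main() {
	// Public relay firehose endpoint (com.atproto.sync.subscribeRepos).
	url := "wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos"

	conn, _, err := websocket.DefaultDialer.Dial(url, nil)
	if err != nil {
		log.Fatalf("dial firehose: %v", err)
	}
	defer conn.Close()

	var count int
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			log.Printf("events in the last minute: %d", count)
			count = 0
		default:
			// Each frame is a DAG-CBOR encoded repo event; here we only
			// count frames instead of decoding them.
			if _, _, err := conn.ReadMessage(); err != nil {
				log.Fatalf("read frame: %v", err)
			}
			count++
		}
	}
}
```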
comment in response to
post
FWIW I _do_ work at Bluesky now, but I didn't when I built the stats site, and it still runs on a server in my apartment that just consumes from the public firehose.
comment in response to
post
this is bait
comment in response to
post
hmmm, concerning
how do i know my data is safe with him?
comment in response to
post
Please _do_ leave cardboard boxes in the data hall
comment in response to
post
NOC - Network Operations Cat
comment in response to
post
I have no idea, I don't work on the app at all, just the backend. Maybe @samuel.bsky.team can help you.
comment in response to
post
Which feed is this for?
comment in response to
post
That would be fine :)
comment in response to
post
Actually looking closer at this thread, it's got the same wonky behavior in both PoPs so this might be related to the new threadgate logic somehow. Will check w/ the team, probably unrelated to the DB migration.
comment in response to
post
@martin.kleppmann.com 👀👀👀
comment in response to
post
Yeah we're migrating the other PoP's DB now, I'm gonna run repairs on all the nodes once it's finished. Unfortunate that growing the DB cluster doesn't "just work" (tm).
comment in response to
post
rn it's at 4k but it might change based on load, seems to be a decent place for now
comment in response to
post
Per-user rate limits require much higher synchronization between individual fanout processes than a list of user/following-counts that only covers a tiny percentage of the userbase. The list is maintained in Redis every time we process a follow/unfollow and is refreshed only periodically for reads.
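For illustration, a minimal Go sketch of that shape, with invented names: follow/unfollow writes bump a per-DID score in a Redis sorted set, and fanout workers read from an in-memory snapshot that's refreshed on an interval rather than per-event. The key name, threshold, and refresh interval are assumptions, not the production values.

```go
// Sketch of a Redis-maintained "high follow count" list with periodic reads.
// Key name, threshold, and interval are hypothetical.
package fanout

import (
	"context"
	"strconv"
	"sync"
	"time"

	"github.com/redis/go-redis/v9"
)

const (
	followingCountsKey = "following_counts" // hypothetical key name
	highFollowMin      = 20000              // hypothetical threshold
)

type HighFollowCache struct {
	rdb *redis.Client

	mu   sync.RWMutex
	dids map[string]bool // in-memory snapshot used by fanout workers
}

// OnFollow / OnUnfollow run on the write path for every follow event.
func (c *HighFollowCache) OnFollow(ctx context.Context, did string) error {
	return c.rdb.ZIncrBy(ctx, followingCountsKey, 1, did).Err()
}

func (c *HighFollowCache) OnUnfollow(ctx context.Context, did string) error {
	return c.rdb.ZIncrBy(ctx, followingCountsKey, -1, did).Err()
}

// Refresh periodically pulls the (small) set of accounts over the threshold,
// so the hot path never has to hit Redis per-event.
func (c *HighFollowCache) Refresh(ctx context.Context) {
	ticker := time.NewTicker(5 * time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			dids, err := c.rdb.ZRangeByScore(ctx, followingCountsKey, &redis.ZRangeBy{
				Min: strconv.Itoa(highFollowMin),
				Max: "+inf",
			}).Result()
			if err != nil {
				continue // keep serving the last snapshot on error
			}
			next := make(map[string]bool, len(dids))
			for _, d := range dids {
				next[d] = true
			}
			c.mu.Lock()
			c.dids = next
			c.mu.Unlock()
		}
	}
}

// IsHighFollow is the cheap read used inside fanout.
func (c *HighFollowCache) IsHighFollow(did string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.dids[did]
}
```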
comment in response to
post
It's based on # of posts and right now somewhere between 3k-4k posts.
comment in response to
post
Yep, I've got a draft proposal for "Timeline Hibernation" that can hibernate a timeline and stop fanning out to it, then regenerate it on-demand once and re-enable fanout to it. Requires a bit of coordination between read and write paths, but it's totally doable and will help reduce the cost of our backend.
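Since it's just a draft, here's only a rough Go sketch of the coordination it implies (all names invented, not the actual proposal): the write path skips hibernated timelines entirely, and the read path rebuilds a hibernated timeline once on-demand before re-enabling fanout for it.

```go
// Illustration of the hibernation state machine described above.
// Store, its methods, and the flow are hypothetical.
package hibernation

import "context"

type Store interface {
	IsHibernated(ctx context.Context, userDID string) (bool, error)
	Wake(ctx context.Context, userDID string) error
	AppendToTimeline(ctx context.Context, userDID, postURI string) error
	// RebuildTimeline regenerates the feed from follows + recent posts.
	RebuildTimeline(ctx context.Context, userDID string) error
}

// FanoutPost runs on the write path for each follower of the author.
// Skipping hibernated timelines is where the cost savings come from.
func FanoutPost(ctx context.Context, s Store, followerDID, postURI string) error {
	hibernated, err := s.IsHibernated(ctx, followerDID)
	if err != nil {
		return err
	}
	if hibernated {
		return nil // no fanout for sleeping timelines
	}
	return s.AppendToTimeline(ctx, followerDID, postURI)
}

// GetTimeline runs on the read path. A hibernated timeline is rebuilt once
// on-demand and then re-enrolled in fanout.
func GetTimeline(ctx context.Context, s Store, userDID string) error {
	hibernated, err := s.IsHibernated(ctx, userDID)
	if err != nil {
		return err
	}
	if hibernated {
		if err := s.RebuildTimeline(ctx, userDID); err != nil {
			return err
		}
		if err := s.Wake(ctx, userDID); err != nil {
			return err
		}
	}
	return nil // actual fetch of timeline entries omitted
}
```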
comment in response to
post
The link for "tail latencies" covers it pretty thoroughly!
comment in response to
post
Glad to hear that!
I try to keep my writing approachable and learn in the open as much as possible, it's great to know it's working!
comment in response to
post
Yep, that was my thinking. Someone who follows 20k people is probably not mainlining the following feed but has a bunch of other feeds they use to keep up with the people they care about.
comment in response to
post
The threshold is much higher than where you are now, don't worry :)
comment in response to
post
Erin is very cool and they can be trusted to print your badges
comment in response to
post
with 11 herbs and spices too, life comes at you fast
comment in response to
post
ikr
comment in response to
post
Following feed is "push" but it's all on the backend. We keep ~3,500 post and repost references hot for you in the feed and if you scroll past that, it will fall back to something else iirc. It's all on disk, but each user has their own "inbox" in the DB for their feed basically.
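A toy Go sketch of that push model (the types and the in-memory stand-in for the on-disk store are invented): fanout prepends a reference to the user's inbox, only the newest ~3,500 references stay hot, and reads past that window fall back to another source.

```go
// Toy model of per-user "inbox" timelines with a capped hot window.
// Everything here is illustrative; the real store is on disk in the DB.
package timelines

import "sync"

const hotRefLimit = 3500 // roughly the figure mentioned above

type FeedRef struct {
	PostURI  string
	IsRepost bool
}

type Inboxes struct {
	mu    sync.Mutex
	byDID map[string][]FeedRef // stand-in for per-user rows in the DB
}

func NewInboxes() *Inboxes {
	return &Inboxes{byDID: make(map[string][]FeedRef)}
}

// Push is the fanout write: prepend the new reference and trim to the hot limit.
func (i *Inboxes) Push(userDID string, ref FeedRef) {
	i.mu.Lock()
	defer i.mu.Unlock()
	refs := append([]FeedRef{ref}, i.byDID[userDID]...)
	if len(refs) > hotRefLimit {
		refs = refs[:hotRefLimit]
	}
	i.byDID[userDID] = refs
}

// Page reads from the hot inbox; callers fall back to another source
// (e.g. a pull-based query) once the offset runs past the hot window.
func (i *Inboxes) Page(userDID string, offset, limit int) (refs []FeedRef, exhausted bool) {
	i.mu.Lock()
	defer i.mu.Unlock()
	inbox := i.byDID[userDID]
	if offset >= len(inbox) {
		return nil, true
	}
	end := offset + limit
	if end > len(inbox) {
		end = len(inbox)
	}
	return inbox[offset:end], end == len(inbox)
}
```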
comment in response to
post
All reads in service of write requests are quorum reads but that doesn't really help in this particular case. I _think_ the repairs I'm running should resolve the issues though.
comment in response to
post
ok yeah, i've been running some more repairs cause i wanted to fix things. definitely just an issue of some of the new DB nodes not having an accurate picture of some of the data and since we're not using quorum for read-only API requests, we don't have a chance to automatically heal via row-repairs
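To make the consistency-level tradeoff concrete, here's a hedged gocql sketch (host, keyspace, and table names are made up): a QUORUM read compares replicas and can repair a divergent row as a side effect, while a ONE read only ever sees a single replica, so a drifted node keeps serving stale data until an explicit repair runs.

```go
// Sketch of quorum vs. single-replica reads against a Scylla/Cassandra-style
// cluster via gocql. Schema and hosts are hypothetical.
package main

import (
	"log"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("scylla-1.internal") // hypothetical host
	cluster.Keyspace = "appview"                     // hypothetical keyspace
	cluster.Consistency = gocql.Quorum               // session default

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("connect: %v", err)
	}
	defer session.Close()

	var count int64

	// Read path for write requests: QUORUM, so replicas get compared and
	// mismatches can be read-repaired as a side effect.
	if err := session.Query(
		`SELECT count(*) FROM replies WHERE root_uri = ?`, "at://example/post",
	).Consistency(gocql.Quorum).Scan(&count); err != nil {
		log.Fatal(err)
	}

	// Read-only API path: ONE is cheaper but never compares replicas, so it
	// can't trigger automatic row repair when a node has drifted.
	if err := session.Query(
		`SELECT count(*) FROM replies WHERE root_uri = ?`, "at://example/post",
	).Consistency(gocql.One).Scan(&count); err != nil {
		log.Fatal(err)
	}
}
```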
comment in response to
post
yeah I prolly need to run some more repair jobs in the cluster next week to get things fully happy again
comment in response to
post
I have no idea, might be a question for @samuel.bsky.team
comment in response to
post
(Clearly there are some alarms that need tuning and that got muted during the migration, since there's been a lot of node churn as we upgrade versions, add new machines, etc.)
comment in response to
post
Thanks for flagging this BTW, with DB cluster drift the sooner you catch it the better :)
comment in response to
post
Some of that could be content from accounts that are taken down but haven't had their writes unwound. In this case, could you still see the replies somewhere else (e.g. in notifications)?
comment in response to
post
We had a DB node that was unhappy and left the cluster over the weekend. I brought it back up and ran a repair, looks like the materialized views are happy now. @em.vg wanna check those threads now to see if they show replies properly? I'm seeing them from the API now :)
comment in response to
post
Yeah ok confirmed this is cause we've got data in the source tables that isn't included in the materialized view properly. Some Scylla wonkiness emerging from running our cluster in a split state right now. Hopefully we can sort it out early this week. The data is indexed, just misbehaving views.
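For anyone unfamiliar with Scylla materialized views, here's a hedged sketch (schema invented, executed via gocql) of the base-table/view relationship: the view is a server-maintained re-keying of the base table's rows, so when it drifts from its source the fix is a cluster-side repair/rebuild rather than anything in application code.

```go
// Sketch of a base table plus a materialized view re-keyed by thread root.
// Schema, keyspace, and hosts are invented for illustration.
package main

import (
	"log"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("scylla-1.internal") // hypothetical host
	cluster.Keyspace = "appview"                     // hypothetical keyspace
	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("connect: %v", err)
	}
	defer session.Close()

	// Base table: replies keyed by their own URI.
	if err := session.Query(`
		CREATE TABLE IF NOT EXISTS replies (
			uri        text PRIMARY KEY,
			root_uri   text,
			author_did text,
			created_at timestamp
		)`).Exec(); err != nil {
		log.Fatal(err)
	}

	// Materialized view: the same rows re-keyed by thread root, which is
	// what a "replies in this thread" read would hit. The server keeps it
	// in sync with the base table's writes.
	if err := session.Query(`
		CREATE MATERIALIZED VIEW IF NOT EXISTS replies_by_root AS
			SELECT uri, root_uri, author_did, created_at
			FROM replies
			WHERE root_uri IS NOT NULL AND uri IS NOT NULL
			PRIMARY KEY (root_uri, uri)`).Exec(); err != nil {
		log.Fatal(err)
	}
}
```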