taidesu.bsky.social
Open Source enthusiast and builder of search systems. Currently building search at GitHub. Former DA for open-source OpenSearch @AWS. Opinions are my own.
234 posts 191 followers 93 following

I love these “cheap” builds on YouTube. Watching one now where a guy uses like 15k worth of professional:
- welders
- CNCs
- belt sanders
- air tools
to create his “cheap cafe racer”. Don’t get me wrong, I know these things could be done with cheaper tools, but it’s just a bit disingenuous.

It bugs me so much that I keep getting observability job listings sent from LinkedIn. JUST BECAUSE I USE ELASTICSEARCH DOESN’T MEAN I DO OBSERVABILITY 😭

It’s always funny working with people who don’t know my background. Someone linked me the Elasticsearch docs trying to prove me wrong about Elasticsearch’s allocation. I linked them back the code 😝

For the past 4 weeks, @oppili.bsky.social and I have been building @tangled.sh—a fresh take on decentralized git collaboration platforms. Tangled is built on atproto, and designed to be decentralized from day 0. Read the introduction here: blog.tangled.sh/intro

It’s so funny seeing the reactions to where I work from non-tech people.
At AWS:
- oh you work for Amazon?! Do you ever meet with Bezos?
- oh are you delivering packages?
At GitHub:
- GitHub… is that the food delivery company?
- what’s that?
- I think I’ve heard of them 🤔

Nothing like shipping a fix to a bug that’s been lying dormant for 2 years. Now I get to wait another 1-2 years for customers to upgrade and see my fix 😆

Sometimes I think I enjoy talking about search more than implementing it 🤭 Super fun talking to Kyle from RunWhen about building better AI with high quality search. youtu.be/ergc-NW3YDY?...

In 2 weeks I will be attempting to “run” the Cambridge Half Marathon with my mum and aunt again and am raising funds for cancer research while doing so 🩵👇 PS: No I don’t have a pace aim.. I will crawl if necessary. fundraise.cancerresearchuk.org/page/tuanas-...

I’m losing it with the term overloading. “Cluster” for GitHub Enterprise could mean either HA or actually clustered, both of which are incompatible with how Elasticsearch defines a cluster. 😭

I’ve said for a long time that I think the company that creates search relevance as a service will win the game… I’ve come to believe that’s an impossible task. Search relevance is a data science problem so closely coupled to users’ data that it’s hard to believe there will ever be a solution.

With literally one post I’m suddenly plugged into the motorcycle community. This is what I missed about pre-Elon Twitter. It was so cool being able to just have entire communities engaging with one another. On post-Elon Twitter everything feels so segregated. It’s hard to break out of your niche.

This was 10/10 the highlight of my weekend and maybe even my month. In February of last year I wrecked my motorcycle and broke my shoulder. This past weekend I got the bike started again for the first time and it feels so good 😊

I have the stomach flu… 4th time I’ve had this in the last 12 months. It’s become predictable:
Hour 1: I start to have a sour stomach
Hour 2: throw up a ton
Hours 3-4: drink a little water, body aches
Hour 5: throw up water
Hour 8: I can start eating
Hours 8-24: body aches and recovery

paradedb: Postgres for Search and Analytics ★6726 https://github.com/paradedb/paradedb

I think the hardest part about doing relevancy on GitHub’s data is that it’s constantly changing. For example, say I save a subset of issues and we have some sample queries for it. I’ll build a judgement set off of it and then I can calculate NDCG. What happens if I want to add attributes?
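
For anyone who hasn't worked with judgement sets: a rough sketch of the NDCG step is below. Everything in it is an assumption made for illustration (the `ndcg_at_k` helper, the issue IDs, the 0-3 grading scale, linear gain, and scoring unjudged results as 0); it is not the setup described in the post.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_ids, judgments, k=10):
    """NDCG@k for one query.

    ranked_ids: document IDs in the order the search engine returned them.
    judgments:  hypothetical judgement set, mapping document ID -> graded relevance.
    Assumption: documents missing from the judgement set score 0.
    """
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Made-up judgement set for a single sample query.
judgments = {"issue-1": 3, "issue-2": 2, "issue-3": 0}

perfect = ndcg_at_k(["issue-1", "issue-2", "issue-3"], judgments)
swapped = ndcg_at_k(["issue-2", "issue-1", "issue-3"], judgments)
with_unjudged = ndcg_at_k(["issue-9", "issue-1", "issue-2"], judgments)
```

In this sketch the unjudged `issue-9` counts as irrelevant, so a ranking that surfaces documents outside the saved subset scores lower whether or not those documents are actually good results.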

It’s amazing how long someone can work on search and not think relevancy is important. Are you making search better or worse? There’s no way to tell aside from measuring. In e-commerce you could measure sales but that can be a shallow metric to drive.

Today I was also reminded that there’s no “easy” substitute for relevance. We did some math to work out what the cost to roll out vector search would be… we’d need to increase our compute budget nearly 50%, just to roll out semantic search for issues.

Been a minute but fresh reminder I’m continuing to update the Search Engineering starter pack 👀 go.bsky.app/A8owiAA

Still got it 🔥 I was really nervous that I’d have forgotten because it’s been like ~6 years since I last went skiing. Still fell but that’s all part of the fun (so long as you can still get back up that is 😆)