tudor.golubenco.org
CTO at Xata.io, a Postgres platform. I’m sharing Postgres tips and our progress on pgroll, open source zero-downtime schema changes for Postgres.
49 posts 108 followers 177 following
Regular Contributor
Active Commenter
comment in response to post
Did it hallucinate the link to the support channel?
comment in response to post
8. Neosync (github.com/nucleuscloud...). It's written in Go and MIT licensed. This one is different from the others in that it's not a CLI tool but uses a web UI for configuration and workflows.
comment in response to post
Enjoy Turkey!
comment in response to post
Yes, it works with RDS and Aurora.
comment in response to post
The website is the fantastic work of @elizabet.dev
comment in response to post
The pricing model is also going to be really interesting. If it's priced like DynamoDB, with no base cost and cheap enough, I can imagine many builders / indie devs are going to like it. Other dev-oriented platforms are going to build on top of it, like they build today on top of Lambda.
comment in response to post
But the query processors run modified Postgres code, which is a good choice, so I imagine they will slowly add/enable more functionality and therefore reduce the compatibility gap. The question is, how close can they get? How close do they want to get?
comment in response to post
Added Yugabyte, CockroachDB, and DSQL Postgres compatibility scores (PCI)
comment in response to post
Awesome, let me know if you want help with filling in some vendors to test.
comment in response to post
I think we need levels of it. Level 1 is “uses the Postgres wire protocol” and nothing else. Level 5 is actual Postgres with any extensions. And then a few levels in between.
comment in response to post
👋 yay, now we have both you and your spirit
comment in response to post
7. Snaplet snapshot (github.com/supabase-com...). Does both subsetting and anonymization via transformations. Snaplet (the company) unfortunately shut down earlier this year, and while they open-sourced snapshot under supabase-community, it looks unmaintained right now.
comment in response to post
6. Replibyte (github.com/Qovery/Repli...) from Qovery is written in Rust and works by parsing the dump file. It can do both anonymization and data subsetting (sampling), and supports more databases than just Postgres.
comment in response to post
5. tidus (github.com/viafintech/t...) from Viafintech. This one uses Postgres views, which means the anonymizing functions are written in pl/pgsql and the work happens in Postgres. Then a pg_dump wrapper will get the data through the views rather than directly. Neat!
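Not tidus's actual code, but a minimal sketch of the views-based idea (the table, columns, and masking function are made up for illustration):

```sql
-- Hypothetical example of the views-based approach: a view masks the
-- sensitive columns, and the dump reads from the view instead of the table.
CREATE SCHEMA anonymized;

CREATE OR REPLACE FUNCTION mask_email(email text) RETURNS text
LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
  RETURN 'user_' || md5(email) || '@example.com';
END;
$$;

CREATE VIEW anonymized.users AS
SELECT id,
       mask_email(email) AS email,
       created_at
FROM public.users;

-- A pg_dump wrapper can then export anonymized.users instead of public.users.
```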
comment in response to post
4. masquerade (github.com/TonicAI/masq...) from TonicAI. This one is a proxy that speaks the Postgres wire protocol. It can mask columns in the response. It runs on the .NET runtime.
comment in response to post
3. github.com/rap2hpoutre/... by @rap2h.bsky.social is written in TypeScript and wraps pg_dump. It parses the dump output and replaces the values. It can use faker to generate the replacement values, which is pretty cool.
comment in response to post
2. Greenmask (github.com/GreenmaskIO/...) is written in Go and acts as a pg_dump replacement. The schema part is outsourced to actual pg_dump, while the data part is passed through anonymization.
comment in response to post
1. PostgreSQL Anonymizer: postgresql-anonymizer.readthedocs.io/en/stable/ It's an extension, which is nice because it can dynamically mask data depending on the user, for example. Then you just run pg_dump as that user. But being an extension is also a drawback, because e.g. RDS doesn't have it.
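A rough sketch of what dynamic masking looks like with the anon extension; the exact setup calls vary between versions, so treat this as illustrative (the table, column, and role names are made up):

```sql
-- Illustrative only: dynamic masking with the PostgreSQL Anonymizer extension.
CREATE EXTENSION IF NOT EXISTS anon;
SELECT anon.init();  -- load the built-in fake data sets

-- Mark a role as masked: any query it runs sees anonymized values.
CREATE ROLE dumper LOGIN;
SECURITY LABEL FOR anon ON ROLE dumper IS 'MASKED';

-- Declare how each sensitive column should be masked.
SECURITY LABEL FOR anon ON COLUMN users.email
  IS 'MASKED WITH FUNCTION anon.fake_email()';

-- Enable dynamic masking (API differs slightly across extension versions).
SELECT anon.start_dynamic_masking();

-- Running pg_dump as "dumper" now exports masked data.
```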
comment in response to post
While expand/contract is a multi-step process, pgroll manages to do it as a single operation. How? It uses Postgres views to expose both the old version of the schema and the new version of the schema **at the same time**. This means you can do a rolling upgrade of your app without worries.
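A simplified illustration of the idea, not pgroll's actual generated SQL (the foo/bar names are carried over from the earlier example and otherwise hypothetical):

```sql
-- Both schema versions are exposed as views over the same physical table.
ALTER TABLE foo ADD COLUMN bar_new integer;

CREATE SCHEMA v1;
CREATE VIEW v1.foo AS SELECT id, bar FROM foo;            -- old version: text column

CREATE SCHEMA v2;
CREATE VIEW v2.foo AS SELECT id, bar_new AS bar FROM foo; -- new version: integer column

-- Old app instances set search_path to v1, new ones to v2, so both can run
-- against the same table during the rolling deploy.
```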
comment in response to post
You first “expand” the table by adding a new column. You use both columns for a period of time, backfill the data, then you “contract” by dropping the old column. pgroll automates this for you. For the "change type" migration type, it does pretty much exactly the above steps. github.com/xataio/pgroll
comment in response to post
So how would you do it safely? You break it into multiple steps:
- add a new integer column
- use a script to backfill the data into the new column, converting it
- switch the application to use the new column
- drop the original text column
This is an example of the expand/contract pattern.
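Roughly, in SQL (the table and column names are hypothetical):

```sql
-- Expand: add the new integer column alongside the old text one.
ALTER TABLE foo ADD COLUMN bar_int integer;

-- Backfill in batches from a script, converting the text values.
UPDATE foo SET bar_int = bar::integer
WHERE bar_int IS NULL AND id BETWEEN 1 AND 10000;
-- ...repeat for the remaining id ranges...

-- Once the application only uses the new column, contract:
ALTER TABLE foo DROP COLUMN bar;
ALTER TABLE foo RENAME COLUMN bar_int TO bar;
```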
comment in response to post
While the ALTER statement executes, it holds an ACCESS EXCLUSIVE lock, which means no other query can read from or write to that table. That’s essentially downtime.
comment in response to post
Doing the above works, but what if the foo table is large (say, a few million rows)? Postgres needs to rewrite all of those rows, which will take some time (say, minutes).
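For reference, the single-statement version being discussed would look something like this (hypothetical table and column names):

```sql
-- Naive in-place type change: Postgres rewrites every row and holds an
-- ACCESS EXCLUSIVE lock on foo for the whole duration.
ALTER TABLE foo ALTER COLUMN bar TYPE integer USING bar::integer;
```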