datateam.bsky.social
Data engineer & cofounder @dlthub. Building out the tooling I wish I had.
84 posts 2,031 followers 1,300 following

data doesn't lie, politicians do

Data teams don’t move fast. They wait. They wait for dbt runs. They wait for queries. They wait to see if something breaks. Stop testing in production. Start testing like an engineer. 🔗 dlthub.com/blog/staging #databs #dataengineering

Stop waiting on your warehouse. dlt+ Cache lets you run dbt models, test data, and debug transformations instantly, with DuckDB, Iceberg, Snowflake, BigQuery, ClickHouse & more. Fast. Composable. No waste. 🔗 dlthub.com/blog/cache

Is your job going to be replaced by AI? Probably. But just as we’ve moved on from writing assembly code, our roles are evolving rather than vanishing. Embrace change, hone your expertise, and be proactive about learning. As the saying goes, "when the student is ready, a teacher will appear."

New in early access: dlt+ Project & Cache!
⚡ Test transformations locally before loading
💰 Reduce cloud costs, iterate faster
💡 Define & manage pipelines in YAML
Try it out! 🔗 dlthub.com/blog/dltplus... #DataEngineering #databs

check it out!

🚀 Live Event w/ MotherDuck & dltHub: Fast & Scalable Analytics Pipelines
📅 Feb 26 | 17:00 CET | Zoom
🔹 Easy data ingestion
🔹 Custom ETL pipelines in Python
🔹 Smooth transition from DuckDB to MotherDuck
🔥 Live demo + Q&A!
🔗 Register: lu.ma/79a7lysr?utm... #databs

Testing in production isn’t a rite of passage, it’s a failure of process.
💰 Every test query = vendor profit
🚨 Every mistake = production issue
😩 Every debug run = rerunning the entire pipeline
Staging for data is here: dlthub.com/blog/staging #dataengineering #databs

Debezium + CDC + Python + dlt → Real-time PostgreSQL replication, no Kafka.
🔥 New blog by OSS contributor Ismail Simsek → A Python-native CDC pipeline for PostgreSQL & DuckDB.
✅ No Kafka
✅ Python-first
✅ Step-by-step guide
Read it now → dlthub.com/blog/debeziu... #dataengineering

Breaking into data engineering? Join our DataTalks.Club Data Engineering Zoomcamp workshop and learn how to load data like a pro: lu.ma/quyfn4q8

Writing Iceberg or Delta on filesystem/buckets? dlt supports both: dlthub.com/docs/dlt-eco... And if you wanna Bring Your Own Compute Engine, dlt supports that too (via ibis) dlthub.com/blog/datasets #databs #dataengineering #datasky

🔥 The biggest lie in data engineering?
"This tool will make your life easier."
Commoditized tools are built for selling, not engineers.
They hide complexity & costs, locking you in.
The alternative? Democratization, flexibility, and control.
Read more: dlthub.com/blog/goodbye... #databs

๐‡๐จ๐ฐ ๐๐จ ๐ฒ๐จ๐ฎ ๐ซ๐ฎ๐ง ๐๐ฅ๐ญ ๐จ๐ง ๐€๐–๐’ ๐ฅ๐š๐ฆ๐›๐๐š? (tldr - 100x cheaper than Saas) Here are code templates: Github: github.com/codingcyclis... Blog: dlthub.com/blog/dlt-aws... We use GCP ourselves: dlthub.com/blog/dlt-seg... #databs

2x your data ingestion pipeline development speed with Cursor AI. Mooncoon 🦝, a consulting partner at dltHub, shares a step-by-step tutorial on how to do it. See the guide + git repo here: dlthub.com/blog/mooncoon

✨ OSS Enterprise success story time! Stellantis (14 car brands including Jeep & Maserati) is using @dlt to:
- Cut 60+ data tools down to 4
- Speed up pipelines by 66%
- Onboard devs in 2hrs
Open source eating enterprise, one pipeline at a time 🚀
Check it out here! www.youtube.com/watch?v=Kj3E...

Tired of juggling tools & formats to access data? With dlt’s universal interface, you can:
🔄 Switch between local & cloud with ease
⚡ Test data w/o touching prod
🚀 Build portable, scalable pipelines
dlthub.com/blog/datasets #databs

Fed up with scaling struggles & custom API chaos? 😤 Meet the DLT REST API Source:
⚙️ Python-native configs
🧠 Auto schema detection
💼 Multi-source compatibility
From POS systems to HR analytics, simplify pipelines. Watch Willi’s demo here: www.youtube.com/watch?v=9hZL...

Data platforms deserve SDLC best practices 🚀
✅ Feature branches
✅ CI checks for every PR
✅ Seamless CD to prod
✅ Full lineage in Dagster UI
Built w/:
🔧 dltHub - ELT made simple
💡 dbt - Transformations redefined
⚙️ Dagster - Asset-based orchestration
From the community: medium.com/@jairus-m/th...

Working with spatial data in PostgreSQL? 🌍 dlthub + PostGIS makes it simple:
✅ Load WKT or WKB formats
✅ Handle SRIDs (default 4326)
✅ Spatial queries made easy
Just CREATE EXTENSION postgis; & go! 🚀 Geometry, simplified. What’s your next spatial project? #GIS dlthub.com/docs/dlt-eco...

Are you back from vacation? I'm sorry, me too. Now, where did we leave off? Oh yes, a tribute to our community! Check out this comprehensive list of demos and write-ups done by our community last year! dlthub.com/blog/2024-re...

Nice writeup on dlt + dagster: medium.com/@veligokayso...

I've had a crazy few days with my advent calendar of code, looking at Tobiko's SQLMesh with @duckdb.org. Yesterday, I mentioned in my post that I needed to bring in some data from @bsky.app's HTTP endpoints, and I was going to try using dltHub. davidsj.substack.com/p/dlt-windsu...

You can now query dlt data before loading dlthub.com/docs/general...

Talk to a GitHub repo. You can try this variant here: github-assistant.com 📖 Learn how it works: lnkd.in/es-JQraG Shout out to the Relta.dev team for making this demo

Your data's so dirty, we call the extraction pipeline the "sewage treatment plant". #dataroast

Forget about manually writing dbt models. Meet dbt-gen: AI-generated models that work like magic (but with real results). Save time, avoid errors, and focus on the data that matters. Curious? 👉 dlthub.com/blog/dbt-gen

Google announced it's working on making Python faster! jokes aside, check out this progress #databs www.hpcwire.com/2024/12/09/g...

Docker for data? If you go to a Docker conference you will hear "portability", "fast onboarding", "lower cognitive load", "implicit access", "microservices", "decentralisation". Who does this for data? Check out these 2 new products in the space: dlthub.com/blog/tower

Customer support bots: wasting my time with relentless efficiency 👍

Check out this cool exploration by @jayatillake.bsky.social of dlt+sqlmesh! davidsj.substack.com/p/sqlmesh-in...

How does a country elect a psychotic fascist president who believes H2O is a conspiracy? 66,000 fake TikTok accounts and advertising money sponsored by the axis of evil. How did you get yours? Today I'm voting against him. www.euronews.com/my-europe/20...

Our ELT with DLT workshop is back, as a self-paced course! So if you wanna learn some durable knowledge this holiday season, sign up for the course! If you submit homework by 17th Jan, we will grade it and grant certificates. Sign up here! dlthub.com/events #DataEngineering #databs #python #etl