Pipeline Data Engineering Academy

The Data Janitor Letters - December 2020

Data engineering salon. News and interesting reads about the world of data.

An analytics engineer is really just a pissed off data analyst
Seth Rosen, Co-founder, Hashpath

... who has the tools and motivation to make things better for everyone else.


Avro Schema Evolution Strategies on Kafka
Tomasz Kaszuba, Java Big Data Engineer, Swiss Re

I think the only place that the schema registry makes sense is when controlling 3rd party connections or in simple Kafka architectures.


The Big Little Guide to Message Queues
Sudhir Jonathan, Technical Architect, Qube Cinema

Fundamental concepts that underlie them, and how they apply to popular queueing systems available today.


Planning joins to make use of indexes
Zach Musgrave, Software Engineer, DoltHub

Dolt is Git for Data. It's a SQL database that you can clone, fork, branch, and merge.


Back to the '70s with Serverless
Cees de Groot, Principal Software Engineer, Canary Monitoring, Inc.

History will repeat itself, of course.


Going from 5K to 3M Messages/sec with 2ms Latency
Market Pulse

The evolution, failures and design decisions behind one of the world’s largest real-time, high-frequency and low-latency streaming systems.


We Burnt $72K testing Firebase + Cloud Run and almost went Bankrupt
Sudeep Chauhan, Founder, Milkie Way Inc.

This is the story of how close we came to shutting down before even launching our first product, how we survived, and the lessons we learnt.