The full video version, with annotated chapters, is on the Map for Engineers YouTube channel.
Lauri Koobas, ex-Microsoft and currently Head of Data Platform at Bondora, shares insights on data engineering, from early-stage startup to scaling.
We mostly focused on analytics and building a data warehouse: real-world challenges from both the data engineering and software engineering sides. We also discussed GDPR and PII challenges when dealing with data.
Annotated chapters with timestamps:
00:00:00 Sneak peek of episode
00:01:21 Episode overview
00:02:44 Introduction, Lauri's background
00:20:48 Starship robots: huge amount of data there
00:23:37 Data lake, data warehouse, data lakehouse
00:26:44 Devil is in the details: timestamps, texts, character sets...
00:49:44 Moving data from prod to data warehouse
00:53:09 Analytics tools: PostHog, Amplitude, Redash, Databricks
01:00:15 Analytics tools vs real-time monitoring like Prometheus/Grafana
01:04:15 Usability matters: each tool for its job
01:06:38 Startup grows: needs in data analytics
01:11:09 Multiple data sources: when data warehouse really begins
01:19:55 Data and (de-)coupling: software engineers should not be blocked by analytics
01:22:51 Data ETL
01:24:59 Changes in data model: multi-phase migrations
01:29:38 Change data capture, incremental imports
01:34:21 Should analytics have new data in real time? Maybe not?
01:39:02 Importing data into DWH through business events
01:43:37 When DWH subscribes to business events, data model can evolve freely
01:47:16 Quick recap of what we discussed so far
01:52:25 GDPR and Data Compliance: start early
01:56:05 PII data: know exactly where you store it, control it well
02:03:37 Lauri's book recommendations on data engineering - Kimball
02:07:18 Lauri's podcast on data engineering, in Estonian
02:08:28 Wrap up