All things data newsletter #12 (#dataengineer #datascience)


The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles made the cut for today’s newsletter.

Why did Dropbox pick Apache Superset as its data exploration tool?

Apache Superset is gaining momentum. If you want to understand why, you can start by reading this article here.

Growth: Adjacent User Theory

I love the framing in this LinkedIn post here, where Nimit Jain says that great Growth PM output looks like “We discovered 2 new user segments who are struggling to proceed at 2 key steps in the funnel and simplified the product for them via A/B experiments. This lead to conversion improvement of 5-10% at these steps so far. We are now working to figure the next segment of users to focus on.” You can read about the Adjacent User Theory here.

SQL window functions

Need an intro to SQL window functions? Read this.
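
For a quick taste before you click through, here is a minimal sketch of a window function computing a running total per group. The `sales` table, its columns, and the sample rows are made up for illustration and do not come from the linked article. It runs on Python's built-in `sqlite3` module and assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical in-memory table, used only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 100), ("east", 2, 150), ("west", 1, 200), ("west", 2, 50)],
)

# SUM(...) OVER (...) keeps every input row and adds a per-region running
# total, unlike GROUP BY, which would collapse each region into one row.
rows = conn.execute(
    """
    SELECT region, month, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in rows:
    print(row)
```

The thing to notice is that each row survives with an extra computed column, which is what separates a window function from a plain aggregate.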

Luigi vs Airflow

A really good matrix comparing 2 popular ETL workflow platforms. Read here.

A data engineer’s point of view on data democratization

If more people can easily access data that was previously inaccessible to them, that’s a good thing. This is a good read on the various things to consider. Read here.

Apache Superset growth within Dropbox:

superset adoption data graphics
Image Source: Dropbox Tech Blog

