All things data newsletter #16


(if this newsletter was forwarded to you then you can subscribe here: https://insightextractor.com/)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles/videos made the cut for today’s newsletter.

(1) Data & AI landscape 2020

Really good review of the data & AI landscape in 2020. Look at all those logos representing companies that are tackling various data and AI challenges. It’s an exciting time to be in data! Read here

Image: 2020 Data and AI Landscape

(2) Self-Service Analytics

Tooling is the easy part; it’s the follow-up steps that are needed to truly achieve a culture that is independently data-driven. Read here

(3) What is the difference between a data pipeline and ETL?

Really good back-to-basics video on the difference between a data pipeline and ETL.

(4) Delivering High Quality Analytics at Netflix

I loved this video! It talks about how to ensure data quality throughout your data stack.

(5) Introduction to data lakes and analytics on AWS

I have another great YouTube video for you. This one introduces you to various AWS tools for data and analytics.

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

All things Data Newsletter #15 (#dataengineering #datascience #data #analytics)


(if this newsletter was forwarded to you then you can subscribe here: https://insightextractor.com/)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles made the cut for today’s newsletter.

(1) Scaling data

Fantastic article by Crystal Widjaja on scaling data. It shares a really good framework for building analytics maturity and how to think about building capabilities to navigate each stage. Must read! Here

Image: three stages (Image Source: Reforge)

(2) Building a startup’s data infrastructure in 1 hour

Good video that touches on multiple tools. Watch here: https://www.youtube.com/watch?v=WOSrRTaNIm0 (it’s a little outdated since it was shared in 2019, two years ago, but the architecture is still helpful)

(3) Analytics lessons learned

If you haven’t read Lean Analytics, I recommend it! After that, you should read this free companion, which covers 12 good analytics case studies. Read here

(4) Organizing data teams

How do you organize data teams? Completely centralized under a data leader? Or decentralized, reporting into the leaders of business functions? Some good thoughts here


(5) The metrics layer is a missing piece in the modern data stack

This is a good article that encourages you to think about adding a metrics layer to your data stack. In the last newsletter, I also shared an article about Airbnb’s Minerva metrics layer, and this article does a good job of providing additional reasons to build something similar. Read here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

All things data newsletter #13 (#dataengineer #datascience)


(if this newsletter was forwarded to you then you can subscribe here: https://insightextractor.com/)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles made the cut for today’s newsletter.

(1) The Modern Data Stack

Amazing article by Tristan Handy explaining the modern data stack. If you are familiar with tools such as Looker, Redshift, Snowflake, BigQuery, Fivetran, dbt, etc. but have wondered how each of them fits into an overall architecture, then this is a must read! Read here

Image Source: GetDBT by Tristan Handy

(2) How can data engineering teams be productive?

Good mental model and tips to build a productive data engineering team. Read here

(3) Why is the future of Business Intelligence open source?

The founder of Apache Superset explains why he believes that the future of BI is open source. Read here.

(This is also a great marketing pitch for Apache Superset, so please read it with a grain of salt and be aware of the author’s bias on this topic)

(4) How can data and design work better together?

Diagnose with data and treat with design. Great article by Julie Zhuo here

(5) Zach Wilson believes that standups can be less productive in data engineering teams compared to software engineering teams

Interesting observations on his LinkedIn thread here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?