All things Data Newsletter #16

(if this newsletter was forwarded to you then you can subscribe here: https://insightextractor.com/)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles/videos made the cut for today’s newsletter.

(1) Data & AI landscape 2020

A really good review of the 2020 data & AI landscape. Look at all those logos, each representing a company tackling a data or AI challenge. It’s an exciting time to be in data! Read here

(2) Self-Service Analytics

Tooling is the easy part; the hard part is the follow-up steps needed to truly build a culture that is independently data-driven. Read here

(3) What is the difference between data pipeline and ETL?

Really good back-to-basics video on the difference between a data pipeline and ETL.

(4) Delivering High Quality Analytics at Netflix

I loved this video! It talks about how to ensure data quality throughout your data stack.

(5) Introduction of data lakes and analytics on AWS

I have another great YouTube video for you. This one introduces you to various AWS tools for data and analytics.

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

All things Data Newsletter #15 (#dataengineering #datascience #data #analytics)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles made the cut for today’s newsletter.

(1) Scaling data

Fantastic article by Crystal Widjaja on scaling data. It shares a really good framework for building analytics maturity and how to think about building capabilities to navigate each stage. Must read! Here

Image Source: reforge

(2) Building startup’s data infrastructure in 1-Hour

Good video that touches on multiple tools. Watch here: https://www.youtube.com/watch?v=WOSrRTaNIm0 (it’s a little outdated since it was shared in 2019, two years ago, but the architecture is still helpful)

(3) Analytics lessons learned

If you haven’t read Lean Analytics, I recommend it! After that, you should read this free companion, which covers 12 good analytics case studies. Read here

(4) Organizing data teams

How do you organize data teams? Completely centralized under a data leader? Or decentralized, reporting into the leaders of business functions? Some good thoughts here

(5) The metrics layer is a missing piece in the modern data stack

This is a good article that encourages you to think about adding a metrics layer to your data stack. In the last newsletter, I also shared an article about Airbnb’s Minerva metrics layer, and this article does a good job of providing additional reasons to build something similar. Read here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

All things Data Newsletter #14 (#dataengineering #datascience #data #analytics)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following 5 articles made the cut for today’s newsletter.

(1) Analytics is a mess

Fantastic article highlighting the importance of the “creative” side of analytics. It’s not always structured, and that is also what makes it fun. Read here

(2) Achieving metric consistency at Scale — Airbnb Data

This is a great case study shared by Airbnb’s data team on how they achieved metric consistency at scale. Read here

(3) Achieving metric consistency & standardization — Uber Data

Another great read on metrics standardization, this time at Uber. As you may notice, it’s a recurring problem at different organizations once they hit a certain growth threshold. In the initial growth stage, there’s a lot of focus on enabling folks to look at metrics in a manner that’s optimized for speed. After a certain stage, this needs to be balanced with consistency: teams may have gone in different directions and defined the same thing in different ways, and that doesn’t scale. This is where metric consistency and standardization come in. It’s a problem worth solving, and if you are interested, please read this article here

(4) Where is data engineering going in 5 years?

A good short post by Zach Wilson on LinkedIn talking about where data engineering is going over the next few years. Not surprised to see Data privacy in there! Read others here

(5) 3 foundations of a successful data science team

An Amazon leader (Elvis Dieguez) talks about the 3 foundational pillars of a successful data team: 1) a data warehouse, 2) automated data pipelines, and 3) a self-service analytics tool. Read here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

Business Analytics Continuum: Descriptive, Diagnostic, Predictive, Prescriptive

Think of a “continuum” as something you start and never stop improving upon. In my mind, the Business Analytics Continuum is a continuous investment of resources to take business analytics capabilities to the next level. So what are these levels? As the title says: descriptive, diagnostic, predictive, and prescriptive.

Here is a visual representation of the concept:

business analytics continuum

Five Tenets for effective data visualization

A tenet is a principle honored by a group of people. As a reader of this blog, you work with data, and data visualization is an important element of your day-to-day work. So, to help you build effective data visualizations, I created the tenets below, which are simple to follow. This work is based on multiple sources, which I reference below.

Five Tenets for effective data visualization:

  1. We will strive to understand customer needs
  2. We will tell the truth
  3. We will bias for simplicity
  4. We will pick the right chart
  5. We will select colors strategically

Examples for each tenet are listed below:

  • We will strive to understand customer needs

Defining and knowing your audience is very important before diving into the other tenets. Doing this will increase your probability of delivering an effective data visualization.

h/t to Mike Rubin for suggesting this over on LinkedIn here

  • We will tell the truth

We won’t be dishonest with data. See an example below where Fox News deliberately started the bar chart y-axis at a non-zero number to make the delta look much larger than it actually is.

Source: Link
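To see why a non-zero baseline misleads, compare the ratio of the underlying values with the ratio of the bar heights as drawn. The sketch below uses made-up numbers (not the figures from the Fox News chart), and `bar_height_ratio` is a hypothetical helper written for this illustration.

```python
def bar_height_ratio(smaller: float, larger: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn bar heights when the y-axis starts at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical values: the second bar is 10% larger than the first.
before, after = 100, 110

honest = bar_height_ratio(before, after)          # y-axis starts at zero
truncated = bar_height_ratio(before, after, 90)   # y-axis starts at 90

print(f"zero baseline:  second bar is {honest:.2f}x as tall")
print(f"baseline at 90: second bar is {truncated:.2f}x as tall")
```

With these numbers, a 10% difference in the data is drawn as a bar twice as tall once the axis starts at 90.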

  • We will bias for simplicity

3-D charts increase complexity for the end-users. So we won’t use something like this and instead opt for simplicity.

  • We will pick the right chart

I have linked some resources here

  • We will select colors strategically

Source here

Conclusion:

In this post, I shared five tenets that will help you build effective data visualizations.

5 stages of Analytical Competition

I love mental models and frameworks. I have already shared some frameworks on this blog, like the 3 W’s (What, Why, What’s next) and the 3 P’s (Platform, People, Process), focused on helping analytics leaders figure out what their analytics roadmap should be. I was reading the book ‘Competing on Analytics’ and came across the 5 stages of analytical competition, which seemed like another framework worth sharing.

The two ends of the spectrum are an org that is flying blind and an org that is competing through analytics. The stages are:

  1. Analytically impaired
  2. Localized Analytics
  3. Analytical aspirations
  4. Analytical companies
  5. Analytical competitor

You can read about each one of these here: Five Stages of Analytic Competition, and you can read a synopsis of the book here.

Great example of storytelling through data:

End of the beginning by Benedict Evans.

Two great posts on DAU/MAU and Measuring Power Users

Two great posts from Andrew Chen. Links below:

These posts were perfectly timed for me, as we started thinking about annual planning for the Alexa Voice Shopping org (Amazon) this week. As part of my research into which metrics to use to measure what our business cares most about, and how to set the right benchmarks/goals for the org, the posts below were super helpful. So if you are in tech and you care about 1) measuring frequency of usage and 2) measuring the most engaged cohort, then you should take some time to read these posts.

Power user curve 

DAU/MAU is an important metric to measure engagement, but here’s where it fails
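If you want to try both ideas on your own data, here is a minimal sketch. It assumes activity is available as `(user_id, date)` pairs; the function names, the sample events, and the 4-day window (kept short for readability, a real calculation would use a full month) are all assumptions for illustration, not taken from the posts. DAU/MAU is computed here as average daily active users divided by distinct active users in the period, and the power user curve as a histogram of users by number of active days.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

Event = Tuple[str, date]  # (user_id, day the user was active)


def dau_mau(events: Iterable[Event], days_in_period: int) -> float:
    """Average daily active users divided by distinct users in the period."""
    users_per_day = defaultdict(set)
    all_users = set()
    for user_id, day in events:
        users_per_day[day].add(user_id)
        all_users.add(user_id)
    if not all_users:
        return 0.0
    avg_dau = sum(len(users) for users in users_per_day.values()) / days_in_period
    return avg_dau / len(all_users)


def power_user_curve(events: Iterable[Event]) -> Dict[int, int]:
    """Map 'number of active days' -> 'number of users with that many days'."""
    days_per_user = defaultdict(set)
    for user_id, day in events:
        days_per_user[user_id].add(day)
    histogram = Counter(len(days) for days in days_per_user.values())
    return dict(sorted(histogram.items()))


# Made-up events over a hypothetical 4-day window.
events = [
    ("a", date(2021, 1, 1)), ("a", date(2021, 1, 1)),  # duplicate, counted once
    ("a", date(2021, 1, 2)), ("a", date(2021, 1, 3)), ("a", date(2021, 1, 4)),
    ("b", date(2021, 1, 1)),
    ("c", date(2021, 1, 1)), ("c", date(2021, 1, 2)),
]

ratio = dau_mau(events, days_in_period=4)
curve = power_user_curve(events)

print(f"DAU/MAU: {ratio:.2f}")       # DAU/MAU: 0.58
print(f"Power user curve: {curve}")  # Power user curve: {1: 1, 2: 1, 4: 1}
```

The single ratio hides the shape of the distribution, while the histogram shows whether engagement comes from a few every-day users or is spread evenly.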

Cheers!

Framework to evaluate great companies from “Good to Great” book by Jim Collins:

I like frameworks: they help structure your thoughts. One of the most basic questions I ask when looking at a company/org is how to evaluate whether it’s good or great. And more importantly, how do you help drive it to greatness? There’s a list of things that I could rattle off, but it was not complete and I didn’t really have a structure. That is where the book “Good to Great” by Jim Collins comes into the picture. It’s a great book that shares a “framework of ideas” for steering a company from good to great through six key learnings wrapped in a continual process he calls the “flywheel”:

good-to-great-diagram

I encourage you to read the book if you can. But if you don’t have time, here’s a good overview:

https://www.youtube.com/watch?v=Yk7bzZjOXaM

Springboard Data Analytics for Business Office Hours

I was invited to lead the office hours for Springboard’s Data Analytics for Business course and I wanted to share the recording with you all:

CLICK HERE

I answer the following questions during the office hours:

  • What tools have I used in my career for Data Analytics & Data Science?
  • What different kinds of analysis/modeling do you do?
  • What are the biggest challenges that I found when I got into this industry?
  • Being data-driven is not binary; it’s a scale. How do you analyze a company’s current level, and how do you make it more data-driven?
  • What is the challenge for newcomers in this industry? And what changes are coming in the next few years?
  • Which tools are widely used today? Which industry uses which tools heavily?
  • How do you verify “what’s next”? How do you verify that your forecast is good enough?

Related Post: $100 Discount Code For Springboard