Data Engineering and Data Science Newsletter #4


The purpose of this Insight Extractor’s newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following articles made the cut for today’s newsletter.

1. What does a Business Intelligence Engineer (BIE) do at Amazon?

Have you wondered what analytics professionals at top tech companies work on? Are you job hunting and wondering which data roles (data engineer, data scientist, or BI engineer) at Amazon are a great fit for your profile? If so, read the post by Jamie Zhang (Sr. Business Intelligence Engineer at Amazon) here

2. What are the 2 Data & Analytics Maturity models that you should absolutely know about?

If you have read my blog, you know that I am a fan of mental models. So, here are 2 mental models (frameworks) shared by Greg Coquillo that are worth reading/digesting here

3. Using Machine Learning to Predict Value of Homes On Airbnb

Really good case study by Airbnb data scientist Robert Chang here

4. How does Netflix measure product success?

Really good post on how to define metrics to prove or disprove your hypotheses and measure progress in a quick and simple manner. To do this, the author, Gibson Biddle, shares a mechanism of proxy metrics and it’s a really good approach. You can read the post here

Once you read the post above, I also suggest learning about leading vs lagging indicators. It’s a similar approach and something that all data teams should strive to build for their customers.

5. Leading vs lagging indicators

Kieran Flanagan and Brian Balfour talk about why your north star metric should be a leading indicator, and how to think about it if it’s not. Read about it here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

Data Maturity Mental Model Screenshot:

Source

Insight Extractor’s Data Engineering and Science Newsletter #3


The purpose of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following articles made the cut for today’s newsletter:

1. What I love about Scrum for Data Science:

I love the Scrum mechanism for all data roles: data engineering, data analytics and data science. The author (Eugene) shares his perspective based on his experiences. I love the quote below from the blog, and you can read the full post here

Better to move in the right direction, albeit slower, than fast on the wrong path.

Source

2. Building Analytics at 500px:

One of the best articles on the end-to-end analytics journey at a startup, by Samson Hu. Must read! Go here (Note that analytics architectures have changed since this post was published in 2015, so read it for the mental model rather than the exact tech tools mentioned in the post.)

3. GO-FAST: The Data Behind Ramadan:

A great example of data storytelling from Go-Jek BI team lead Crystal Widjaja. Read here

4. Why Robinhood uses Airflow:

Airflow is a popular data engineering tool, and this post provides really good context on its benefits and how it stacks up against other tools. Read here

5. Are dashboards dead?

Every new presentation-layer format in the data field can lead experts to question the value of dashboards. With the rise of Jupyter notebooks, most vendors have now added “notebooks” functionality, and with that comes the follow-up question: are dashboards dead? Here’s one such article. Read here

I am still not personally convinced that dashboards are “dead”; rather, they should complement the other presentation formats that are out there. The post does make good points against dashboards (e.g., data is going portrait mode), and you should be aware of those to ensure that you are picking the right format for your customers. The author is also biased since they work for a data vendor that is betting big on notebooks, so you might want to account for that bias while reading this. Also, I had written about “Are dashboards dead?” in the context of chat-bots in 2016, and that hypothesis turned out to be true; you can read that here

Data is going portrait mode! Source

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

Insight Extractor’s Data Engineering and Science Newsletter #2


The purpose of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following articles made the cut for today’s newsletter:

  1. Amazing data storytelling example from Ben Evans. Ben starts from a basic premise, “Amazon is not profitable”, that a lot of people argue about. He then goes on a data storytelling journey around his chosen premise using publicly available data-sets. Must read! Read here
  2. What kind of data scientist am I? Elena Grewal from Airbnb wrote this excellent article in 2018, but it’s still relevant for understanding the 3 different flavors of data scientist. Read here
  3. What does it mean to be a data science leader or manager? Eric Weber’s short post on LinkedIn covers what it means to be a leader. ICs should exhibit these traits for faster career growth, especially if you are the sole data person in a decentralized structure. Read here
  4. Functional data engineering: In the blog post here, Maxime Beauchemin explains how to apply functional programming concepts to data engineering.
  5. Interested in growth analytics? Think about this interview question from Andrew Chen: How would you 10x the growth of Product X? LinkedIn post here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

3 Types of Data Scientist (Source)