Data Engineering and Data Science Newsletter #4

The purpose of this Insight Extractor’s newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following articles made the cut for today’s newsletter.

1. What does a Business Intelligence Engineer (BIE) do at Amazon?

Have you wondered what analytics professionals at top tech companies work on? Are you job hunting and wondering which data roles (data engineer, data scientist, or BI engineer) at Amazon are a great fit for your profile? If so, read Jamie Zhang’s (Sr Business Intelligence Engineer at Amazon) post here

2. What are the 2 Data & Analytics Maturity models that you should absolutely know about?

If you have read my blog, you know that I am a fan of mental models. So, here are 2 mental models (frameworks) shared by Greg Coquillo that are worth reading/digesting here

3. Using Machine Learning to Predict Value of Homes On Airbnb

Really good case study by Airbnb Data scientist Robert Chang here

4. How does Netflix measure product success?

Really good post on how to define metrics to prove or disprove your hypotheses and measure progress in a quick and simple manner. To do this, the author, Gibson Biddle, shares a mechanism of proxy metrics and it’s a really good approach. You can read the post here

Once you read the post above, I also suggest learning about leading vs lagging indicators. It’s a similar approach and something that all data teams should strive to build for their customers.

5. Leading vs lagging indicators

Kieran Flanagan and Brian Balfour talk about why your north star metric should be a leading indicator and, if it’s not, how to think about it. Read about it here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

Data Maturity Mental Model Screenshot:

Source

Insight Extractor’s Data Engineering and Science Newsletter #2

The purpose of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles from various sources that I found interesting. The following articles made the cut for today’s newsletter:

  1. Amazing data storytelling example from Ben Evans. Ben starts from a basic premise around “Amazon is not profitable” that a lot of people argue about. He then goes on a data storytelling journey with publicly available data-sets around his chosen premise. Must read! here
  2. What kind of data scientist am I? Elena Grewal from Airbnb wrote this excellent article in 2018, but it’s still relevant for understanding the 3 different flavors of data scientist. Read here
  3. What does it mean to be a data science leader or manager? Eric Weber’s short post on LinkedIn covers what it means to be a leader. ICs should exhibit these traits for faster career growth, especially if you are the sole data person in a decentralized structure. Read here
  4. Functional data engineering: In the blog post here, Maxime Beauchemin explains how to apply functional programming concepts to data engineering.
  5. Interested in growth analytics? Think about this interview question from Andrew Chen: How would you 10x the growth of Product X? LinkedIn post here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

3 Types of Data Scientist (Source)

Data Culture Mental Model.

What is Data Culture?

First, let’s define culture: “The set of shared values, goals, and practices that characterizes a group of people” Source

Now, building on that to define data culture: what is the set of shared values? That decisions will be made based on insights generated through data. And the group of people represents all decision makers in the organization. So, in other words:

An org that has a great data culture will have a group of decision makers that uses data & insights to make decisions.

Why is building data culture important?

There are two ways to make decisions: one that uses data and one that doesn’t. My hypothesis is that decisions made through data are less wrong. To make this happen in your org, you need to have a plan. In the sections below, I’ll share the key ingredients and a mental model to build a data culture.

What are the ingredients for a successful data culture?

It’s the 3 P’s: Platform, Process and People. You improve data culture by continuously iterating on and improving each of the P’s.

How to build data culture?

Here’s a mental model for a leader within an org:

  1. Understand data needs and prioritize
  2. Hire the right people
  3. Set team goals and define success
  4. Build something that people use
  5. Iterate on the data product and make it better
  6. Launch and communicate broadly
  7. Provide Training & Support
  8. Celebrate wins and communicate progress against goals
  9. Continue to build and identify the next set of data needs

Disclaimer: The opinions are my own and don’t represent my employer’s view.

5 stages of Analytical Competition

I love mental models and frameworks. I have shared some frameworks on this blog already, like the 3 W’s (What, Why, What’s next) and the 3 P’s (Platform, People, Process), focused on helping analytics leaders figure out what their analytics roadmap should be. I was reading the book ‘Competing on Analytics’ and came across the 5 stages of analytical competition, which seemed like another framework worth sharing.

The two ends of the spectrum are an org that is flying blind and an org that is competing through analytics. The stages are:

  1. Analytically impaired
  2. Localized Analytics
  3. Analytical aspirations
  4. Analytical companies
  5. Analytical competitor

You can read about each one of these here: Five Stages of Analytic Competition, and you can read a synopsis of the book here.

How Analytics changed Scouting in Soccer

An interesting video that’s a great reminder of how analytics is a game-changer when applied correctly. The video above shows how small clubs use analytics to compete with big clubs and continue to not only stay relevant but grow in the process.

A similar analogy can be drawn for startups (or early-to-mid stage products inside big companies), which can use analytics to compete with incumbents in the market.

Let me know what you think. What’s your favorite analogy to help explain why analytics is useful to your org?

Great example of storytelling through data:

End of the beginning by Benedict Evans.

Two great posts on DAU/MAU and Measuring Power Users

Two great posts from Andrew Chen. Links below:

These posts were perfectly timed for me as we started thinking about Annual Planning for the Alexa Voice Shopping org (Amazon) this week. As part of my research into which metrics to use to measure the things that our business cares most about, and then setting the right benchmarks/goals for the org, the posts below were super helpful. So if you are in tech and you care about 1) measuring frequency of usage and 2) measuring the most engaged cohort, then you should take some time to read these posts.

Power user curve 

DAU/MAU is an important metric to measure engagement, but here’s where it fails
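
To make the two metrics concrete, here is a minimal sketch of how they could be computed from a raw activity log. It is not taken from the posts above. It assumes the log is a list of hypothetical (user_id, date) pairs, that DAU/MAU is defined as the average daily active users over a calendar month divided by that month's active users (other definitions, such as a trailing 30-day window, are also common), and that the power user curve is the share of monthly active users by number of active days. The function names and the sample data are made up for illustration.

```python
import calendar
from collections import Counter, defaultdict
from datetime import date


def dau_mau(events, year, month):
    """Average daily active users divided by monthly active users."""
    users_by_day = defaultdict(set)
    for user_id, day in events:
        if day.year == year and day.month == month:
            users_by_day[day].add(user_id)
    if not users_by_day:
        return 0.0
    monthly_users = set().union(*users_by_day.values())
    days_in_month = calendar.monthrange(year, month)[1]
    # Days with no activity count as zero active users in the average.
    avg_dau = sum(len(users) for users in users_by_day.values()) / days_in_month
    return avg_dau / len(monthly_users)


def power_user_curve(events, year, month):
    """Share of monthly active users by number of distinct active days."""
    days_by_user = defaultdict(set)
    for user_id, day in events:
        if day.year == year and day.month == month:
            days_by_user[user_id].add(day)
    total_users = len(days_by_user)
    if total_users == 0:
        return {}
    counts = Counter(len(days) for days in days_by_user.values())
    days_in_month = calendar.monthrange(year, month)[1]
    return {d: counts.get(d, 0) / total_users for d in range(1, days_in_month + 1)}


# Hypothetical sample log: one daily user, one 3-day user, one 1-day user.
events = (
    [("a", date(2024, 6, d)) for d in range(1, 31)]
    + [("b", date(2024, 6, d)) for d in (1, 2, 3)]
    + [("c", date(2024, 6, 1))]
)

ratio = dau_mau(events, 2024, 6)
curve = power_user_curve(events, 2024, 6)
print(f"DAU/MAU: {ratio:.2f}")
print({d: round(share, 2) for d, share in curve.items() if share > 0})
```

The single DAU/MAU number blends the daily user with the two occasional users, while the curve shows them as separate groups, which is the reason to look at both.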

Cheers!

Springboard Data Analytics for Business Office Hours

I was invited to lead the office hours for the Springboard’s Data Analytics for Business course and I wanted to share the recording with you all:

CLICK HERE

I answer the following questions during the office hours:

  • What tools have I used in my career for Data Analytics & Data Science?
  • What are the different kinds of analysis/modeling that I do?
  • What are the biggest challenges that I found when I got into this industry?
  • Being data-driven is not binary; it’s a scale. How do you analyze a company’s current level, and how do you make it more data-driven?
  • What is the challenge for newcomers in this industry? And what changes are coming in the next few years?
  • Which tools are widely used today? Which industry uses which tools heavily?
  • How do you verify “what’s next”? How do you verify that your forecast is good enough?

Related Post: $100 Discount Code For Springboard

News: PASS outstanding Volunteer award & stepping down as Business Analytics Virtual Group Co-leader

I am honored to get the PASS outstanding volunteer award again for June 2017! It’s been so much fun helping grow the chapter from 1K to 10K members within the last 4 years. The PASS HQ Team & Dan English (Group Lead) were great to work with, and there’s so much more growth left for the next few years! The Group was recently classified as a “tier-1” group and got new sponsors, which means that the group has some funding to pursue paid growth opportunities that weren’t accessible before.

Outstanding Volunteer Award PASS

URL: http://www.pass.org/Community/GetInvolved/Volunteers/OutstandingVolunteers.aspx

Since the group has the perfect platform to continue growing and we have a really good process in place to keep our growth flywheel running, I figured it’s a great time to step down. Over the past few years, my career moved me from Business Intelligence -> Analytics -> Data Science and, along with that, I have slowly moved away from Microsoft-centric architectures too. I started out working for a Microsoft Gold Partner, then worked for an open-source heavy shop at a startup-mode organization in Silicon Valley, and now I work in an organization that uses a little bit of everything, something like the best of both worlds. So there’s a much bigger gap now between where my career is taking me and the mission of the business analytics virtual group. They don’t perfectly align anymore, and even though it’s a very rewarding experience, after some reflection, I figured the group deserves a leader whose mission aligns better than mine does.

Thank you PASS for the opportunity!

There’s also an open position for new volunteers on the Virtual group, so if you’d like to be involved, reach out to Dan English through the group’s website: http://bavc.pass.org/

What is the difference between courses offered by Springboard vs datacamp vs dataquest? Which is better?

I am a DataCamp subscriber, a mentor with Springboard, and have completed the free content on DataQuest, so I am familiar with all three products in some way.

You need two things to have a successful career:

  1. Strong Foundation
  2. Continuous learning

Let’s talk about Continuous Learning first:

In a field that’s as dynamic as data science, you should always be learning! It could be through your projects at your work, side-projects or online resources.

I would categorize both DataCamp and DataQuest under this; they are great platforms for continuous learning. I am a subscriber on DataCamp and it’s a great platform to just dive in, do some hands-on exercises and learn something new. I love it! I have heard equally positive things about DataQuest, so if you are already working in the industry as a Data Scientist and just want to get deeper technically, then go for these platforms!

Now let’s talk about Strong Foundation:

You need a strong foundation to get hired as a Data Scientist. You would typically do that by having a relevant college degree. But:

  1. A lot of people don’t have relevant college degrees OR
  2. They graduated a few years back and are looking to do a career transition now OR
  3. They are not willing to go back to do multi-year college programs focused on data science

If that’s the case, then there’s a new approach in the market: “boot camps”. You still need some foundational skills (for example, math, programming, and statistics) to be eligible for data science boot camps, and if you have those basic skills, then you can go through one. There’s a bunch of them out there. Just search for “data science boot camps”. Springboard is one of them, and I have heard nothing but positive things about them, just like I have about DataCamp & DataQuest. I have personally mentored 6 students so far; all of them were looking for a career transition and had nothing but positive things to say! That’s just my anecdotal data though, so you should do a trial w/ them and/or check out their job guarantee through their career track if that is important to you.

Either way, it’s a “Bootcamp” offering, so it has regular mentor calls/check-ins, projects, career coaches, and non-technical material like resume tips to give you a structured approach to everything that you might need to get hired as a data scientist. You can expect intense guided learning over a short period of time. The Bootcamp approach is different from the self-learning, self-paced approach of DataCamp & DataQuest.

PS: I mentor for Springboard and I have a $750 OFF discount code to share if you decide to enroll. Please contact me to get the code through: Let’s Connect! – Insight Extractor – Blog (If you prefer to not use the referral link, just search for “springboard” and sign up there)

The World is not binary!

I am not saying that you can’t break into Data Science with just DataCamp and DataQuest. You would need to complement them w/ other resources and put in more effort to cover everything that you may need. With enough motivation, it could be done for sure! How fast you want to break into data science and how much time you can invest in figuring out the right resources are two of the biggest factors in determining whether you need to go through a Bootcamp.

Conclusion:

If you are already working as a data scientist, DataCamp and DataQuest are great for continuous learning! If you are new to this and don’t have a relevant educational background, then boot camps like Springboard are a great choice.

Hope that helps!

VIEW QUESTION ON QUORA