All things data newsletter #16

(if this newsletter was forwarded to you then you can subscribe here: https://insightextractor.com/)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles across various sources that I found interesting. The following 5 articles/videos made the cut for today’s newsletter.

(1) Data & AI landscape 2020

A really good review of the data & AI landscape in 2020. Look at those logos that represent a bunch of companies tackling various data and AI challenges. It’s an exciting time to be in data! Read here

(Image: 2020 Data and AI Landscape)

(2) Self-Service Analytics

Tooling is the easy part; it’s the follow-up steps that are needed to truly achieve a culture that is independently data-driven. Read here

(3) What is the difference between data pipeline and ETL?

Really good back-to-basics video on the difference between a data pipeline and ETL.

(4) Delivering High Quality Analytics at Netflix

I loved this video! It talks about how to ensure data quality throughout your data stack.

(5) Introduction of data lakes and analytics on AWS

I have another great YouTube video for you. This one introduces you to various AWS tools for data and analytics.

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

All things Data Newsletter #15 (#dataengineering #datascience #data #analytics)

(if this newsletter was forwarded to you then you can subscribe here: https://insightextractor.com/)

The goal of this newsletter is to promote continuous learning for data science and engineering professionals. To achieve this goal, I’ll be sharing articles across various sources that I found interesting. The following 5 articles made the cut for today’s newsletter.

(1) Scaling data

Fantastic article by Crystal Widjaja on scaling data. It shares a really good framework for building analytics maturity and how to think about building capabilities to navigate each stage. Must read! Here

(Image: three stages. Source: Reforge)

(2) Building startup’s data infrastructure in 1-Hour

Good video that touches on multiple tools. Watch here: https://www.youtube.com/watch?v=WOSrRTaNIm0 (it’s a little outdated since it was shared in 2019, two years ago, but the architecture is still helpful)

(3) Analytics lessons learned

If you haven’t read Lean Analytics, I recommend it! After that, you should read this free companion, which covers 12 good analytics case studies. Read here

(4) Organizing data teams

How do you organize data teams? Completely centralized under a data leader? Or do you structure them in a decentralized way, reporting into leaders of business functions? Some good thoughts here

(5) Metrics layer is a missing piece in modern data stack

This is a good article that encourages you to think about adding a metrics layer to your data stack. In the last newsletter, I also shared an article that talks about Airbnb’s Minerva metrics layer, and this article does a good job of providing additional reasons to build something similar. Read here

Thanks for reading! Now it’s your turn: Which article did you love the most and why?

Springboard Data Analytics for Business Office Hours

I was invited to lead the office hours for Springboard’s Data Analytics for Business course and I wanted to share the recording with you all:

CLICK HERE

I answered the following questions during the office hours:

  • What tools have I used in my career for Data Analytics & Data Science?
  • What are the different analysis/modeling that you do?
  • What are the biggest challenges that I found when I got in this Industry?
  • Being data-driven is not binary; it’s a scale. How do you analyze a company’s current level, and how do you make a company more data-driven?
  • What is the challenge for newcomers in this industry? And what are the changes coming in the next few years?
  • Which tools are widely used today? Which industry uses which tools heavily?
  • How do you verify “what’s next”? How do you verify that your forecast is good enough?

Related Post: $100 Discount Code For Springboard

What is the difference between courses offered by Springboard vs datacamp vs dataquest? Which is better?

I am a DataCamp subscriber, a mentor w/ Springboard, and have completed the free content on DataQuest, so I am familiar w/ all three products in some way.

You need two things to have a successful career:

  1. Strong Foundation
  2. Continuous learning

Let’s talk about Continuous Learning first:

In a field that’s as dynamic as data science, you should always be learning! It could be through your projects at your work, side-projects or online resources.

I would categorize both DataCamp and DataQuest under this; they are great platforms for continuous learning. I am a subscriber on DataCamp and it’s a great platform to just dive in, do some hands-on exercises and learn something new. I love it! I have heard equally positive things about DataQuest, so if you are already working in the industry as a Data Scientist and just want to get deeper technically, then go for these platforms!

Now let’s talk about Strong Foundation:

You need a strong foundation to get hired as a Data Scientist. You would typically get that by having a relevant college degree. But:

  1. A lot of people don’t have relevant college degrees OR
  2. They graduated a few years back and are looking to do a career transition now OR
  3. They are not willing to go back to do multi-year college programs focused on data science

If that’s the case, then there’s a newer approach in the market where you attend “boot camps”. You still need some foundational skills (for example, math, programming and statistics) to be eligible for data science boot camps, and if you have those basic skills then you can go through them. There’s a bunch of them out there; just search for “data science boot camps”. Springboard is one of them and I have heard nothing but positive things about them, just like I have about DataCamp & DataQuest.

I have personally mentored 6 students so far; all of them were looking for a career transition and had nothing but positive things to say! That’s just my own limited data though, so you should do a trial w/ them and/or check out their job guarantee through their career track if that is important to you.

Either way, it’s a “Bootcamp” offering, so it has regular mentor calls/check-ins, projects, career coaches, and non-technical material like resume tips to give you a structured approach to everything that you might need to get hired as a data scientist. You can expect intense guided learning over a short period of time. The Bootcamp approach is different from the self-learning, self-paced approach of DataCamp & DataQuest.

PS: I mentor for Springboard and I have a $750 OFF discount code to share if you decide to enroll. Please contact me to get the code: Let’s Connect! – Insight Extractor – Blog (If you prefer not to use the referral link, just search for “springboard” and sign up there)

The World is not binary!

I am not saying that you can’t break into Data Science with just DataCamp and DataQuest. You would need to complement them w/ other resources and put in more effort to cover everything that you may need. With enough motivation, it could be done for sure! How fast you want to break into data science and how much time you can invest in figuring out the right resources are two of the biggest factors in determining whether you need to go through a Bootcamp.

Conclusion:

If you are already working as a data scientist, DataCamp and DataQuest are great for continuous learning! If you are new to this and don’t have a relevant education background then boot camps like Springboard are a great choice.

Hope that helps!

VIEW QUESTION ON QUORA

What are the must-know software skills for a career in data analytics after an MBA?

SQL, Excel & Tableau-like tools are good enough to start. Then add something like R eventually. And then there are tools that are specific to the industry – example: Google Analytics for the tech industry.

Other than that, you should know what to do with these tools. You need to know the following concepts and continuously build upon them as the industry use-cases and needs evolve:

  1. Spreadsheet modeling
  2. Forecasting
  3. Customer Segmentation
  4. Root cause Analysis
  5. Data Visualization and Dashboarding
  6. Customer Lifetime value
  7. A/B testing
  8. Web Analytics

VIEW QUESTION ON QUORA

Is it too late to become a good Data Scientist?

If you’re looking for a career change, that’s never too late!

If you’re looking to learn something new, that’s never too late!

If you’re looking to continue learning and go deeper in data science, that’s never too late!

If you don’t like Software engineering and want to switch to something else, that’s never too late!

But if you are after the “Data Science” gold rush, then you did miss the first wave! You are late.

But seriously, you should apply first-principles thinking to your career strategy and ideally not jump to whatever’s “hot” because by the time you get on that train, it’s usually too late.

VIEW QUESTION ON QUORA

Data analytics vs. Data science vs. Business intelligence: what are the key differences/distinctions?

They are used interchangeably since all of them involve working with data to find actionable insights. But I like to differentiate them based on the type of the question you’re asking:

  • What:

What are my sales numbers for this quarter?

What is the profit for this year to date?

What are my sales numbers over the past 6 months?

What did the sales look like in the same quarter last year?

All of these questions are used to report on facts, and tools that help you build data models and reports can be classified as “Business Intelligence” tools.

  • Why:

Why is my sales number higher for this quarter compared to last quarter?

Why are we seeing an increase in sales over the past 6 months?

Why are we seeing a decrease in profit over the past 6 months?

Why is the profit this quarter lower compared to the same quarter last year?

All of these questions try to figure out why something happened. A data analyst typically takes a stab at this. He/she might use an existing Business Intelligence platform to pull data and/or merge other data sets. He/she then applies data analysis techniques to the data to answer the “why” question and help the business user get to the actionable insight.

  • What’s next:

What will be my sales forecast for next year?

What will be our profit next year for Scenario A, B & C?

Which customers will cancel/churn next quarter?

Which new customers will convert to a high-value customer?

All of these questions try to “predict” what will happen next (based on historical data/patterns). Sometimes, you don’t know the questions in the first place, so there’s a lot of pro-active thinking going on, and usually a “data scientist” is doing that. Sometimes you start with a high-level business problem and form “hypotheses” to drive your analysis. All of these can be classified under “data science”.

Now, as you can see, as we progressed from What -> Why -> What’s next, the level of sophistication needed to do the analysis also increased. So you need a combination of people, process and technology platform in an organization to go from having Business Intelligence maturity all the way to achieving data science capabilities.
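To make the What -> Why -> What’s next progression a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `sales` numbers are made up, the function names are hypothetical, and the “forecast” is just the last total plus the average quarter-over-quarter change, a crude stand-in for the kind of model a data scientist would actually build.

```python
from statistics import mean

# Hypothetical quarterly sales by segment (made-up numbers, for illustration only).
sales = {
    "2023-Q1": {"online": 100, "retail": 80},
    "2023-Q2": {"online": 110, "retail": 82},
    "2023-Q3": {"online": 125, "retail": 81},
    "2023-Q4": {"online": 140, "retail": 83},
}


def total(quarter):
    """'What': report a fact, e.g. total sales for a quarter."""
    return sum(sales[quarter].values())


def change_by_segment(prev, curr):
    """'Why': break the change between two quarters down by segment."""
    return {seg: sales[curr][seg] - sales[prev][seg] for seg in sales[curr]}


def naive_forecast():
    """'What's next': last total plus the average quarter-over-quarter change.

    This is a deliberately simple assumption, not a real forecasting model.
    """
    totals = [total(q) for q in sorted(sales)]
    steps = [later - earlier for earlier, later in zip(totals, totals[1:])]
    return totals[-1] + mean(steps)


if __name__ == "__main__":
    print("What:", total("2023-Q4"))
    print("Why:", change_by_segment("2023-Q3", "2023-Q4"))
    print("What's next:", round(naive_forecast(), 1))
```

Notice how each step needs a little more than the one before it: the “what” is a simple sum, the “why” needs the data broken down by a dimension, and the “what’s next” needs history plus an assumption about how the future relates to it.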

Here’s a related blog post that I wrote on this a while back: Business Analytics Continuum: – Insight Extractor – Blog

Data Science

And you can check out other stuff I write about here: Insight Extractor – Blog – Paras Doshi’s Blog on Analytics, Data Science & Business Intelligence.

VIEW QUESTION ON QUORA

Cheat Sheet to Pick the right graph or chart for your data:

I have two resources that I use sometimes to pick the right graph or chart for data visualization.

#1: Chart Suggestions:

chart data

#2: Online Tool

(By Juice Labs)

chart pick choose online tool

What are some of the most important resources a Data analyst needs to know about?

This question was asked on Quora and here’s my answer:

I will list resources broken down by three categories.

  1. Business Knowledge: As a data analyst, you need to have at least basic knowledge of the business areas that you are helping with. For example: if you are doing Marketing Analytics then you need to understand basic concepts in marketing, and that will make you more effective. You can do so in one of three ways:
    • On-the-job: Pick up knowledge by interacting with business people and using internal knowledge bases.
    • Online resources: Pick up the basics of marketing by taking a beginner’s course online on a platform like Coursera OR from resources like this: Business Concepts – Bootcamp | PrepLounge.com
    • College/University: If you are at a college/university then you can either audit a course or, depending on your major/minor, core business courses might just be part of the curriculum.
  2. Communication skills:
    • Public Speaking: Toastmasters is a great resource. If you don’t have access to a local Toastmasters club, you should be able to find a course online. Check out Coursera.
    • Data Storytelling: Just listening to someone like Hans Rosling can be very inspiring! The best stats you’ve ever seen. Also, if you search storytelling with data on YouTube, you will see a few good talks: storytelling with data – YouTube
    • Problem structuring: If you are able to break down the problem into core components to identify the root cause, you will not only increase your speed to insight but your structure will also help you communicate it more effectively. Learn to break down your problems and use that in communicating your data analysis approach. Imagine this list without the three high-level categories: wouldn’t it look like I am throwing random resources at you? By giving it a structure (Tech, Biz, Communication), I am not only able to organize it but also communicate it to you more effectively. More here: Structure your Thoughts – Bootcamp | PrepLounge.com
  3. Tech skills: Read Akash Dugam’s answer: Akash Dugam’s answer to What are some of the most important resources a Data analyst needs to know about? — it’s a nice list. Also, check this out: Learn #Data Analysis online – free curriculum

A great data analyst will focus on all areas and a good data analyst might just focus on tech. Hope that helps!

VIEW QUESTION ON QUORA

Book Giveaway: Head First Data Analysis — Ends 07/22/2016

<< THIS GIVEAWAY IS CLOSED NOW! Thanks for Participating! >>

Head First Data Analysis

Book Giveaway: Head First Data Analysis — A learner’s guide to big numbers, statistics and good decisions!

I love the Head First series. If you haven’t read one of these books, you should; they’re great! So when I learned that they had a Data Analysis one, I had to read it. So I bought one and skimmed through it.

Now, instead of letting it sit on my shelf, I think it might better serve its life purpose if more people read it, so I have decided to do this little experiment.

Rules:

  1. You need to have a US-based address so that I can ship it to you (no cost to you!)
  2. You need to comment on this blog post on or before 07/22/2016 — just put your name & email. I’ll contact you if you win*

*Random selection!

Go!