As a student preparing for data analyst & data science roles, should I generalize or specialize?


This question was posted on the Springboard forum.

Here’s my answer:

It depends on your target industry and where it is in its life cycle.

An industry life cycle has four stages: Startup, Growth, Maturity, and Decline.

Industry lifecycle

Generalization is great in the earlier stages. If you are targeting jobs at startups, generalize. You should know enough about a lot of things.

T-shaped professionals are great for the Growth stage. They specialize in something but still know enough about a lot of things. E.g. a Sr. Growth/Marketing Analyst knows enough about analytics & data science to be dangerous but specializes in marketing.

Specialization is great for mature industries. Specialists know a lot about a few things. E.g. statisticians in the insurance industry have made careers out of building risk models.

Any advice for moving into data science from business intelligence?


This was asked on Reddit: Any advice for moving into data science from business intelligence?

Here’s my answer:

I come from a “Business Intelligence” background and currently work as a Sr. Data Scientist. I found that you need two things to transition into data science:

Data Culture: You need a company whose data culture is such that managers and executives ask big questions that need a data science approach to solve. If your end consumers are still asking a bunch of “what” questions, then your company might NOT be ready for data science. But if your CEO comes to you and says, “Hey, I got the customer list with the info I asked for, but can you help me understand which of these customers might churn next quarter?”, then you have a data science problem at hand. So, try to find companies that have this culture.

Skills: You also need to upgrade your skills to be able to solve data science problems. BI focuses heavily on technology and automation, so you may need to unlearn a few things. For example, automation is not always important, since you might work on problems where a model is needed to predict just a couple of times; trying to automate wouldn’t be optimal in that case. Also, BI relies heavily on tools, but in data science you’ll need deeper domain knowledge and a problem-solving approach along with technical skills.

Also, I personally moved from BI (as a consultant) -> Analytics (as Analytics Manager) -> Data Science (as Sr. Data Scientist), and this has been super helpful for me. I recommend transitioning into analytics first and then eventually breaking into data science.

Hope that helps!

VIEW THREAD ON REDDIT

In how many dimensions (Vs) is Big Data commonly defined?


Asked on Quora:

When reading about Big Data, the discussion usually starts with the definition by Gartner analyst Doug Laney (3 Vs). IBM often uses 4 dimensions by adding veracity. Some people use 6 or up to 12 dimensions. I am wondering what the most frequently used definition is.

Answer:

Here’s my “working” definition of Big Data: if your existing 1) tools and 2) processes don’t support your data analysis needs, then you have a Big Data problem.

You can add as many V’s as you want, but it all ties back to the notion that you need bigger and better tools and processes to support your data analysis needs as you grow.

Example:

#1: Social media data is BIG! It’s text (variety), it’s much bigger in size (volume), and it’s all coming in very fast (velocity). And the business wants to analyze customer sentiment on social media. OK, we have a 3 V’s problem and need a solution to support this. Maybe Hadoop is the answer. Maybe not. But you do have a “Big Data” problem.

#2: Your customer database is broken. The records don’t have the right addresses. Google and Alphabet are showing up as two separate companies when they should be just one. Their employee count is outdated. All of these problems are confusing your business users, and they don’t TRUST the data anymore. You have a veracity problem, and so you have a BIG Data problem.
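To make the veracity example a bit more concrete, here is a minimal sketch of what one small cleanup step could look like. Everything in it is an assumption for illustration: the alias table, the field names, and the sample rows (including the employee counts) are made up, and a real customer database would need a far more careful matching process.

```python
# Hypothetical alias table: maps known name variants to one canonical company name.
ALIASES = {
    "google": "Alphabet",
    "google inc": "Alphabet",
    "alphabet": "Alphabet",
    "alphabet inc": "Alphabet",
}


def canonical_name(raw_name):
    """Return the canonical company name, or the trimmed input if no alias is known."""
    cleaned = raw_name.strip().rstrip(".").lower()
    return ALIASES.get(cleaned, raw_name.strip())


def merge_customers(records):
    """Group records by canonical name, keeping the most recently updated record."""
    merged = {}
    for record in records:
        key = canonical_name(record["name"])
        current = merged.get(key)
        # ISO-formatted dates compare correctly as plain strings.
        if current is None or record["updated"] > current["updated"]:
            merged[key] = {**record, "name": key}
    return merged


# Made-up sample rows (not real data).
customers = [
    {"name": "Google", "employees": 100, "updated": "2014-01-01"},
    {"name": "Alphabet Inc.", "employees": 200, "updated": "2015-10-01"},
    {"name": "Acme", "employees": 10, "updated": "2015-01-01"},
]

result = merge_customers(customers)
for name, record in result.items():
    print(name, record["employees"], record["updated"])
```

Notice that the code is the easy part. Someone still has to decide which names belong together and which record to trust, which is why people and process come before tools.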

Everyone has a BIG DATA problem. It just depends on what their “V’s” are, AND in most cases “tools” alone will not solve the issue. You need PEOPLE and PROCESS to solve that. Here’s my ranking of the ingredients that are key to solving BIG Data problems: 1) PEOPLE 2) PROCESS 3) PLATFORM (tools).

VIEW QUESTION ON QUORA

How do I learn #SQL for #data analysis?


Step 1:

This is a good starting point: SQL School Table of Contents

OR, this: Learn SQL

Both of these resources were put together by analytics vendors and are targeted at beginners.

Step 2:

Review this Quora Thread: How do I learn SQL?

Participate in competitions like this: Solve SQL Code Challenges
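If you want to try a query before committing to any of these resources, here is a small sketch that uses Python’s built-in sqlite3 module, so there is nothing to install. The `sales` table and its rows are made up for illustration; the query itself is the kind of GROUP BY aggregation that beginner SQL material typically covers early on.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")

# Made-up sample rows (illustrative only).
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "Q1", 100.0),
        ("East", "Q2", 150.0),
        ("West", "Q1", 80.0),
        ("West", "Q2", 60.0),
    ],
)

# Total sales per region, largest first.
rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY total DESC
    """
).fetchall()
conn.close()

for region, total in rows:
    print(region, total)
```

Once this feels comfortable, try changing the query: filter with WHERE, or group by quarter instead of region.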

Step 3:

If you’d like to go more in-depth, then check out a few books:

  1. Head First SQL
  2. Learn SQL the Hard Way
  3. Certification books/material from a database vendor

Hope that helps!

VIEW QUESTION ON QUORA

Is the R data science course from DataCamp worth the money?


DataCamp R Data Science

Question (on Quora): Is the R data science course from DataCamp worth the money?

Answer:

It depends on your learning style.

If you like watching videos, then Coursera/Udacity might be better.

If you like reading, then a book/e-book might be better.

If you like hands-on learning, then something like DataCamp is a great choice. I think they have monthly plans, so it’s much cheaper to try them out. When I subscribed, it was about $30/month. I found it was worth it. Also, if you want to see whether “hands-on” is how you learn best, try this: swirl: Learn R, in R. It’s free! DataCamp has a free course on R too, so you could try that as well.

Also, if you want to have free unlimited access for 2 days, then try this link: https://www.datacamp.com/invite/G8yVkTrwR3Khn

VIEW QUESTION ON QUORA

Data analytics vs. Data science vs. Business intelligence: what are the key differences/distinctions?


They are used interchangeably since all of them involve working with data to find actionable insights. But I like to differentiate them based on the type of the question you’re asking:

  • What:

What are my sales numbers for this quarter?

What is the profit for this year to date?

What are my sales numbers over the past 6 months?

What did the sales look like in the same quarter last year?

All of these questions report on facts, and the tools that help you build data models and reports can be classified as “Business Intelligence” tools.

  • Why:

Why is my sales number higher for this quarter compared to last quarter?

Why are we seeing an increase in sales over the past 6 months?

Why are we seeing a decrease in profit over the past 6 months?

Why is the profit this quarter lower compared to the same quarter last year?

All of these questions try to figure out why something happened. A data analyst typically takes a stab at this. They might use an existing Business Intelligence platform to pull data and/or merge in other data sets. They then apply data analysis techniques to the data to answer the “why” question and help the business user get to an actionable insight.

  • What’s next:

What will be my sales forecast for next year?

What will be our profit next year for Scenario A, B & C?

Which customers will cancel/churn next quarter?

Which new customers will convert to a high-value customer?

All of these questions try to “predict” what will happen next (based on historical data/patterns). Sometimes you don’t know the questions in the first place, so there’s a lot of proactive thinking going on, and usually a “data scientist” is doing that. Sometimes you start with a high-level business problem and form “hypotheses” to drive your analysis. All of these can be classified under “data science”.

Now, as you can see, as we progressed from What -> Why -> What’s next, the level of sophistication needed to do the analysis also increased. So you need a combination of people, process and technology platform in an organization to go from Business Intelligence maturity all the way to data science capabilities.
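Here is a minimal sketch of that What -> Why -> What’s next progression on a toy data set. The sales numbers are made up, and the straight-line trend is just one simple stand-in for a forecasting method, chosen so the example stays short; real forecasting work would involve much more than this.

```python
# Made-up quarterly sales by region (illustrative numbers only).
sales = {
    "Q1": {"East": 100, "West": 80},
    "Q2": {"East": 150, "West": 60},
    "Q3": {"East": 170, "West": 70},
}
quarters = list(sales)

# What (business intelligence): report a fact, here total sales per quarter.
totals = [sum(sales[q].values()) for q in quarters]

# Why (data analysis): explain the Q2 -> Q3 change by breaking it down by region.
change_by_region = {
    region: sales["Q3"][region] - sales["Q2"][region] for region in sales["Q3"]
}

# What's next (data science): predict from the historical pattern.
# Fit a least-squares straight line through the points (0, totals[0]), (1, totals[1]), ...
n = len(totals)
x_mean = (n - 1) / 2
y_mean = sum(totals) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(totals)) / sum(
    (x - x_mean) ** 2 for x in range(n)
)
intercept = y_mean - slope * x_mean
forecast_q4 = intercept + slope * n  # extrapolate one quarter ahead

print("What:", dict(zip(quarters, totals)))
print("Why:", change_by_region)
print("What's next:", forecast_q4)
```

Each step reuses the same data but asks more of it, which is the point of the progression: reporting needs only a sum, explaining needs a breakdown, and predicting needs a model plus assumptions about the future.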

Here’s a related blog post that I wrote on this a while back: Business Analytics Continuum – Insight Extractor – Blog

Data Science

And you can check out other stuff I write about here: Insight Extractor – Blog – Paras Doshi’s Blog on Analytics, Data Science & Business Intelligence.

VIEW QUESTION ON QUORA

Where can I find a data analyst mentor, be it in-person or online?


Data Analyst Mentor

Find a mentor, where do I? Hmmmmmm….

There are a few options: 1) paid online courses with mentoring, and 2) free options.

#1. Paid online courses with mentoring:

I am a mentor for an ed-tech startup, Springboard (Learn Data Science & UX Design online). It’s similar to what you are asking for. If you see value in that, you should check it out.

#2. Free options:

a. Quora: You could ask questions here and A2A. Build a network, and someone may offer to mentor you offline.

b. MOOCs: You could join courses on MOOC platforms like Coursera and Udacity. They have good forum support, so you could use it to get your questions answered.

c. Cold email: There are a lot of analytics/data science professionals active in the community (LinkedIn groups, blogs, etc.), and if you cold email them, you might find one!

d. Local meetups: Go to local meetups, meet people and find your mentor.

Stepping back, having a mentor helps and accelerates your progress, but not having one shouldn’t stop you from achieving what you want.

VIEW QUESTION ON QUORA