Do data analysis and machine learning go hand-in-hand or are they mutually exclusive activities?

Originally published on Quora.

“Machine Learning” is a subset of “Data Analysis”: it is just one of the activities that you could apply to solve a data analysis problem. You just need to find a problem that can use machine learning wizardry! What kind of activities, you ask? To answer that, we need to step back and categorize the problems that data analysis can solve. There are broadly three kinds of problems:

  1. “What” problems. A few examples: What are my sales numbers for last quarter? Can we compare them to the same quarter last year? Now, can we break them down by region and product category? You see, all these questions could be answered by querying your data stores or by your Business Intelligence platform. You do NOT need machine learning for this. Moving on…
  2. “Why” problems. A few examples: Why did the customer cancel their contract? Why is the profit in region A declining quarter over quarter? You see, this is a little more challenging than the “What” questions: you will need to structure the problem and pull data from multiple sources. Why did the customer cancel? You may want to look at internal (e.g. customer complaints) and external (e.g. bankruptcy) data. Usually you won’t need to apply machine learning here. You might benefit in some cases where you “cluster” all churned customers and see if you can find some patterns, but again, machine learning is not your primary tool here. Moving on…
  3. “What’s next” problems. This is what you have been waiting for: this is where machine learning could be applied. Example: Which customers will cancel their accounts this fiscal year? This is where you train a machine learning algorithm to predict which customers will churn this year. Note that the work you did for the “Why” problems, where you identified some characteristics of churned customers, will still be applicable here.

And that brings me to this: most organizations don’t usually jump straight from the “What” stage to the “What’s next” stage. Every organization is at a different stage depending on its maturity, and you can’t apply machine learning to every data analysis problem. Also, with more and more companies using “data” to gain a competitive edge, if you are not using machine learning then chances are high that your competitor is, and they may out-compete you. That’s why it’s important to keep investing and reach the highest level. More and more companies and executives are realizing this, and it’s a great thing for the data community!
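
To make the “What” category concrete, here is a minimal sketch of answering “What were my sales last quarter, by region, compared to the same quarter last year?” with nothing more than a SQL query. The `sales` table, its columns, and the figures are all made up for illustration; the point is only that a plain aggregate query is enough and no machine learning is involved.

```python
import sqlite3

# Hypothetical sales table: the schema and the numbers are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, category TEXT, quarter TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Hardware", "2015-Q4", 120.0),
        ("East", "Software", "2015-Q4", 80.0),
        ("West", "Hardware", "2015-Q4", 100.0),
        ("East", "Hardware", "2014-Q4", 90.0),
        ("West", "Hardware", "2014-Q4", 110.0),
    ],
)


def sales_by_region(quarter):
    """Total sales per region for one quarter: a plain 'What' question."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "WHERE quarter = ? GROUP BY region ORDER BY region",
        (quarter,),
    ).fetchall()
    return dict(rows)


last_quarter = sales_by_region("2015-Q4")
same_quarter_last_year = sales_by_region("2014-Q4")
print(last_quarter)
print(same_quarter_last_year)
```

Breaking the numbers down further (say, by product category) would only mean adding another column to the `GROUP BY`.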

To conclude: depending on the analytics maturity of your organization and the business problem at hand, you might have to use machine learning to solve a data analysis problem. And it never hurts to pick up machine learning basics along with the other data analysis skills that you might have.
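
For contrast, here is a rough sketch of the “What’s next” churn example: a tiny logistic regression, trained with simple per-example gradient updates, that estimates how likely a customer is to cancel. Everything in it is an assumption for illustration: the two features (complaints filed, years as a customer), the training data, and the hyperparameters. In practice you would reach for a machine learning library and far more data.

```python
import math

# Made-up history: (complaints filed, years as customer) -> churned (1) or stayed (0).
history = [
    ((5, 1), 1), ((4, 2), 1), ((6, 1), 1), ((3, 1), 1),
    ((0, 5), 0), ((1, 6), 0), ((0, 3), 0), ((1, 4), 0),
]


def predict(weights, bias, features):
    """Estimated churn probability from a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, lr=0.05):
    """Fit the model by updating the weights after every example."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


weights, bias = train(history)
at_risk = predict(weights, bias, (5, 1))  # many complaints, new customer
loyal = predict(weights, bias, (0, 6))    # no complaints, long tenure
print(f"at risk: {at_risk:.2f}, loyal: {loyal:.2f}")
```

Notice that the features come straight from the “Why” work: the characteristics you found in churned customers become the inputs to the model.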

Hope that helps.

Why are there so many analytics startups?

Originally published on Quora: Why are there so many analytics startups?

Question:

Why are there so many analytics startups in the past 2 years?  With Google Analytics getting better every year (for FREE!), what is the value proposition?  I understand the need to augment with some new perspectives such as Clicktale, but I’m not sure I understand the value prop of KissMetrics, SpringMetrics, etc?

Answer:

There are two main reasons:

  1. Feature gap between Google Analytics (free) and Google Analytics Premium (a.k.a. 360 now!): there are a lot of companies (esp. those with millions of customers) that want to use premium features but still can’t justify the ROI of GA Premium. So there are analytics startups out there that try to cater to these “gaps”. Even though GA is improving, there will always be some feature gap(s).
  2. Access to venture capital: these startups found a market and they went for it. They also had access to venture capital (easier two years back than it is today!), and it helped them that “big data” and “data science” were (and still are!) highly discussed tech topics.

I believe we will see some consolidation in the next few years.

How do I prepare myself to be a data analyst?

Originally published on Quora: How do I prepare myself to be a Data Analyst?

Based on how you are framing your question, it seems that you currently don’t have a “Data Analysis” background but want to build a career in this field. Here are three things you could do:

  1. Learn Tech Skills: You will need technical knowledge to be successful at analyzing data. SQL and Excel are a good starting point, and you can do a lot with these tools. Then, depending on the bandwidth you have, you could explore R. How do you learn this? Here’s a learning pathway: Learn #Data Analysis online – free curriculum. Also search for free courses on Coursera or other platforms.
  2. Learn Soft/Business Skills: These are as important as tech skills (if not more!) when it comes to Data Analysis. Finding insights in your data is half the battle; you will also need to put the insights in a context/story, influence business decisions, and sometimes influence business change. We know change is always hard! So your soft/business skills will be very important. You will also benefit a lot from learning how to break down problems and how to communicate your solution in “business” language instead of tech-speak.
  3. Apply them (and keep improving): Now that you have picked up some tech and soft/biz skills, apply them! Get an internship or help out a non-profit in your free time (DataKind, Statistics Without Borders, and VolunteerMatch are good resources for finding a non-profit) and start applying your skills! This will also give you some “real” world experience, and applying what you have learned while “learning on the job” is arguably the BEST way to pick something up!

Hope that helps!

“4W” framework for assessing your Analytics Maturity:

Most organizations could benefit from Analytics but before you set the Analytics road-map for your organization, it’s important to figure out your current stage and then build the road-map to achieve your vision. So how do we figure out the analytics maturity of an organization? Let me share a framework to think about this:

I have blogged about the “Business Analytics Continuum” before. It’s a great framework for thinking about analytics maturity in an organization, BUT the issue is that it’s hard for business people to remember the stages: Descriptive -> Diagnostic -> Predictive -> Prescriptive. So there’s a simpler (but equally effective) framework that I have been using over the past few months: What -> Why -> What’s next, a.k.a. the “3W” framework. Recently, at a Microsoft Analytics conference, I saw this framework with an extra “W”, which makes total sense and which I liked a lot! So I thought I would share it with you all. Here you go, the 4W framework:

Stage 1: What Happened?

Stage 2: Why did it happen?

Stage 3: What will happen?

Stage 4: What should I do?

[Image: Analytics framework: What, Why, What’s next, How]

Credit: Microsoft Data Insights Summit

I hope the framework helps as you think about your organization’s analytics vision/road-map and the stages that you need to go through to help your org succeed with data!

Recommendations:
Building data driven companies — 3 P’s framework.

As a prospective Data Analyst intern, how do I answer the most challenging data analysis I have done so far?

My answer on Quora for: As a prospective Data Analyst intern, how do I answer the most challenging data analysis I have done so far?

https://www.quora.com/As-a-prospective-Data-Analyst-intern-how-do-I-answer-the-most-challenging-data-analysis-I-have-done-so-far/answer/Paras-Doshi?srid=uWIN

When I hire for Data Analyst (Jr. or Intern) positions, I look for three things:

1) Analytical mindset:

I would do this by sharing a hypothetical case study and seeing how you go about solving it. I would look for things like: a) Approach: How do you break down the problem? b) Effectiveness: How effectively can you go about solving the case? I am NOT looking for the “Right” answer but just want to see how you go about solving the case.

(Search for “Management consulting case studies” — I usually pick a simple case)

2) Communication skills:

This is pretty standard across many roles but it’s important for data analysts to be able to communicate their recommendations/findings to stakeholders.

3) Basic hard/tech skills + Willingness to learn new tech skills:

I would ask you basic tech questions around SQL, Excel OR other “tech skills” that you might have mentioned in your resume. I am not looking for expert-level knowledge but just want to make sure you know things that you have listed on your resume or things that you studied. Also, I would ask you questions that would help me figure out whether you are open to learning new tech skills.

So now that I have shared the framework with you, let me try and answer your question: How do I answer the most challenging data analysis project that I have done?

a. If you had a good approach for your project then it would mean that you know how to break down data analysis problems and solve them. So solving a basic case study shouldn’t be difficult for you and I could check box #1!

b. If you can communicate the “complexity” of the project effectively then I think I would check box #2: communication skills!

c. Since you solved a challenging project, I assume that you picked up some tech skills (Bonus points if you picked up new tech skills while solving this problem). Just let me know what tech you used to solve the problem so that I can ask questions around that — if you are able to answer them then I would check box #3!

It’s NOT about the challenging project but your learning/takeaways from that project that will help you the most!

Now, assuming that the interview team thinks you are a good “culture fit” and you came out on top compared to other candidates, you will get an offer to join the team as a Data Analyst!

Hope that helps and may the force be with you!

Are Dashboards dead?… #Analytics

Let’s think through:

Are Dashboards Dead?

With a lot of vendors pushing for democratizing analytics (a.k.a. self-service), it may seem that dashboards will soon be dead!

You need two things to make an org data driven: 1) Push 2) Pull.

“Pull”

…is where most of the analytics vendors are focused right now. It’s the set of technologies that the business users want. The big idea here is to enable business users to pull whatever data they want, whenever they want, without having to wait for Analytics/IT. Note that the business users are doing the heavy lifting in the analysis (of course you need a data platform to enable this, but it’s still the business users using the platform and doing their analysis).

“Push”

…is where there are dashboards that are built by central IT/Analytics and are ready to be consumed by the business users. This should be a governed environment where a lot of effort has been invested by Analytics/IT to make sure that the metrics are standardized & accurate. This is key to making this work: if the metrics on the dashboard are standardized and accurate, then business users will trust these dashboards more than the self-service dashboards. This would also be their one place to go to view all key performance indicators for their org/department, and if they see something “interesting” (or better yet, get an alert!) they can dive into the self-service environment and do their thing. You see, the “Push” strategy is really great at getting the data to all business users and then “pushing” them to use the self-service analytics platform.

[BTW: Putting a bunch of reports in a grid layout is not what I am talking about here. I am limiting my definition of a dashboard to one that has KPIs and directs users to where they should be focusing.]

(Again, there are two things to do here to make sure the Push strategy succeeds: 1) Having standardized & accurate data = earn trust! 2) Having KPIs that align with the strategic plan of the org/dept.)

[Image: Dashboards: Push and Pull analytics strategy]

So now, having understood what these strategies are, let’s take a minute to put them to use to answer the question:

Are Dashboards Dead Yet?

So let’s imagine a scenario where an org does not have a Push strategy. They have implemented a self-service platform and are focused on evolving that. Now there are two problems that they will run into:

  1. For “casual” users — How do they get them the training they need? OR support that they need?
  2. For “power” users — Once they start creating their own calculated metrics then how do they make sure that they are standardized across what other power users are doing? (also, how do they validate if what they are analyzing is accurate?)

You see both of those problems can be partially (if not completely) solved by having Dashboards:

  1. Dashboards are a great way for casual users to look at their KPIs and figure out where to focus
  2. Also, Dashboards are a great way to provide standardized & accurate metrics so everyone can trust the numbers that they are looking at
  3. Note that it shouldn’t require you to start from zero! You should be using the data modeling layer built for your self-service platform for the dashboards as well
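
One way to picture the shared modeling layer from the last point is sketched below. This is only an illustration, not how any particular BI product works: the `METRICS` registry, the `dashboard` and `self_service` helpers, and the order data are all hypothetical. The idea is that each metric is defined exactly once, and both the governed dashboard (Push) and the ad hoc analysis (Pull) reuse that definition.

```python
# Hypothetical shared modeling layer: every metric is defined exactly once.
METRICS = {
    "revenue": lambda rows: sum(r["amount"] for r in rows),
    "orders": lambda rows: len(rows),
    "avg_order_value": lambda rows: (
        sum(r["amount"] for r in rows) / len(rows) if rows else 0.0
    ),
}


def compute(metric, rows):
    """Look up the single, standardized definition of a metric."""
    return METRICS[metric](rows)


def dashboard(rows):
    """Push: a fixed set of KPIs published by the central team."""
    return {name: compute(name, rows) for name in METRICS}


def self_service(rows, metric, **filters):
    """Pull: a business user slices the data but reuses the same definition."""
    subset = [r for r in rows if all(r[k] == v for k, v in filters.items())]
    return compute(metric, subset)


orders = [
    {"region": "East", "amount": 100.0},
    {"region": "East", "amount": 50.0},
    {"region": "West", "amount": 150.0},
]
kpis = dashboard(orders)
east_revenue = self_service(orders, "revenue", region="East")
print(kpis)
print(east_revenue)
```

Because a power user's “revenue for East” goes through the same `compute` call as the dashboard, the two numbers cannot drift apart.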

And that’s why I think Dashboards are not dead yet.

PS: You might see some vendors that are pushing for a different approach where the platform would auto-magically go through the data and get you the “insights” — I think it’s a great approach. Usually they would target dashboards but I would argue that they compete more with “Pull” strategies rather than “Push” because now the business user won’t have to explore so many different variables but the platform could do that heavy-lifting and get them quick insights.

Machine Learning Algorithm Cheat Sheet:

If you’re getting started with Data Science & Machine Learning then I think this would be a great resource for you. This “cheat sheet” helps you select the “algorithm” to test depending on the problem you are trying to solve and the data-set that you have.

Download link: http://aka.ms/MLCheatSheet (Courtesy: Azure Machine Learning)

Also, even though the cheat sheet was created to help you with the “Azure Machine Learning” product, it’s still valid if you use other machine learning tools.

[Image: Azure Machine Learning Algorithm Cheat Sheet]