Business Analytics Continuum: Descriptive, Diagnostic, Predictive, Prescriptive

Think of a “continuum” as something you start and never stop improving upon. In my mind, the Business Analytics Continuum is the continuous investment of resources to take your business analytics capabilities to the next level. So what are these levels?

Here is a visual representation of the concept:

[Figure: the Business Analytics Continuum, progressing through four levels of analytics maturity: Descriptive, Diagnostic, Predictive, Prescriptive]

Four Tenets for effective Metrics Design

The goal of this blog post is to provide four tenets for effective metrics design.

What is a tenet?

A tenet is a principle honored by a group of people.

Why is effective metrics design important?

Metrics help with business decision-making. Picking the right metrics increases the odds that decisions are made through data rather than gut/intuition, which can be the difference between success & failure.

Four Tenets for effective metrics design:

  1. We will prioritize quality over quantity of metrics: Prioritizing quality over quantity is important because if teams are tracking many metrics, it’s hard for decision-makers to swarm on the areas that matter most. Having many metrics also decreases the odds of each metric meeting the bar for quality. If instead you have a few metrics that are well thought out and meet the other tenets listed in this post, you increase the odds of building a solid data-driven culture. I am not being prescriptive about the right number of metrics; you should experiment and figure that out. However, I can give you a range: fewer than 3 key metrics is probably too few, and more than 15 is a sign that you need to trim the list.
  2. We will design metrics that are behavior changing (aka actionable): A litmus test for this: ask your business decision-makers to articulate what they will do if the metric 1) goes up N% (let’s say 5%), 2) stays flat, or 3) goes down N%. They should have a clear answer for at least two of the three scenarios; if they can’t map a behavior change or action to the metric, then it is not as important as you think. That is a sign you can cut it from your “must-have” metrics list. This doesn’t mean you stop tracking it, but it gives you a framework to prioritize other metrics over it, or to iterate on the metric’s design until it is behavior changing.
  3. We will design metrics that are easy to understand: If your metrics are hard to understand, it’s harder to act on them, so this is a prerequisite for making your metrics behavior changing. Beyond increasing the odds that a metric is actionable, you are also making it appeal to a wider audience across your teams instead of just the key business decision-makers. Having a wide group of people understand your metrics is key to a solid data-driven culture.
  4. We will design metrics that are easy to compare: Metrics that are easy to compare across time periods, customer segments & other business constructs are easier to understand and act on. For example: if I tell you that we had 1,000 paying customers last week and 1,000 this week, that doesn’t give you enough signal about whether things are good or bad. But if I share that last week our conversion rate was 2.3% and this week it is 2.1%, then you know something needs to be fixed in your conversion funnel, given the 20 bps drop (see the sketch after this list). Ratios/rates are easy to compare, so one tactical tip: to make your metrics easy to compare, see if a ratio/rate makes sense in your case. And if your metrics are easy to compare, that increases the odds of them being behavior changing, just as the example showed.
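
Here is a minimal sketch of tenet 4 in Python. All numbers are hypothetical and simply mirror the example above: the raw customer count is flat, but converting it into a rate surfaces the signal.

```python
# Tenet 4 sketch (hypothetical numbers): absolute counts are hard to
# compare, but a rate gives immediate signal across time periods.
last_week = {"visitors": 43_500, "paying_customers": 1_000}
this_week = {"visitors": 47_600, "paying_customers": 1_000}

def conversion_rate(week):
    """Share of visitors who became paying customers."""
    return week["paying_customers"] / week["visitors"]

last_rate, this_rate = conversion_rate(last_week), conversion_rate(this_week)
drop_bps = (last_rate - this_rate) * 10_000  # 1 bps = 0.01 percentage points
print(f"{last_rate:.1%} -> {this_rate:.1%} ({drop_bps:.0f} bps drop)")
# Same paying-customer count both weeks, yet the rate shows the funnel got worse.
```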

Conclusion:

In this blog post, you learned about effective metric design.

What are your tips for picking good metrics? Would love to hear your thoughts!

[Career Advice] What are the downsides of working as a data scientist in Silicon Valley?

There are unique challenges to tech roles in Silicon Valley, like housing costs & commute times, but there is enough opportunity that it can make up for them if you prepare well. These challenges aren’t unique to data scientists, though, so parking the common ones aside, here’s what I think is the downside of working as a data scientist in Silicon Valley.

You see, every company follows a curve to reach an analytics maturity level after which a data scientist can start adding enormous value. I call it the 3W curve.

What -> Why -> What’s Next.

The What stage: a company in the early analytics maturity stage is answering “what” questions. E.g., what were my sales for 2018? Here a Business Intelligence engineer or Data Engineer can help.

The Why stage: a company in the mid analytics maturity stage is asking “why” questions. E.g., why did our sales go up in Q3 2018 compared to Q2 2018? Here a business analyst or product analyst can help.

After these two stages, a company reaches the third stage, where they ask “what’s next” questions. E.g., what is going to be my top product growth area for next quarter? This is something a data scientist can help with.
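
To make the contrast concrete, here is a minimal sketch on a toy sales table (all numbers hypothetical). A “what” question is a plain aggregation; a “what’s next” question needs a model, even one as naive as the trend extrapolation below (in practice you would use a proper forecasting method).

```python
# 3W sketch on hypothetical data: descriptive vs. predictive questions.
import pandas as pd

sales = pd.DataFrame({
    "quarter": ["2018Q1", "2018Q2", "2018Q3", "2018Q4"],
    "revenue": [1.2, 1.3, 1.6, 1.7],  # in $M
})

# "What": what were my sales for 2018? A simple aggregation answers it.
print("2018 revenue ($M):", sales["revenue"].sum())

# "What's next": a naive forecast extrapolating average quarterly growth.
growth = sales["revenue"].diff().mean()
print("2019Q1 forecast ($M):", round(sales["revenue"].iloc[-1] + growth, 2))
```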

Now, having said that, Silicon Valley has a lot of companies that are in the early to mid stages and are better suited for data engineers, business intelligence engineers, and business/product analysts, but they end up recruiting for “Data Scientists” (since it’s the sexiest term for all things data these days!). This creates a mismatch between expectation and reality: the data scientist is expected to work on “advanced” analytics topics at a company where the culture and tooling are “basic”. That is a recipe for failure.

This delta between expectation and reality is the biggest downside of working as a data scientist in Silicon Valley. To bridge the gap, hiring managers need to think through what their needs are and hire according to those needs (instead of the hype), and candidates should ask probing questions during the interview process to judge the analytics maturity of the company and make sure it’s a great fit for them.

Also, I am not saying this delta doesn’t exist in other cities; it’s just that during my time in Silicon Valley, I noticed it more than I did elsewhere. Silicon Valley is a leader in tech, so if this gets fixed here, I expect other cities to follow.

Originally answered on Quora: https://www.quora.com/What-are-the-downsides-of-working-as-a-data-scientist-in-Silicon-Valley/answer/Paras-Doshi#

Great example of storytelling through data:

“The End of the Beginning” by Benedict Evans.

Two great posts on DAU/MAU and Measuring Power Users

Two great posts from Andrew Chen. Links below:

These posts were perfectly timed for me, as we started thinking about annual planning for the Alexa Voice Shopping org (Amazon) this week. As part of my research into which metrics to use to measure the things our business cares most about, and into setting the right benchmarks/goals for the org, the posts below were super helpful. So if you are in tech and you care about 1) measuring frequency of usage or 2) measuring your most engaged cohort, you should take some time to read them.

Power user curve 

DAU/MAU is an important metric to measure engagement, but here’s where it fails
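
For context, DAU/MAU is the share of your monthly active users who show up on a given day. Here is a minimal sketch of how you might compute it, assuming a hypothetical events table with one row per (user_id, date) of activity:

```python
# DAU/MAU sketch on hypothetical activity data.
import pandas as pd

events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 1, 2],
    "date": pd.to_datetime(["2019-03-01", "2019-03-02", "2019-03-02",
                            "2019-03-15", "2019-03-31", "2019-03-31"]),
})

as_of = pd.Timestamp("2019-03-31")
dau = events.loc[events["date"] == as_of, "user_id"].nunique()
mau = events.loc[events["date"] > as_of - pd.Timedelta(days=30), "user_id"].nunique()
print(f"DAU={dau}, MAU={mau}, DAU/MAU={dau / mau:.2f}")  # 1.0 = daily usage
```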

Cheers!

What is the difference between courses offered by Springboard vs datacamp vs dataquest? Which is better?

I am a DataCamp subscriber, a mentor with Springboard, and have completed the free content on DataQuest, so I am familiar with all three products in some way.

You need two things to have a successful career:

  1. Strong Foundation
  2. Continuous learning

Let’s talk about Continuous Learning first:

In a field as dynamic as data science, you should always be learning! It could be through projects at work, side-projects, or online resources.

I would categorize both DataCamp and DataQuest under this; they are great platforms for continuous learning. I am a subscriber on DataCamp and it’s a great platform to just dive in, do some hands-on exercises, and learn something new. I love it! I have heard equally positive things about DataQuest, so if you are already working in the industry as a data scientist and just want to go deeper technically, go for these platforms!

Now Let’s talk about Strong Foundation:

You need a strong foundation to get hired as a data scientist. You would typically get that through a relevant college degree. But:

  1. A lot of people don’t have relevant college degrees OR
  2. They graduated a few years back and are looking to do a career transition now OR
  3. They are not willing to go back to do multi-year college programs focused on data science

If that’s the case, then there’s a newer approach in the market: “boot camps”. You still need some foundational skills (for example math, programming, and statistics) to be eligible for data science boot camps, and if you have those basics, you can go through one. There are a bunch of them out there; just search for “data science boot camps”. Springboard is one of them, and I have heard nothing but positive things about them, just as I have about DataCamp & DataQuest. I have personally mentored 6 students so far, all of them looking for a career transition, and they had nothing but positive things to say! That’s just my empirical data, though; you should do a trial with them and/or check out the job guarantee on their career track if that is important to you. Either way, it’s a boot camp offering, so it includes regular mentor calls/check-ins, projects, career coaches, and non-technical material like resume tips, giving you a structured approach to everything you might need to get hired as a data scientist. You can expect intense guided learning over a short period of time. The boot camp approach is different from the self-paced, self-directed approach of DataCamp & DataQuest.

PS: I mentor for Springboard and I have a $750 OFF discount code to share if you decide to enroll. Please contact me to get the code: Let’s Connect! – Insight Extractor – Blog (If you prefer not to use the referral link, just search for “springboard” and sign up there.)

The World is not binary!

I am not saying that you can’t break into data science with just DataCamp and DataQuest; you would need to complement them with other resources and put in more effort to cover everything you may need. With enough motivation, it can certainly be done! How fast you want to break into data science, plus how much time you can invest in figuring out the right resources, are the two biggest factors in deciding whether you need a boot camp.

Conclusion:

If you are already working as a data scientist, DataCamp and DataQuest are great for continuous learning! If you are new to this and don’t have a relevant educational background, then boot camps like Springboard are a great choice.

Hope that helps!

VIEW QUESTION ON QUORA

What are the must-know software skills for a career in data analytics after an MBA?

SQL, Excel & Tableau-like tools are good enough to start. Then eventually add something like R. And then there are tools specific to your industry, e.g. Google Analytics for the tech industry.

Other than that, you should know what to do with these tools. You need to know the following concepts, and to continuously build on them as industry use-cases and needs evolve (a minimal sketch of one of these, A/B testing, follows the list):

  1. Spreadsheet modeling
  2. Forecasting
  3. Customer Segmentation
  4. Root cause Analysis
  5. Data Visualization and Dashboarding
  6. Customer Lifetime value
  7. A/B testing
  8. Web Analytics
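
To give a flavor of one item on this list, here is a minimal A/B testing sketch (hypothetical numbers) using a standard two-proportion z-test:

```python
# A/B test sketch: did the variant change the conversion rate?
from statsmodels.stats.proportion import proportions_ztest

conversions = [230, 210]     # conversions in control vs. variant
visitors = [10_000, 10_000]  # visitors exposed to each

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
# A small p-value (e.g. < 0.05) suggests the difference is not just noise.
```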

VIEW QUESTION ON QUORA

Any advice for moving into data science from business intelligence?

This was asked on Reddit: Any advice for moving into data science from business intelligence?

Here’s my answer:

I come from a “Business Intelligence” background and currently work as a Sr. Data Scientist. I found that you need two things to transition into data science:

Data Culture: You need a company where the data culture is such that managers/executives ask big questions that require a data science approach to solve. If your end-consumers are still asking a bunch of “what” questions, then your company might NOT be ready for data science. But if your CEO comes to you and says “hey, I got the customer list with the info I asked for, but can you help me understand which of these customers might churn next quarter?”, then you have a data science problem at hand. So try to find companies that have this culture.
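
To make the CEO’s churn question concrete, here is a minimal sketch of the kind of model that answers it. Everything here (features, labels, data) is hypothetical; the point is the shift from reporting what happened to predicting what’s next.

```python
# Churn-prediction sketch on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical features per customer: orders last quarter, days since last login.
X = rng.normal(size=(500, 2))
# Hypothetical labels: fewer orders / staler logins -> more likely to churn.
y = ((-1.5 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=500)) > 0).astype(int)

model = LogisticRegression().fit(X, y)
print("P(churn) for first 5 customers:",
      model.predict_proba(X[:5])[:, 1].round(2))
```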

Skills: You also need to upgrade your skills to be able to solve data science problems. BI focuses heavily on technology and automation, so you may need to unlearn a few things. For example, automation is not always important: you might work on problems where a model only needs to make predictions a couple of times, and trying to automate that wouldn’t be optimal. Also, BI relies heavily on tools, but in data science you’ll need deeper domain knowledge & a problem-solving approach along with technical skills.

Also, I personally moved from BI (as a consultant) -> Analytics (as an Analytics Manager) -> Data Science (as a Sr. Data Scientist), and this path has been super helpful for me. I recommend transitioning into analytics first and then eventually breaking into data science.

Hope that helps!

VIEW THREAD ON REDDIT

Looker vs Tableau: How would you compare them in terms of price & capabilities?

[Update 6/10/2019: Looker has been acquired by Google and Tableau has been acquired by Salesforce]

Someone asked this on Quora, so here’s my response: This is a great question, and one that I figured out when I led Analytics at Kiva.org last year, so I am happy to add my perspective on Looker vs Tableau.

Let’s talk about capabilities first and then price.

Capabilities — Looker vs Tableau

Even though both of these tools are classified under Business Intelligence, they have some pretty clear product differentiation. In this section, I will lay out the three main components of Business Intelligence platforms and then map them back to the core strengths of each product.

Business Intelligence platforms typically have three main components:

  1. Data Collection, Storage & Access
  2. Data Modeling
  3. Data Visualization

#1) Data Collection, Storage & Access: Neither of these tools does data collection & storage. You will need infrastructure to collect data and store it; typically it lands in databases, and you access it from either tool using SQL. Note: on the surface it might look like Tableau supports more data sources than Looker, but there is usually a workaround to get your data into one of the data sources supported by Looker and take it from there, so I am not awarding extra points to Tableau for this. Also, I am personally a big proponent of using analytic databases like Redshift, Vertica, BigQuery & Azure DW for analytics applications, and Looker & Tableau both support them, so I’m calling this one a tie!

#2) Data Modeling: This is Looker’s core strength, by a wide margin! Why? Because of LookML, their data modeling layer; I am super impressed by it after using it for a while now. So let’s chat about what a data modeling layer means and why you should care.

Data modeling (in this context) means creating data models that take your raw data as input and produce data that has been cleaned, combined, curated & converted, ready for analysis.
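
Here is a minimal sketch of that idea. This is not LookML itself, just the concept expressed in plain Python + SQL against a hypothetical raw_orders table: raw rows go in, and an analysis-ready model comes out that anyone can query.

```python
# Data-model sketch: curate raw rows into an analysis-ready view.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INT, user_id INT, amount REAL,
                             status TEXT, created_at TEXT);
    INSERT INTO raw_orders VALUES
        (1, 10, 25.0, 'complete', '2019-01-05'),
        (2, 10, NULL, 'error',    '2019-01-06'),
        (3, 11, 40.0, 'complete', '2019-01-06');

    -- The curated model: bad rows filtered out, grain = one row per user/day.
    CREATE VIEW daily_revenue AS
    SELECT user_id, created_at AS day, SUM(amount) AS revenue
    FROM raw_orders
    WHERE status = 'complete' AND amount IS NOT NULL
    GROUP BY user_id, created_at;
""")
print(conn.execute("SELECT * FROM daily_revenue").fetchall())
```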

Why is this important? Not everyone can clean, curate, combine & convert raw data into analysis-friendly data assets. That’s what data analysts are trained in and specialize in. Maybe in the future we will have tools that do this, or we will see plug-and-play (aka turnkey) solutions for a few key analysis needs, but for now you need data analysts who can create these data models.

Now there are two ways to create data models:

You can create them on-the-fly (ad-hoc), or you can publish all of these data models on a platform (like Looker).

There are all sorts of issues with doing it on-the-fly. It works for small teams (<20–30 people), but beyond that you need some process in place. For instance: you can’t automate the data models you need often, so that’s wasted time; you can’t easily share these models with others; and it creates a single point of failure, so if the analyst is sick or on vacation, no one gets “insights” from data and the world stops spinning. So self-service becomes valuable once you have more than a few business users who want to consume data.

So what does a self-service platform bring to the table? It helps data analysts build these wonderful analysis-friendly models and publish them so that everyone who cares in the org can access them. Consumers can then focus on the analysis itself and not worry about the not-so-fun work of making data ready for analysis. This also brings all sorts of other benefits: standardized metric definitions, trusted data sources, better collaboration among analysts, a speedier model-delivery process, getting out of Excel hell, and what not!

Think of it this way: if all key data models are available on your self-service platform, then your data analysts can focus on 1) advanced stuff = more $$$ and 2) building more data models (so they can do even more advanced stuff later, and earn more $$$!).

This is where Looker fits in. Looker is great at this data modeling thing; its platform is amazing for anyone looking to solve this problem. You can also do data visualization on top of it and build dashboards.

Alright, moving on:

#3) Data Visualization: This is Tableau’s forte! No one does data viz better than Tableau, at least right now. There are vendors investing significant resources in this space, and they are getting close, but Tableau is still the leader.

Having said that, let’s map this back to how it helps business users & analysts:

Business users and self-service environments:

Tableau is not great at the data modeling thing. Yes, you can do basic clean/combine/curate/convert work, but it doesn’t hold up for intermediate to advanced needs. So if you already have a self-service data modeling layer in place that Tableau can connect to, and you are looking for a data visualization layer, then go for Tableau! You will be able to create amazing visuals, dashboards, and stories that will WOW your business users. But to make sure this scales, you need to seriously think about either 1) how to overcome the limitations of Tableau’s data modeling layer, or 2) using some other tool to build the data modeling layer and connecting Tableau to it.

Pro Tip: I highly recommend trying out trials of these products and seeing what works best!

Analysts:

Tableau shines at data discovery! While this certainly helps business users, it’s best leveraged by analysts: whenever they are working on ad-hoc data analysis projects (one-time, strategic in nature), they can be much more effective at discovering the underlying trends and patterns in their data by visualizing it in Tableau.

With that context, you might be wondering: what tool did I champion & implement at Kiva?

It is public knowledge that Kiva is a Looker customer (its logo is on Looker’s website), so I can share this.

After evaluating 30+ tools (including Tableau), I ended up championing Looker, and eventually leading the initial implementation sprints, because the goals & vision we had for Kiva’s data & analytics platform aligned better with having a data modeling layer that met Kiva’s needs. So figure out your own goals and vision first, and then choose tools within that framework.

Pro Tip #1: It’s insanely hard to figure out your goals and vision for analytics in an org. To get there, chat with organizations in the same industry at a similar size & stage. Ask them what they use and whether it worked for them. Ask them about their return on investment. This is a great way to get external feedback, but you still need to figure out your internal needs and prioritize them.

Pro Tip #2: Both of these tools have amazing reviews! You will see them ranked highly in analyst reports too. This is great, but it’s more important than ever to clearly define what your organization needs, map that back to the core strengths of these products (or any other tool, for that matter), and go from there!

[I am happy to help evaluate the right tool for your needs; feel free to contact me: Let’s Connect! – Insight Extractor – Blog]

Pricing — Looker vs Tableau

I can’t talk about Looker’s pricing because it’s not public, I apologize! You will need to contact them for a quote.

However, you can anchor against Tableau’s pricing, which is public: Buy Tableau | Tableau Webstore

Your analysts and power users will need Tableau Desktop/Professional, which cost $1K and $2K respectively (a one-time purchase), and then the price varies depending on your deployment model, cloud or self-hosted:

[Image: Tableau pricing by deployment model (cloud vs self-hosted)]

*Note that Tableau Online is a subscription model, so you can definitely start small, say with 5 business users in a department, and take it from there. If you grow, you can later look at other tools like Looker. (If you are growing rapidly, account for the non-trivial time needed to migrate from one platform to another; it might be worth picking the right tool from the get-go.)

Pro Tip: I encourage you to build an ROI model too. You know, use some analytics for your analytics projects 😉 (I apologize, couldn’t resist!). Anyhow, the point is that instead of just thinking about the “cost”, think about the value-add and anchor your investment figure to that. There’s a reason some analytics tools are priced at, say, $1,000 and others at $100,000: they have different value propositions, and if you know how to extract value from a tool and can project that value, you can get a better ROI!
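
Here is a minimal sketch of such an ROI model (every number is hypothetical): anchor the license cost against the value the tool creates.

```python
# ROI sketch: value created vs. tool cost (all inputs hypothetical).
annual_license = 50_000      # tool cost per year
analysts = 4
hours_saved_per_week = 6     # per analyst, thanks to the tool
hourly_cost = 75             # fully loaded analyst cost

value = analysts * hours_saved_per_week * 52 * hourly_cost
roi = (value - annual_license) / annual_license
print(f"Value created: ${value:,}/yr, ROI: {roi:.0%}")
```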

Hope that helps! If I can be of any further help, email me or comment here! Let’s Connect! – Insight Extractor – Blog

VIEW QUESTION ON QUORA

How do I pursue a career in data warehousing?

Someone asked this on Quora, and here’s my reply:

In the data world there are two broad sets of jobs available:

  1. Engineering-oriented: Data engineers, data warehousing specialists, big data engineers, business intelligence engineers. All of these roles are focused on building data pipelines using code/tools to get data into some centralized location.
  2. Business-oriented: Data analysts, data scientists. All of these roles involve using data (from those centralized sources) to help business leaders make better decisions.*

*Smaller companies (or startups) tend to have roles where small teams (or just one person) do it all, so the distinction is not as apparent.

So, it seems like you are interested in engineering-oriented roles: the ones focused on building data pipelines. Since you are starting out, I would suggest broadening your scope to learn about other tools as well. While data warehousing is still relevant, and will be in some form for the next few years, the industry (especially tech companies) has been slowly moving toward big data technologies, and you need to be able to adapt to these changes. So learn about data warehousing, maybe get a job/internship as an ETL/BI engineer, but keep an eye on other data engineering tools like the Hadoop ecosystem, Spark, Python, etc.

VIEW QUESTION ON QUORA