Four Tenets for effective Metrics Design

The goal of this blog post is to provide four tenets for effective metrics design.

What is a tenet?

A tenet is a principle honored by a group of people.

Why is effective metrics design important?

Metrics help with business decision-making. Picking the right metrics increases the odds that decisions are made through data rather than gut or intuition, which can be the difference between success and failure.

Four Tenets for effective metrics design:

  1. We will prioritize quality over quantity of metrics: If teams track too many metrics, it’s hard for decision-makers to swarm on the areas that matter most. Having many metrics also decreases the odds of each one meeting the bar for quality. If you have a few metrics that are well thought out and meet the other tenets listed in this post, you increase the odds of having a solid data-driven culture. I am not being prescriptive about the right number of metrics; you should experiment and figure that out. However, I can give you a range: fewer than 3 key metrics might be too few, and more than 15 is a sign that you need to trim the list.
  2. We will design metrics that are behavior changing (aka actionable): A litmus test for this is to ask your business decision-makers to articulate what they will do if the metric 1) goes up N% (let’s say 5%), 2) stays flat, or 3) goes down N%. They should have a clear answer for at least two of the three scenarios above. If they can’t map a behavior change or action to the metric, then it is not as important as you think, and that is a sign you can cut it from your “must-have” metrics list. This doesn’t mean that you don’t track it, but it gives you a framework to prioritize other metrics over it, or to iterate on the metric’s design until it is behavior changing.
  3. We will design metrics that are easy to understand: If your metrics are hard to understand, it’s harder to take action on them, so this is a prerequisite for making your metrics behavior changing. Beyond increasing the odds of your metrics being actionable, you are also making them appeal to a wider audience in your teams instead of just the key business decision-makers. Having a wide group of people understand your metrics is key to having a solid data-driven culture.
  4. We will design metrics that are easy to compare: Metrics that are easy to compare across time periods, customer segments, and other business constructs are easier to understand and act on. For example, if I tell you that we had 1,000 paying customers last week and again this week, that doesn’t give you enough signal to tell whether it’s good or bad. But if I share that last week our conversion rate was 2.3% and this week it is 2.1%, then you know that something needs to be fixed in your conversion funnel given the 20 bps drop. Ratios and rates are easy to compare, so one tactical tip is to see if a ratio or rate makes sense in your case. Also, if your metrics are easy to compare, that increases the odds of them being behavior changing, just like I showed you through the example.
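
To make the fourth tenet concrete, here is a minimal Python sketch of the conversion-rate comparison. The visitor counts are hypothetical, picked only so that the rates come out near the 2.3% and 2.1% used above, and the function names are my own for illustration.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Return conversions / visitors as a fraction (0.021 means 2.1%)."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def change_in_bps(previous_rate: float, current_rate: float) -> float:
    """Return the change between two rates in basis points.

    One basis point (bps) is 0.01 percentage points, i.e. 0.0001 as a fraction.
    """
    return round((current_rate - previous_rate) * 10_000, 2)


# Hypothetical traffic: the same 1,000 paying customers in both weeks,
# but more visitors this week, so the rate is lower.
last_week = conversion_rate(conversions=1000, visitors=43_478)
this_week = conversion_rate(conversions=1000, visitors=47_619)

print(f"Last week: {last_week:.1%}")
print(f"This week: {this_week:.1%}")
print(f"Change: {change_in_bps(last_week, this_week)} bps")
```

The raw count is flat, so it hides the problem; the rate makes the week-over-week drop visible.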

Conclusion:

In this blog post, you learned about effective metric design.

What are your tips for picking good metrics? Would love to hear your thoughts!

Data Culture Mental Model.

What is Data Culture?

First, let’s define culture: “The set of shared values, goals, and practices that characterizes a group of people” (Source)

Now, building on that to define data culture: what is the set of shared values? That decisions will be made based on insights generated through data. And the group of people is all the decision makers in the organization. So in other words:

An org that has a great data culture will have a group of decision makers that use data & insights to make decisions.

Why is building data culture important?

There are two ways to make decisions: one that uses data and one that doesn’t. My hypothesis is that decisions made through data are less wrong. To make this happen in your org, you need to have a plan. In the sections below, I’ll share the key ingredients and a mental model for building a data culture.

What are the ingredients for a successful data culture?

It’s the 3 P’s: Platform, Process, and People, plus continuously iterating on and improving each of the P’s to improve the data culture.

How to build data culture?

Here’s a mental model for a leader within an org:

  1. Understand data needs and prioritize
  2. Hire the right people
  3. Set team goals and define success
  4. Build something that people use
  5. Iterate on the data product and make it better
  6. Launch and communicate broadly
  7. Provide Training & Support
  8. Celebrate wins and communicate progress against goals
  9. Continue to build and identify the next set of data needs

Disclaimer: The opinions are my own and don’t represent my employer’s view.

How Analytics changed Scouting in Soccer

Here’s an interesting video that’s a great reminder of how analytics is a game-changer when applied correctly. The video shows how small clubs use analytics to compete with big clubs and not only stay relevant but grow in the process.

A similar analogy can be drawn for startups (or early-to-mid stage products inside big companies), which can use analytics to compete with incumbents in the market.

Let me know what you think. What’s your favorite analogy to help explain why analytics is useful to your org?

Can I be a data analyst at a tech company without a degree in computer science?

Yes. A CS degree is not a must-have to work as a data analyst. In fact, a lot of people come from a non-CS background and succeed in this role!

Let’s look at the pros and cons of having a computer science (CS) degree, which should help you evaluate where you fall:

Pros of having a CS-degree:

  • If the data analyst position requires a degree in CS, then you qualify! Fortunately, this is not that common. Usually the posting says a bachelor’s is required in CS, business administration, or a related field, so as long as you have a bachelor’s for positions that require one, you should be fine.
  • You might already have the basic tech skills needed for data analysis jobs, and the CS degree can be used to validate that.
  • You can pick up new tech concepts and tools fast(er). With a CS background, it’s easier to pick up new concepts and tools, and you need to do that continuously to stay relevant.

Cons of having a CS-degree:

  • You might not have enough business problem-solving experience and/or might lack depth in business knowledge. So if you have a degree in business, then you come out ahead, especially if your background aligns with the role. For example, if you focused on marketing in your bachelor’s and the role is focused on marketing analytics, then you might have an edge.
  • I have a CS degree, which I followed up with a master’s from a “business school”, so this is just based on my experience: a few CS students (without real-world experience) are inclined to focus on “automation” and the “bleeding edge” instead of on what the problem needs. A lot of data analysis doesn’t need to be automated or shouldn’t be automated, and not every company needs <<insert the latest tech trend here: big data, deep learning>>, but CS students tend to reach for it because that’s what they feel most comfortable with. While that doesn’t stop them from getting the job, it can impede their growth as a data analyst within the org.

Conclusion:

So, as you can see, even if you don’t have a CS degree, you can still find roles that align with your other skills. In fact, you might come out ahead if you can prove that you have the basic quantitative and tech skills needed to get the job done.

Related: Paras Doshi’s answer to How do I prepare myself for a career in Data Analysis?

How to create a Histogram in Excel?

A histogram is a powerful data analysis technique: it lets you quickly see the distribution of the data you have. So in this post, I am going to list the steps to create a histogram in Excel.

It’s a two-step process.

  1. Install the “Analysis ToolPak” (free Excel add-in)
  2. Format the data and build the histogram

Step 1: Install the Analysis ToolPak

One of the most useful data analysis add-ins in Excel is not enabled by default! It’s called the “Analysis ToolPak”.

To activate it, go to File > Excel Options > Add-ins. In the Manage field, select Excel Add-ins and click Go.

Make sure that the Analysis ToolPak is checked and click OK.

(Solver is a great add-in as well. It’s outside the scope of this article, but it’s powerful: for instance, it lets you work on optimization problems.)

Step 2: Format Data and build the Histogram

So now let’s format the data.

You need two things to create a histogram: 1) Data 2) Range (the bins)

Here’s an example: (I have about 3000 numbers and need to see the distribution)

You could have other fields on the sheet as well but you need at least the data field. Range is optional but I recommend that you specify the Range so that your histogram would have the bins that you specified — otherwise you could have just used a bar chart!

Note that both of them are numerical.

Now go to Menu Bar > Data > Data Analysis

Out of the options available, click on Histogram, then select the Input Range and the Bin Range. After you’re done, click OK.

You should see a new worksheet with a table of bins and frequencies (ready for charting!). Now, create a bar chart from that table and you have your histogram:
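
If you want to see what the binning step is doing, here is a minimal Python sketch. It is not part of the Excel workflow; it assumes the usual Histogram tool convention that a value is counted in the first bin whose number it is less than or equal to, with anything above the last bin counted under “More”. The function name and sample numbers are hypothetical.

```python
from bisect import bisect_left


def bin_frequencies(data, bin_edges):
    """Count how many values fall into each bin.

    A value lands in the first bin whose edge is >= the value;
    values above the last edge go into a final "More" bucket.
    """
    edges = sorted(bin_edges)
    counts = [0] * (len(edges) + 1)  # one extra slot for "More"
    for value in data:
        counts[bisect_left(edges, value)] += 1
    labels = [str(edge) for edge in edges] + ["More"]
    return list(zip(labels, counts))


# Hypothetical data and bins, standing in for the Input Range and Bin Range.
sample = [3, 7, 10, 12, 18, 20, 25, 31]
for label, count in bin_frequencies(sample, [10, 20, 30]):
    print(f"{label:>5} | {'#' * count} ({count})")
```

Each printed row corresponds to one bar in the chart you build from the output worksheet.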

Conclusion:

In this post, I listed the steps you can take to create a histogram in Excel. Note that there are other options as well, like R (the hist function), that let you build a histogram, so you do have a choice of tools. But if you want to stick with Excel and it’s good enough, then you now know how. Cheers!

Related Post: What is the difference between Histogram & Bar Chart?

Machine Learning Algorithm Cheat Sheet:

If you’re getting started with Data Science & Machine Learning then I think this would be a great resource for you. This “cheat sheet” helps you select the “algorithm” to test depending on the problem you are trying to solve and the data-set that you have.

Download link: http://aka.ms/MLCheatSheet (Courtesy: Azure Machine Learning)

Also, even though the cheat sheet was created to help you with the “Azure Machine Learning” product, it’s still valid if you use other machine learning tools.

How to get descriptive statistics in Excel?

Problem:

You are analyzing a dataset and, before modeling or analyzing it, you need to generate descriptive statistics on a field. You have the data loaded in Excel and wonder if there’s a way to do that in Excel.

Solution:

There’s an out of the box solution that will support your needs to generate descriptive statistics on a field. Here are the steps:

Note: for the purpose of this blog post, I am using Excel 2013, but the Analysis ToolPak is available in Excel 2007+.

1. Activate the “Analysis ToolPak”.

Follow these steps: File > Options > Add-ins > Manage: Excel Add-ins > “Go”

2. Make sure to check the “Analysis ToolPak” checkbox.

3. Now you should see a “Data Analysis” option under the “Data” tab.

4. Now click on “Data Analysis” and select one of the following options:

Anova, Correlation, Covariance, Descriptive Statistics, Exponential Smoothing, F-Test Two-Sample for Variances, Fourier Analysis, Histogram, Moving Average, Random Number Generation, Rank and Percentile, Regression, Sampling, t-Test, z-Test.

In this case, let’s go with Descriptive Statistics, but you can see that you can perform other tasks as well.

5. Once you click on Descriptive Statistics, a dialog box will show up and you will have to enter some details, like your input range. Once you have filled in the details needed, click OK and it should generate descriptive statistics for you in Excel!
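
For reference, here is a minimal Python sketch of several of the figures this kind of summary typically includes (mean, standard error, median, mode, standard deviation, sample variance, range, minimum, maximum, sum, count). It uses only the standard library; the `describe` helper and the sample values are hypothetical, and it is meant as a cross-check rather than a replacement for the Excel output.

```python
import math
import statistics


def describe(values):
    """Return common descriptive statistics for a list of numbers.

    Uses the sample (n - 1) formulas for variance and standard deviation.
    """
    n = len(values)
    stdev = statistics.stdev(values)
    return {
        "Mean": statistics.mean(values),
        "Standard Error": stdev / math.sqrt(n),
        "Median": statistics.median(values),
        "Mode": statistics.mode(values),
        "Standard Deviation": stdev,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": n,
    }


# Hypothetical field values, standing in for the input range.
field = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in describe(field).items():
    print(f"{name:<20}{value}")
```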

I hope that helps!

Conclusion:

In this post, we saw how to generate descriptive statistics in Microsoft Excel.

Author: Paras Doshi

Back to basics: Continuous vs. Discrete variables and their importance in Data Visualization.

Take a look at the following chart. Do you see any issues with it?

[Chart: month trend line chart]

Notice that the month values are shown as “distinct” (discrete) values instead of “continuous” values, which misleads the person looking at the chart. Agree? Great! You already know, based on your instincts, what continuous and discrete values are; we just need to label what you already know.

In the example used above, the “Date & Time” shown as “Sales Date” is a continuous value, since you can never state the “exact” time that the event occurred: 1/1/2008, 22 hours, 15 minutes, 7 seconds, 5 milliseconds… and it goes on. It’s continuous.

But let’s say you wanted to see Number of Units Sold vs. Product Name. Now that’s countable, isn’t it? You can say that we sold 150 units of Product X and 250 units of Product Y. In this case, Units Sold is a discrete value.

The chart shown above was treating Sales Date as discrete values and hence causing confusion. Let’s fix it, now that you know the difference between continuous and discrete variables:

[Chart: the same trend with Sales Date treated as a continuous variable]
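
One common way a date ends up being treated as discrete is when it is stored as text. That is my assumption for illustration; the post does not say how the chart above was built. The minimal Python sketch below, with hypothetical month labels and sales numbers, shows the difference: text labels are just categories and sort alphabetically, while parsed dates sit on a continuous timeline.

```python
from datetime import datetime

# Hypothetical monthly sales, keyed by a month label stored as text.
sales_by_month = {"1/2008": 120, "2/2008": 135, "10/2008": 150, "11/2008": 160}

# Treated as text, the labels are discrete categories and sort alphabetically,
# so October and November land before February.
as_text = sorted(sales_by_month)

# Parsed into dates, the same values order chronologically.
as_dates = sorted(datetime.strptime(label, "%m/%Y") for label in sales_by_month)

print(as_text)
print([d.strftime("%Y-%m") for d in as_dates])
```

Most charting tools make the same distinction: a text axis gets one evenly spaced slot per label, while a date axis spaces points by elapsed time.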

Conclusion:

To develop effective data visualizations, it’s important to understand the data types of your data. In this post, you saw the difference between continuous and discrete variables and their importance in data visualization.

What is the purpose of creating Tables & Graphs?

Knowing why we do what we do is important. Stephen Few lists four reasons for creating tables and graphs in his book “Show Me the Numbers”. I really liked them, so I am posting them here for your reference:

  1. It helps us communicate. It helps us present information to others.
  2. It helps us analyze data. It helps us find the insights in the data.
  3. It helps us monitor performance. It helps us keep track of information about performance, e.g. sales performance, speed of manufacturing, etc.
  4. It helps us plan. It helps us predict and prepare for the future.

Statistics 101: Nominal, Ordinal, Interval, Ratio Data

If you work with any statistical analysis tool, you may have run into configuring your data as one of the following categories: Nominal, Ordinal, Interval, or Ratio.

Here is what each term means:

Nominal: Simply names, or sets of characters. Example: full names, fruits, cars, etc.
Ordinal: Nominal + the values have an order. Example: small, medium, big.
Interval: Ordinal + the intervals between values are equal. Example: temperature on the Fahrenheit scale: 10, 20, 30, etc.

Note that 40F is not twice as warm as 20F, so multiplication does not make sense on Interval data, but addition and subtraction work. Which brings us to the next point: Ratio

Ratio: Interval + multiplication makes sense. Example: weight: 60 KG, 120 KG. 120 KG = 2 * 60 KG
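
Here is a minimal Python sketch of the Interval vs. Ratio distinction, using the same numbers as the examples. The conversion to Kelvin is my addition: Kelvin has a true zero, so it shows why the ratio of two Fahrenheit readings is not meaningful while the ratio of two weights is.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert Fahrenheit to Kelvin, a temperature scale with a true zero."""
    return (deg_f - 32) * 5 / 9 + 273.15


# Interval data: differences are meaningful, ratios are not.
difference_f = 40 - 20  # a 20-degree gap is a valid statement
naive_ratio = 40 / 20   # suggests "twice as warm", which is misleading
true_ratio = fahrenheit_to_kelvin(40) / fahrenheit_to_kelvin(20)

# Ratio data: zero means "none", so ratios are meaningful.
weight_ratio = 120 / 60  # 120 KG really is twice 60 KG

print(f"Difference: {difference_f}F")
print(f"Naive Fahrenheit ratio: {naive_ratio:.2f}")
print(f"Ratio on the Kelvin scale: {true_ratio:.2f}")
print(f"Weight ratio: {weight_ratio:.2f}")
```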

I hope the examples are of help when you are working with statistical analysis tools and need to categorize the data.