What is Descriptive Analysis?

Standard

I classify analytics into four kinds:

  1. Descriptive — Reporting “What” Happened
  2. Diagnostic — “Why” is Happened
  3. Predictive — What’s going to happen next?
  4. Prescriptive — How can I use all these things to take business decisions/actions.

With that overview, let’s look at Descriptive analysis a little bit more. This is usually the first step for any organization to start getting value out of all their data. They should be able to answer questions like: What were my sales last quarter? How about same quarter last year? Then compare them to see if they made progress. They can also report on sales (Actual vs goals) for last n months and see if they are trending in the right direction. Things like this! Once you have a good process and platform to get this right then the organization is ready to advance to next step which is diagnostic and this is where you start analyzing the key drivers and underlying reasons to figure “why” it’s happening. But you need start at Descriptive!

Hope that helps!

view Question on Quora

Dashboard – Asset management & planning for a global crisis response team:

Standard

Problem:

Asset (Volunteers, Field offices & Equipments) management & planning for a global crisis response team.

Solution:

Working in a team, we created statistical surveys for field works to collect data about current state & estimated future needs. We also helped them with data gathering & cleaning tasks. After that, we helped them analyze & visualize the data to find actions for executives leading the global crisis response team.

Here’s a mockup of one of the ten data visualization created for them:

Asset Management Global crisis response

How to get descriptive statistics in Excel?

Standard

Problem:

you are analyzing a dataset and before modeling/analyzing you need to generate descriptive statistics on a field. you have the data loaded in Excel and wondered if there’s a way to do that in Excel.

Solution:

There’s an out of the box solution that will support your needs to generate descriptive statistics on a field. Here are the steps:

Note: for the purpose of this blog post, I am using Excel 2013 but data analysis toolpak is available in Excel 2007+.

1. Active “Data Analysis” toolpak.

Follow this steps:  File > Options > Add-ins > Manage: Excel Addins > “GO”

excel data analysis toolpak

2. make sure to check the “analysis toolpak” checkbox.

3. Now you should see a “data analysis” option under the “Data” pane:

Excel Data Analysis Descriptive Statistics

4. Now click on “Data Analysis” and select one of the following options:

Anova, Correlation, Covariance, Descriptive Statistics, Exponential Smoothing, F-Test Two-Sample for Variances, Fourier Analysis, Histogram, Moving Average, Random Number Generation, Rank and Percentile, Regression, Sampling, t-Test, z-Test.

in this case, let’s go with descriptive statistics but you can see that you can perform other tasks as well.

5. Once you click on the descriptive statistics, a dialog box will show up and you will have to enter some data like your input range to generate descriptive statistics. Once you have filled the data needed, click on OK and it should generate descriptive statistics for you in EXCEL!

I hope that helps!

Conclusion:

In this post, we saw how to generate descriptive statistics in Microsoft Excel.

Author: Paras Doshi

Back to basics: continuous Vs. Discrete variables and their importance in Data Visualization.

Standard

Take a look at the following chart, do you see any issues with it?

month trend chart line chart string to date

Notice that the month values are shown as “distinct” values instead of shown as a “continuous” values and it misleads the person looking at the chart.  Agree? Great! You already know based on your instincts what continuous and discrete values are, it’s just that we will need to label what you already know.

In the example used above, the “Date & Time” shown as a “Sales Date” is a continuous value since you can’t never say the “Exact” time that the event occurred…1/1/2008 22 hours, 15 minutes, 7 seconds, 5 milliseconds…and it goes on…it’s continuous.

But let’s say you wanted to see Number of Units Sold Vs Product Name. now that’s countable, isn’t it? You can say that we sold 150 units of Product X and 250 units of product Y. In this case, Units sold becomes discrete value.

The chart shown above was treating Sales Date as discrete values and hence causing confusion…let’s fix it since now you the difference between continuous and discrete variables:

Statistics Discrete Continuos Variable Data Visualization

Conclusion:

To develop effective data visualizations, it’s important to understand the data types of your data. In this post, you saw the difference between continuous and discrete variables and their importance in data visualization.

A great example to show power of visualizing data: Anscombe’s Quartet Table

Standard

Let’s look at four datasets which have identical statistical properties:
Here’s the DATA:

ansombe quarter data visualizeHere’s their statistical properties:

Property Value
Mean of x in each case 9 (exact)
Variance of x in each case 11 (exact)
Mean of y in each case 7.50 (to 2 decimal places)
Variance of y in each case 4.122 or 4.127 (to 3 decimal places)
Correlation between x and y in each case 0.816 (to 3 decimal places)
Linear regression line in each case y = 3.00 + 0.500x

They look identical – don’t they? BUT let’s visualize the data:

Anscombe quarter data visualizationOnly visualizing data made it possible for us to understand and appreciate the “difference” between data-sets. Looking at just statistical properties made them appear “similar” – moral of the story: Visualize data! Graph data along with investigating statistical properties.

Source: Anscombe’s quartet

Adding a TrendLine to a Time Series Line Chart in Excel 2010:

Standard

I was playing w/ a time series data set in Excel 2010 and learned how to add a Trend-line and in this blog post, I’ll share how I added it:

First up, How is Trend-line useful? Here are few answers:
– It helps us see how data is changing over time, in other words, it helps us find “trends”
– It helps us forecast future.

With that, here is the chart without Trend-line:on time flight arrivals excel without trendline

Now let’s add the trend-line and you’ll be able to compare on your own how Trend-line makes it easier to spot “trends”. Here are the steps:

1. select the line > right-click > add trend line

add trendline time series

2. configure the trend-line options

trend line configuration options excel

3. I also changed the line style

4. And Here’s the chart w/ trend-line

american airlines on time flight arrivals excel with trendline

Conclusion:

In this post, we saw how to add trend-line in the time series chart in excel 2010

Statistics 101: Nominal, Ordinal, Interval, Ratio Data

Standard

If you work with any statistical analysis tool, sometimes you may have run into configuring the data into either of these following categories: Nominal, Ordinal, Interval, Ratio

Here is what each term means:

Nominal Simply names or call them set of characters Example: Full name, fruits, cars, etc
Ordinal Nominal + They have order Example: Small, medium, big
Interval Ordinal + the intervals between each value are equally split Example: temperature in Fahrenheit scale:10 20 30 etc

Note that 20F is not twice as cold as 40F. So multiplication does not make sense on Interval data. But addition and subtraction works. Which brings us to next point: Ratio

Ratio Interval + multiplication makes sense Weight: 60KG, 120KG.120 KG = 2 * 60 KG

I hope the examples are of help when you are working with statistical analysis tools and need to categorize the data.