Tableau: Data Cleaning for Geographic Maps

Data cleaning is a major part of any analytics or data-visualization undertaking. If data cleaning is ignored, it leads to inaccurate data reporting and thus suboptimal business decisions.

To that end, if you create a geographic map in Tableau, please check the accuracy of your data by going to:

Menu Bar > Map > Edit Locations

Let me give you some examples:

I had “State/Province” as the geographic role for my variable, but when I created a geographic map, it didn’t show anything for New York State! See before:

[Image: data cleaning geographic map, before]

So what did I do?

I navigated to Menu Bar > Map > Edit Locations:

[Image: data cleaning geographic map, State]

So I fixed it!

[Image: data cleaning geographic map, Tableau]

And After:

[Image: data cleaning geographic map, after]

Note that New York State is now lit up!

In the past, I’ve also entered latitude and longitude when needed. This was when Tableau could not recognize a few US cities and flagged them as “ambiguous”; I entered the latitude and longitude to clean the data:

[Image: data cleaning geographic map, city]
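If you prefer to catch such values before they reach Tableau, a quick check outside the tool can help. Here is a minimal Python sketch, not a Tableau feature: the reference list `KNOWN_STATES` is a deliberately short, hypothetical sample, and the input values are made up for illustration.

```python
# Hypothetical, shortened reference list; a real check would use every state name.
KNOWN_STATES = {"New York", "New Jersey", "California", "Texas"}


def find_unrecognized(values, known=KNOWN_STATES):
    """Return the distinct values that do not match a known state name.

    Matching ignores letter case and surrounding whitespace.
    """
    known_lower = {name.lower() for name in known}
    return sorted({v for v in values if v.strip().lower() not in known_lower})


# Made-up sample column, including a misspelling and a nonstandard label.
raw_states = ["New York", "NY State", "california", "Texas ", "Calfornia"]
print(find_unrecognized(raw_states))
```

Anything this prints is a candidate for fixing in the source data or in Edit Locations.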

Conclusion:

In this post, I described how to check the data accuracy of a Tableau geographic map.

Business Metrics #2 of N: Customer Retention Rate

In this post, we’ll explore a business metric called “Customer Retention Rate”.

What is it?

It is a metric that helps an organization monitor the percentage of customers retained.

Let me give you an example:

Year    Number of Customers    Retention Rate
0       100                    100%
1       85                     85%
2       70                     70%
3       65                     65%
4       61                     61%

Do you notice the third column, which keeps a tab on the percentage of customers retained? Each rate is the number of customers remaining that year divided by the original 100 customers. This is the basic idea behind customer retention rate.
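The arithmetic behind that third column can be sketched in a few lines of Python. This is just an illustration using the numbers from the table above; the function name is my own.

```python
def retention_rates(customer_counts):
    """Return each period's retention rate as a percentage of the starting count."""
    initial = customer_counts[0]
    return [round(100.0 * count / initial, 1) for count in customer_counts]


# Number of customers in years 0 through 4, taken from the table above.
customers_by_year = [100, 85, 70, 65, 61]

for year, rate in enumerate(retention_rates(customers_by_year)):
    print("Year {}: {}%".format(year, rate))
```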

How is it used?

This metric correlates with other key business performance measures such as customer service, product quality, and customer loyalty. Think about it: if the customer retention rate is high, then the organization must be doing “something” right. That something could be a great loyalty program, great customer service, or great product quality! If it’s low, then it requires some action from decision makers; they would want to know the reasons so that they could fix the situation.

In an earlier post, we talked about Customer Lifetime Value. A higher customer retention rate would also help us achieve a higher customer lifetime value.

It’s also important to realize that the cost of acquiring a new customer is typically higher than the cost of keeping an existing customer, so organizations that sell products or services like to measure the customer retention rate.

Also, if you have customer data, you can drill down to find trends in the retention rate. You can ask questions like: Which age group has the highest retention rate? Which has the lowest? What is the retention rate for male customers? You could also try to predict whether a new customer will be retained.
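As a rough illustration of that kind of drill-down, here is a small Python sketch that computes the retention rate per age group. The records and the age-group labels are made up for the example; real customer data would come from your own systems.

```python
from collections import defaultdict

# Made-up records for illustration: (age_group, was_the_customer_retained)
customers = [
    ("18-29", True), ("18-29", False), ("18-29", False), ("18-29", True),
    ("30-44", True), ("30-44", True), ("30-44", False), ("30-44", True),
    ("45+", True), ("45+", True),
]


def retention_by_group(records):
    """Return {group: retention rate in percent} for (group, retained) records."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for group, kept in records:
        totals[group] += 1
        if kept:
            retained[group] += 1
    return {group: 100.0 * retained[group] / totals[group] for group in totals}


for group, rate in sorted(retention_by_group(customers).items()):
    print("{}: {}%".format(group, rate))
```

The same grouping works for any other attribute you have, such as gender or region.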

Conclusion:

In this post, we learned about a business metric called “customer retention rate”.

And as a reminder, this series is meant to help you understand business metrics from an analytics perspective.

Beginner’s Guide: Sentiment Analysis using Python on Windows

This is a beginner’s guide to sentiment analysis using Python NLTK on Windows. We’ll start with installing Python and NLTK and then see how to perform sentiment analysis.

Step 1: Install Python & NLTK

I followed the steps listed on http://nltk.org/install.html

1. Search for Python 2.7.3 for Windows and install it.

2. Search for Python setuptools for Windows and install it.

3. Install pip (for 64-bit Windows), NLTK and PyYAML.

4. Test the installation: go to Start > All Programs > Python27 > IDLE, then type import nltk

5. Also type:

>>> import random

6. Also install the movie_reviews corpus by typing:

>>> nltk.download()

In the new window that opens, install the movie_reviews corpus.

[Image: Python NLTK download data]

Step 2: Sentiment Analysis

I followed the code explained in the NLTK book, in the “Document Classification” section of chapter 6, “Learning to Classify Text”. Here is the section: http://nltk.org/book/ch06.html#document-classification

Using that code, I was able to run the Naive Bayes classifier to categorize text:

[Image: Python sentiment analysis]
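To give a feel for what a Naive Bayes classifier does under the hood, here is a small standard-library sketch. To be clear, this is not the NLTK book’s code and does not use NLTK or the movie_reviews corpus: it is a simplified stand-in (word counts with add-one smoothing) trained on four made-up sentences, so treat it as an illustration of the idea only.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count documents per label and words per label from (words, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in labeled_docs:
        label_counts[label] += 1
        for word in words:
            word = word.lower()
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(model, words):
    """Return the label with the highest log-probability for the given words."""
    label_counts, word_counts, vocabulary = model
    total_docs = float(sum(label_counts.values()))
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Prior: how common this label is among the training documents.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            word = word.lower()
            if word not in vocabulary:
                continue  # skip words never seen during training
            # Add-one smoothing so unseen word/label pairs never give log(0).
            smoothed = (word_counts[label][word] + 1.0) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training sentences, for illustration only.
training = [
    ("a great and moving film".split(), "pos"),
    ("wonderful acting and a great story".split(), "pos"),
    ("a boring and terrible film".split(), "neg"),
    ("terrible acting and a dull story".split(), "neg"),
]
model = train(training)
print(classify(model, "great story".split()))
print(classify(model, "boring and dull".split()))
```

With real data you would use the NLTK classifier and the movie_reviews corpus as described in the book section linked above.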

Conclusion:

In this post, we learned how to perform sentiment analysis using Python on the Windows platform. NLTK supports classifiers other than Naive Bayes, and there are also resources that will help you increase the accuracy of the classifier. I hope that this post acts as a starting guide for you!

Three Data Collection Tips for Social Media Analytics

Data integrity is important, especially if critical business decisions are based on data. To that end, in this post I’ll share three data collection tips to help you have accurate data for “social media analytics”. These tips apply to social media analytics irrespective of the tool you are using:

1. Social Media Platform

Select the right social media platforms for capturing data. You do not want to select so few that you miss data. And you do not want to select irrelevant social media platforms, because if you do, you’ll introduce noise in the data. Let me take an example: if your project is focused on the USA only, then you do not need to add “Sina Weibo” (a Chinese social network) to your social media sources.

Now, based on the business need behind your “social media analytics” campaign, you should test all possible social media platforms; you never know who might be talking about things that you are interested in. After you have selected the right social media platforms for your project, let’s go to the next step:

2. “Search Keyword” Selection

Some social media platforms let you collect data via “search keywords”. For example, Twitter allows you to collect data via hashtags and/or keywords. So if you want to collect data about all social media posts mentioning “American Airlines”, then you should not collect data using:

American OR Airlines

If you select the above rule, it will introduce a LOT of noise, because we’ll collect posts from people talking about just “American” PLUS posts from people talking about just “airlines”. That’s bad! What you want is rules like these:

1. American AND airlines

2. “American Airlines” (as a phrase)

[Image: American Airlines social media]

Now, I can’t stress the importance of selecting the right search keywords enough. Choosing the wrong keywords will add noise, which is bad for analytics. So choose keywords such that you are neither adding noise nor missing out on conversations. There’s no secret formula here; continuous improvement is the way to go!
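To see the difference between these rules, here is a small Python sketch that applies an OR rule, an AND rule and a phrase rule to a handful of made-up posts. The matching functions are my own simplification; real collection tools have their own query syntax.

```python
import re


def words_in(text):
    """Return the set of lowercase words in a post."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches_or(text):
    """American OR airlines: either word is enough."""
    return bool({"american", "airlines"} & words_in(text))


def matches_and(text):
    """American AND airlines: both words must appear somewhere in the post."""
    return {"american", "airlines"} <= words_in(text)


def matches_phrase(text):
    """The two words must appear next to each other, as a phrase."""
    return re.search(r"\bamerican\s+airlines\b", text.lower()) is not None


# Made-up posts for illustration.
posts = [
    "My American Airlines flight was delayed again",
    "Proud to be an American",
    "Which airlines fly to Lisbon?",
    "American tourists love budget airlines",
]

for name, rule in [("OR", matches_or), ("AND", matches_and), ("PHRASE", matches_phrase)]:
    print(name, sum(1 for post in posts if rule(post)))
```

In this sample the OR rule collects every post, the AND rule still picks up one unrelated post, and only the phrase rule isolates the post about the airline. Note that a strict phrase rule can also miss relevant posts, so review what each rule leaves out.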

3. Language & Country Filtering

Social networks are GLOBAL in nature, so it’s important to filter by (or include) countries and languages based on the project you’re working on. Not doing so would add noise to your data. Also remember to include every relevant country and language, because you do not want to miss out on conversations either.
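As a simple illustration, here is a Python sketch of such a filter. The field names `lang` and `country` and the sample posts are assumptions made for this example; the actual fields depend on the platform or tool you collect from.

```python
def filter_posts(posts, languages, countries):
    """Keep only posts whose language and country are both in scope."""
    return [
        post for post in posts
        if post["lang"] in languages and post["country"] in countries
    ]


# Made-up posts with hypothetical "lang" and "country" fields.
posts = [
    {"text": "Flight delayed again", "lang": "en", "country": "US"},
    {"text": "Vuelo retrasado otra vez", "lang": "es", "country": "US"},
    {"text": "Flight delayed again", "lang": "en", "country": "GB"},
    {"text": "Mon vol est en retard", "lang": "fr", "country": "FR"},
]

# A USA-only project that still wants Spanish-language conversations.
in_scope = filter_posts(posts, languages={"en", "es"}, countries={"US"})
print(len(in_scope))
```

Including Spanish here is the “do not miss out on conversations” part: filtering to English only would have dropped a relevant US post.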

Conclusion:

The three data collection tips for social media analytics that I shared in this post are:

1. Select the right social media platforms

2. Select the right search keywords

3. Select the right country and language

Data Reporting ≠ Data Analysis

One of the key things I’ve learned is the importance of differentiating the concepts of “Data Reporting” and “Data Analysis”. So, let’s first see them visually:

[Image: data analysis and data reporting]

Here’s the logic for putting Data Reporting INSIDE Data Analysis: if you need to do “analysis”, then you need reports. But you do not necessarily have to do data analysis if you want to do data reporting.

From a process standpoint, here’s how you can visualize Data Reporting and Data Analysis:

[Image: data analysis and data reporting process]

Let’s think about this for a moment: why do we need “analysis”?

We need it because TOOLS are really great at generating data reports, but it requires a HUMAN BRAIN to translate those “data points/reports” into “business insights”. This process of looking at the data points and translating them into business insights is the core of what Data Analysis is. Here’s how it looks visually:

[Image: data analysis and data reporting]

Note that after performing data analysis, we have information like trends and insights, action items or recommendations, and the estimated impact on the business, all of which create business value.

Conclusion:

Data Reporting ≠ Data Analysis

Guest Blog: How to measure ROI of Social Media Marketing?

Introduction:

This is a guest blog by Jugal Shah. Jugal is pursuing an MBA with a focus on Marketing from a premier university in India. He shares his views on marketing, sales and strategy via his blog and Facebook. In this post, he briefly comments on “How to measure Social Media Marketing ROI”.

Jugal Shah’s Short post on Measuring Social Media Marketing ROI:

In social media marketing, ROI is not just in monetary terms. So, for social media ROI, my focus would be on:
1) How many people I have reached
2) How many people I have engaged through online activities
3) Becoming a conversation enabler and perception driver

Then focus on:

1) How much of the increased revenue is due to social media reach (you can do this by tracking referral links)
2) How many leads you generated through social media
3) How social media efforts helped resolve customer queries/problems and led to more customer satisfaction (remember, customer acquisition costs 10 times more than customer retention).
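As one way to picture the referral-link idea from the first point, here is a small Python 3 sketch that adds up revenue from orders whose referrer is a social media site. The order records, the `referrer` and `amount` field names, and the short `SOCIAL_DOMAINS` list are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumed, shortened list of social media domains for this example.
SOCIAL_DOMAINS = {"facebook.com", "twitter.com"}


def social_revenue(orders):
    """Sum the amounts of orders that were referred from a social media domain."""
    total = 0.0
    for order in orders:
        host = urlparse(order["referrer"]).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host in SOCIAL_DOMAINS:
            total += order["amount"]
    return total


# Made-up orders with hypothetical "referrer" and "amount" fields.
orders = [
    {"referrer": "https://www.facebook.com/somepage", "amount": 40.0},
    {"referrer": "https://twitter.com/someuser/status/1", "amount": 25.0},
    {"referrer": "https://www.google.com/search?q=shoes", "amount": 60.0},
    {"referrer": "", "amount": 15.0},
]

total_revenue = sum(order["amount"] for order in orders)
print(social_revenue(orders), total_revenue)
```

Comparing the two numbers gives a rough share of revenue that arrived through social media links.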

In a nutshell, it’s of utmost importance to use social media as a:

  • conversation enabler
  • perception driver
  • customer retention tool

Conclusion:

Paras: Jugal, thanks for this post. I am sure this short post will be great food for thought for readers who are interested in digital marketing analytics or analytics in general. Readers, feel free to reach out to him on his blog and/or Facebook page.

Business Analytics Continuum:

Think of a “continuum” as something you start and never stop improving upon. In my mind, the Business Analytics Continuum is a continuous investment of resources to take business analytics capabilities to the next level. So what are these levels? Douglas McDowell explained this concept in a recent post here. It was great food for thought for me, and hence I am posting about this concept here.

Here is the visual representation of the concept:

[Image: business analytics continuum]

I would encourage you to read the entire post and the other posts in the series here: PASS BAC Preview Series: Business Analytics Defined