Why is “Doing Data Science at Twitter” a great read?
This is an insider’s perspective from someone working at a company that I classify as having the highest level of analytics maturity. In other words, Twitter is known for applying knowledge gained from data science to its products and business processes.
It’s also important to recognize that every company is different, and the analytics and data science tools, techniques and processes it implements will vary with its analytics maturity. I love that this was one of the key insights shared in the article.
The article also describes two types of data scientists. I thought it was a great way to classify them because there’s a lot of confusion in the industry about what a data scientist does. With that, here’s the URL:
It’s great to see the insights that data can uncover. I came across a nice one in a report analyzing customer acquisition channels for e-commerce sites, and in this blog post I am sharing it with you. So what are the top customer acquisition channels for e-commerce sites? The top channels are organic search, email and paid search. Here’s the report: E-Commerce Customer Acquisition Snapshot
I was not surprised to see organic search and email among the top customer acquisition channels, but I was surprised by the relatively poor performance of social media in acquiring customers. Here’s the chart showing the performance of various online channels for acquiring customers:
Note #1: This post is NOT about devaluing the benefits of social media. It comes down to understanding the goals of having a social media presence in the first place. When computing the ROI of social media, there are other factors to consider, such as increased brand awareness and customer loyalty. I posted this data because it’s a great way to show how data can uncover insights, and sometimes they may surprise you.
Note #2: The percentages of customers acquired do not add up to 100% for a given year because the data does not include things like direct traffic. The author of the report confirmed this to me over email.
That’s about it for this post. Your comments are very welcome!
Data integrity is important, especially if critical business decisions are based on the data. To that end, in this post I’ll write about three data collection tips to help you have accurate data for “social media analytics”. These tips apply to social media analytics irrespective of the tool you are using:
1. Social Media Platform
Select the right social media platforms for capturing data. You do not want to select so few that you miss data. And you do not want to select irrelevant social media platforms because, if you do, you’ll introduce noise into the data. Let me take an example: if your project is focused on the USA only, then you do not need to add “Sina Weibo” (a Chinese social network) to your social media sources.
Now, based on the business need behind your “social media analytics” campaign, you should test all possible social media platforms; you never know who might be talking about things that you are interested in. After you have selected the right social media platforms for your project, let’s go to the next step:
2. “Search Keyword” Selection
Some social media platforms let you collect data via “search keywords”. For example, Twitter allows you to collect data via “hashtags” and/or keywords. So if you want to collect all social media posts that mention “american airlines”, then you should not collect data using:
AMERICAN OR Airlines:
If you select the above rule, it will introduce a LOT of noise because you’ll collect posts from people talking about just “American” PLUS posts from people talking about just “airlines”. That’s bad! What you want is rules like these:
1. American AND airlines
2. “American Airlines” (as a phrase)
Now, I can’t stress enough the importance of selecting the right search keywords. Choosing the wrong keywords adds noise, which is bad for analytics. So choose keywords such that you neither add noise nor miss out on conversations. There’s no secret formula here; continuous improvement is the way to go!
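To make the difference between these rules concrete, here’s a minimal Python sketch. It is only an illustration: the sample posts are made up, the helper names (`matches_or`, `matches_and`, `matches_phrase`) are hypothetical, and real collection tools have their own rule syntax that may behave differently.

```python
import re

def tokens(text):
    """Split a post into lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_or(text, keywords):
    """True if the post contains ANY of the keywords (the noisy rule)."""
    words = set(tokens(text))
    return any(k.lower() in words for k in keywords)

def matches_and(text, keywords):
    """True if the post contains ALL of the keywords, in any position."""
    words = set(tokens(text))
    return all(k.lower() in words for k in keywords)

def matches_phrase(text, phrase):
    """True if the keywords appear next to each other, in order."""
    needle = tokens(phrase)
    hay = tokens(text)
    n = len(needle)
    return any(hay[i:i + n] == needle for i in range(len(hay) - n + 1))

# Made-up sample posts, for illustration only
posts = [
    "Flying American Airlines to Dallas tomorrow",
    "Proud to be an American",
    "Airlines keep raising baggage fees",
    "Most airlines in the American market charge for bags",
]

or_hits = [p for p in posts if matches_or(p, ["american", "airlines"])]
and_hits = [p for p in posts if matches_and(p, ["american", "airlines"])]
phrase_hits = [p for p in posts if matches_phrase(p, "American Airlines")]

print(len(or_hits), len(and_hits), len(phrase_hits))
```

In this toy sample the OR rule matches every post, the AND rule still lets in one post that is not about the airline, and the phrase rule keeps only the relevant one. That is the kind of check worth repeating on a sample of your own data as you refine keywords.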
3. Language & Country Filtering
Social networks are GLOBAL in nature, so it’s important to filter by (or include) languages and countries based on the project that you’re working on. Not doing so would add noise to your data. Also remember to include every country and language that is relevant because you do not want to miss out on conversations either.
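As a rough sketch of what such a filter can look like once the data is collected, here’s a small Python example. The `Post` fields (`lang`, `country`) and the `keep_unknown` switch are assumptions made for illustration; the metadata your tool or platform actually provides may be named differently, and it is often missing.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Post:
    text: str
    lang: Optional[str]     # assumed field: language code such as "en"
    country: Optional[str]  # assumed field: country code such as "US"

def keep(post, languages, countries, keep_unknown=False):
    """Keep a post only if its language AND country are in the allowed sets.

    Posts with missing metadata are kept or dropped based on keep_unknown,
    which is the trade-off between adding noise and missing conversations.
    """
    if post.lang is None or post.country is None:
        return keep_unknown
    return post.lang in languages and post.country in countries

# Made-up sample data, for illustration only
posts = [
    Post("Great flight today", "en", "US"),
    Post("Vuelo retrasado otra vez", "es", "US"),
    Post("Lovely service", "en", "GB"),
    Post("Boarding now", None, None),
]

strict = [p.text for p in posts if keep(p, {"en"}, {"US"})]
broader = [p.text for p in posts if keep(p, {"en", "es"}, {"US"}, keep_unknown=True)]

print(strict)
print(broader)
```

The second call shows the “include” side of the tip: adding Spanish and keeping posts with unknown metadata brings back conversations that the strict filter would have dropped.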
Conclusion:
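The three data collection tips for social media analytics that I shared in this post are: 1) select the right social media platforms, 2) choose the right search keywords, and 3) filter by language and country.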
This is a guest blog by Jugal Shah. Jugal is pursuing an MBA with a focus on marketing from a premier university in India. He shares his views on marketing, sales and strategy via his blog and Facebook. In this post, he briefly comments on “How to measure Social Media Marketing ROI”.
Jugal Shah’s Short post on Measuring Social Media Marketing ROI:
In social media marketing, ROI is not just in monetary terms. So, for social media ROI, my focus would be on 1) how many people I have reached, 2) how many people I have engaged through online activities, and 3) becoming a conversation enabler and perception driver.
Then focus on
1) how much of the increase in revenue is due to social media reach (you can do this by tracking referral links), 2) how many leads you generated through social media, and 3) how social media efforts helped to resolve customer queries/problems and led to more customer satisfaction (remember that customer acquisition costs 10 times more than customer retention).
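One possible way to track referral links is sketched below in Python. It assumes that links shared on social media carry a `utm_source` query parameter (a common tagging convention, but an assumption here) and that each order records the landing URL it came from. The URLs, amounts and the `SOCIAL_SOURCES` set are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

# Hypothetical set of source tags that count as social media
SOCIAL_SOURCES = {"twitter", "facebook"}

# Made-up orders: (landing URL, order amount)
orders = [
    ("https://shop.example.com/item?utm_source=twitter&utm_medium=social", 40.0),
    ("https://shop.example.com/item?utm_source=newsletter", 25.0),
    ("https://shop.example.com/cart?utm_source=facebook", 60.0),
    ("https://shop.example.com/", 15.0),
]

def source_of(url):
    """Return the utm_source tag of a URL, or "(none)" if it is untagged."""
    query = parse_qs(urlparse(url).query)
    return query.get("utm_source", ["(none)"])[0].lower()

# Add up revenue per referral source
revenue_by_source = defaultdict(float)
for url, amount in orders:
    revenue_by_source[source_of(url)] += amount

social_revenue = sum(v for s, v in revenue_by_source.items() if s in SOCIAL_SOURCES)
total_revenue = sum(revenue_by_source.values())

print(f"Social media revenue: {social_revenue:.2f} of {total_revenue:.2f}")
```

This only covers the monetary side. Reach, engagement and customer satisfaction need their own measures.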
In a nutshell, it’s of utmost importance to use social media as:
a conversation enabler
a perception driver
a customer retention tool
Conclusion:
Paras: Jugal, thanks for this post. I am sure this short post will be great food for thought for readers who are interested in digital marketing analytics or analytics in general. Readers, feel free to reach out to him on his blog and/or Facebook page.
I have been working on creating dashboards for one of my projects. As part of the research, I looked at a few dashboards out there on the inter-webs. Here are three of them that I liked:
1. Social Media & Sentiment Analysis:
What I like about this Dashboard is the creative use of Data via Sentiment Analysis:
In this post, I’ll point you to a resource you can use to perform sentiment analysis using LingPipe on a Windows OS. Along with that, I’ll share a couple of issues that I ran into when I was trying to run this demo on Windows 7:
1. Error: could not find or load the main class PolarityBasic
To solve this error, you’ll need to build the files under C:\lingpipe-4.1.0\demos\tutorial\sentiment. We use Ant for this. Let’s see how to do that:
2. Building sentiment.jar using ant jar
After successfully downloading Ant on Windows and setting the ANT_HOME variable to c:\apache-ant-1.8.4, I was still getting the error that ant is not a recognized command.
3. The tutorial uses POLARITY_DIR. I didn’t use that; instead, I just entered c:\review_polarity because that’s where I unzipped the movie review dataset:
Here’s a screenshot of the command that does basic polarity analysis:
This blog post applies to Microsoft® HDInsight Preview on a Windows machine. In this post, we’ll see how you can browse HDFS (the Hadoop Distributed File System).
1. I am assuming the Hadoop services are working without issues on your machine.
2. Now, can you see the Hadoop Name Node Status icon on your desktop? Yes? Great! Open it (in a browser).
3. Here’s what you’ll see:
4. Can you see the “Browse the filesystem” link? Click on it. You’ll see:
5. I’ve used /user/data lately, so let me browse to see what’s inside this directory:
6. You can also type the location into the text box labeled Goto.
7. If you’re on the command line, you can list the root directory via the command:
hadoop fs -ls /
And if you want to browse files inside a particular directory:
2. Make sure that the cluster is up and running! To check this, I click on the “Microsoft HDInsight Dashboard” icon or open http://localhost:8085/ on my machine.
Did you get any “wait for cluster to start..” message? No? Great! Hopefully, all your services are working perfectly and you are good to go now!
3. Let’s start the Hadoop Command Line (can you see the icon on the desktop? Yes? Great! Open that!)
4. Here, the command to create a directory looks like:
hadoop fs -mkdir /user/data/input
The above command creates /user/data/input
5. Let’s verify that the input directory was created under /user/data:
hadoop fs -ls /user/data
Conclusion: In this post, we saw how to create a directory in the Hadoop file system (on Windows) and how to list files and directories using the -ls command.
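If you’d rather script these two steps than type them, here’s a minimal Python sketch that wraps the same `hadoop fs -mkdir` and `hadoop fs -ls` commands. The helper names (`hadoop_fs`, `run`) are hypothetical, and the sketch assumes the `hadoop` command is on your PATH; if it isn’t, it only prints what it would have run.

```python
import shutil
import subprocess

def hadoop_fs(*args):
    """Build the argument list for a `hadoop fs` call (nothing is executed)."""
    return ["hadoop", "fs", *args]

def run(cmd):
    """Run the command if `hadoop` is on PATH; otherwise just report it."""
    executable = shutil.which(cmd[0])
    if executable is None:
        print("hadoop not found; would run:", " ".join(cmd))
        return None
    # Use the resolved path so the wrapper script is found on Windows too
    return subprocess.run([executable, *cmd[1:]], capture_output=True, text=True)

# The same two commands used in this post
mkdir_cmd = hadoop_fs("-mkdir", "/user/data/input")
ls_cmd = hadoop_fs("-ls", "/user/data")

if __name__ == "__main__":
    run(mkdir_cmd)
    listing = run(ls_cmd)
    if listing is not None:
        print(listing.stdout)
```

I’d treat this as a convenience wrapper only. The Hadoop Command Line shown above remains the simplest way to run one-off commands.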
Neologism means the coining or use of new words, and I believe it’s one of the challenges faced by IT professionals. Nowadays, we put our time and energy into trying to get our heads around new terms, words and trends.
Let’s take a couple of examples:
Some time back, we had cloud computing. Nowadays, it’s Big Data. In my mind, Big Data has been coined to mean the following technologies/techniques under different contexts:
Note: The above image is just for illustration purposes. It does not comprehensively cover every technology that is now called “Big Data”. Feel free to point it out if you think I missed something important.
And neologism is a challenge because:
1) Generally, it’s a new trend and there is little to no consensus on what it “exactly” means.
2) It means different things in different contexts.
3) Every person can have their own “interpretation” and no one is wrong.
4) It’s a moving target. The definition used today will change in the future. So we always need a “working” definition for these terms.
Now, don’t get me wrong: it’s fun trying to figure out what it all means and trying to gauge whether it matters to me and my organization or not! What do you think? As a person in information technology, do you think that neologism is one of the challenges we face? Consider leaving a reply in the comment section!