[VIDEO] Microsoft’s vision for “Advanced analytics” (presented at #sqlpass summit 2015)


Presented at #sqlpass summit 2015.

#sqlpass webinar: “Data Analytics Explained for Business Leaders” on 1/15


A quick blog post to let you know about a #sqlpass webinar on 1/15.

Data Analytics Explained for Business Leaders

Thu, Jan 15 2015 12:00 (UTC-05:00) Eastern Time (US & Canada)

RSVP: http://bit.ly/PASSBAVC011515


Description: The world is becoming more efficient. Today, seventy percent of the companies that graced the Fortune 1000 list a mere decade ago have vanished. Agility and survival are a function of innovation, culture, and the ability to predict the future. To that end, data analytics offers a lifeline, a means of survival that will drive productivity and continue to disrupt and redefine business.

However, the resources available to today’s business leaders sit at two vastly different ends of the spectrum: on the one hand, highly technical academic resources, and on the other, largely fluffy overviews of value propositions and potential. The state of the industry shouldn’t be surprising. The same dynamics played out in the early years of the internet. Software providers, technical leaders, and consulting firms greatly benefit from mystifying the world of data analytics into something that is incomprehensible. That lack of conceptual understanding is incredibly risky and propels the cost of analytics initiatives upwards.

This webcast aims to bridge that gap between technical data scientists and business leaders. Ultimately, this understanding will help to:

– Connect the strategic goals of business leaders with the capabilities of technical advisers
– Focus investments and initiatives within analytics and technology
– Distill immensely complex subject matter into comprehensible examples
– Accelerate the path to value and increase the ROI of analytics initiatives

Speaker Bio

Alex is a Predictive Analytics Architect in the Oil and Gas industry with a passion for distilling complexity into insights and evangelizing data science. His work has been featured on KDNuggets and he was recognized by DataScienceCentral as a top 180 blogger in 2014.

RSVP: http://bit.ly/PASSBAVC011515

I hope to see you there!

Examples to help you differentiate between Business Intelligence and Data Science problems:


In this post, I’ll list a few examples from various industries to help you differentiate between business intelligence and data science problems.

Some time back, I blogged about the “Business Analytics Continuum”, and in that post we saw that every organization has data, but organizations use their business data at different levels depending on their maturity. Excel (or another transactional reporting tool) is usually the starting point for any organization: it helps them see WHAT happened. They then advance to the next stage, where they gain the ability to slice and dice their data to find out WHY, and this capability is usually delivered using Business Intelligence tools and techniques. Once the data culture spreads, thanks to a successful Business Intelligence project, they soon start to outgrow their business intelligence capabilities by asking questions that need predictive capabilities. This is the advanced analytics and Data Science stage. To that end, here are 5 examples to help you differentiate between business intelligence and data science problems:

Bike Rentals
  Business Intelligence (WHAT & WHY):
    1. How many bikes did we rent in Q3 2014? How does that compare to Q3 2013?
    2. What is the trend of total bike rentals at week level? Can you break it down by geography?
  Data Science & advanced analytics:
    Can you predict bike rentals on an hourly basis?

Credit Risk
  Business Intelligence (WHAT & WHY):
    1. How many customers have a credit risk of ‘C’?
    2. Can you rank customers with a credit risk of ‘C’ by their payments due amount?
  Data Science & advanced analytics:
    Can you predict the credit risk of the customer during the contract negotiations stage?

Customer relationship management
  Business Intelligence (WHAT & WHY):
    1. How many account cancellations occurred this year (broken down by month and customer segmentation)?
    2. How does the percentage of account cancellations this year compare to that of the previous year?
  Data Science & advanced analytics:
    Can you predict customer churn?

Flight Delays
  Business Intelligence (WHAT & WHY):
    1. What is the trend in the % of flights delayed this year?
    2. Can you break down flight delays this year by their reasons?
  Data Science & advanced analytics:
    Can you predict whether a scheduled flight will be delayed by more than 15 minutes?

Customer feedback
  Business Intelligence (WHAT & WHY):
    1. What is the customer satisfaction % trend this year?
    2. What is the customer satisfaction % broken down by customer segments and product segments?
  Data Science & advanced analytics:
    Can you classify a customer feedback comment into “positive”, “negative” or “neutral”?
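
To make the contrast concrete, here is a rough Python sketch based on the bike rentals example. The records are made-up toy numbers, and the function names (total_rentals, fit_hourly_baseline) are hypothetical names chosen for this illustration. The BI question is answered by aggregating history; the data science question needs something learned from history that can produce a prediction. The "model" here is only a per-hour average, a deliberately simple baseline rather than a real forecasting method.

```python
from collections import defaultdict
from statistics import mean

# Made-up toy records for illustration: (year, quarter, hour_of_day, bikes_rented)
rentals = [
    (2013, 3, 8, 40),
    (2013, 3, 17, 60),
    (2014, 3, 8, 50),
    (2014, 3, 17, 70),
]


def total_rentals(records, year, quarter):
    """BI-style question (WHAT happened?): aggregate historical data."""
    return sum(n for y, q, _, n in records if y == year and q == quarter)


def fit_hourly_baseline(records):
    """Data-science-style question: learn from history to predict rentals.

    Simple baseline only: the average number of rentals per hour of day.
    """
    by_hour = defaultdict(list)
    for _, _, hour, n in records:
        by_hour[hour].append(n)
    return {hour: mean(values) for hour, values in by_hour.items()}


q3_2014 = total_rentals(rentals, 2014, 3)   # BI: how many bikes in Q3 2014?
q3_2013 = total_rentals(rentals, 2013, 3)   # BI: how does that compare to Q3 2013?
model = fit_hourly_baseline(rentals)        # Data science: learn from history
predicted_8am = model[8]                    # predicted rentals for the 8:00 hour
```

In practice the predictive side would use more features (weather, day of week, holidays) and a proper model, but the shape of the two questions stays the same.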

I hope this helps!

PASS Business Analytics VC: Insider’s Introduction to Microsoft Azure Machine Learning (#AzureML). #sqlpass


RSVP: http://bit.ly/PASSBAVC091814

Session Abstract:
Microsoft has introduced a new technology for developing analytics applications in the cloud. The presenter has an insider’s perspective, having actively provided feedback to the Microsoft team which has been developing this technology over the past 2 years. This session will 1) provide an introduction to the Azure technology including licensing, 2) provide demos of using R version 3 with AzureML, and 3) provide best practices for developing applications with Azure Machine Learning.
Speaker BIO:
Mark is a consultant who provides enterprise data science analytics advice and solutions. He uses Microsoft Azure Machine Learning, Microsoft SQL Server Data Mining, SAS, SPSS, R, and Hadoop (among other tools). He works with Microsoft Business Intelligence (SSAS, SSIS, SSRS, SharePoint, Power BI, .NET). He is a SQL Server MVP and has a research doctorate (PhD) from Georgia Tech.

RSVP: http://bit.ly/PASSBAVC091814

Hope to see you there!

Paras Doshi
Business Analytics Virtual Chapter’s Co-Leader


Back to basics: Multi Class Classification vs Two class classification.


Classification algorithms are commonly used to build predictive models. Here’s what they do (simplified!):

[Image: introduction to machine learning predictive algorithms]

Now, here’s the difference between Multi Class and Two Class:

If your test data needs to be classified into two classes, then you use a two-class classification model. For example:


1. Is it going to Rain today? YES or NO

2. Will the buyer renew his soon-to-expire subscription? YES or NO

3. What is the sentiment of this text? Positive OR Negative

As you can see from the examples above, the test data needs to be classified into two classes.

Now, look at example #3: what is the sentiment of this text? What if you also want an additional class called “neutral”? Now there are three classes, and we’ll need to use a multi-class classification model. So, if your test data needs to be classified into more than two classes, then you use a multi-class classification model. For example:


1. Sentiment analysis of customer reviews? Positive, Negative, Neutral

2. What is the weather prediction for today? Sunny, Cloudy, Rainy, Snow

I hope the examples helped. Next time you have to choose between multi-class and two-class classification models, ask yourself: does the problem ask you to predict two classes, or more than two? Based on the answer, pick your model.
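
The same question can be asked of your labeled data directly. Below is a small Python sketch that counts the distinct classes in a list of labels and reports which kind of model the problem calls for. The function name classification_type and the sample labels are hypothetical, made up for this illustration.

```python
def classification_type(labels):
    """Return "two-class" or "multi-class" based on the distinct labels seen."""
    classes = set(labels)
    if len(classes) < 2:
        raise ValueError("need at least two distinct classes to classify")
    return "two-class" if len(classes) == 2 else "multi-class"


# Will it rain today? YES or NO
rain_labels = ["YES", "NO", "NO", "YES"]

# Sentiment with an extra "Neutral" class
sentiment_labels = ["Positive", "Negative", "Neutral", "Positive"]

rain_kind = classification_type(rain_labels)            # "two-class"
sentiment_kind = classification_type(sentiment_labels)  # "multi-class"
```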

Example: Azure Machine Learning (AzureML) studio’s classifier list:

[Image: Azure Machine Learning classifiers list]

I hope this helps!

Resource: Introduction to Data Science by Prof Bill Howe, UW


The Introduction to Data Science course taught by Bill Howe just started on the Coursera platform. Having studied the Data Intensive Computing in the Cloud course at UW taught by Prof Bill Howe, I can say that this course should be a great resource too!

Check it out: https://www.coursera.org/course/datasci


What’s “Naive” about Naive Bayes Machine Learning Algorithm?


In this post, I’ll explain why the “Naive Bayes” machine learning algorithm has the word “Naive” in its name.

So here is the short answer:

It “assumes” that the features are independent. (In other words: There’s no relation between the features that are used while building the model)

Let’s go a little deeper:

First up, a few basic pointers.

> It’s a machine learning algorithm used for classification

> It’s based on Bayesian Statistics.

> You can read about it here: http://en.wikipedia.org/wiki/Naive_Bayes_classifier

Now, what does it mean to say that it is Naive because it assumes that the features are independent?

Let’s take an example:

Suppose you are building a “credit card approval” model based on Income and CreditScore.

(SideNote: For those who do not know what a credit score is, here you go: http://en.wikipedia.org/wiki/Credit_score_in_the_United_States)

And you have the following columns in the training data (Note: In machine learning, think of these columns as features):

Income   CreditScore   Approved
High     High          Yes
High     Medium        Yes
Low      High          Yes
Low      Low           No

Here the features are Income & CreditScore and the target of the classification model is Approved.

In the real world, there’s some relation between “income” and “creditscore”. Agree? Great! But Naive Bayes doesn’t think so. Let me reiterate the point of this blog post and see if it makes more sense now: it assumes that the features are “independent” (more precisely, independent of each other once you know the class), and that’s why it is Naive!
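
To see where the assumption enters the arithmetic, here is a minimal Python sketch that scores an applicant using the four training rows from the table above. The function name naive_bayes_scores is hypothetical, and the sketch uses raw counts with no smoothing, so it is an illustration of the idea rather than a production classifier. The “naive” part is the single line that multiplies the per-feature probabilities together as if Income and CreditScore had nothing to do with each other.

```python
from fractions import Fraction

# Training rows from the table above: (Income, CreditScore, Approved)
rows = [
    ("High", "High", "Yes"),
    ("High", "Medium", "Yes"),
    ("Low", "High", "Yes"),
    ("Low", "Low", "No"),
]


def naive_bayes_scores(rows, income, credit_score):
    """Return an unnormalized score per class: P(class) * P(income|class) * P(score|class)."""
    scores = {}
    for label in sorted({r[2] for r in rows}):
        in_class = [r for r in rows if r[2] == label]
        prior = Fraction(len(in_class), len(rows))
        p_income = Fraction(sum(r[0] == income for r in in_class), len(in_class))
        p_score = Fraction(sum(r[1] == credit_score for r in in_class), len(in_class))
        # The "naive" step: multiply the per-feature probabilities
        # as if the features were independent given the class.
        scores[label] = prior * p_income * p_score
    return scores


scores = naive_bayes_scores(rows, "High", "High")
best = max(scores, key=scores.get)  # class with the highest score
```

For an applicant with High income and a High credit score, the “Yes” score is 3/4 × 2/3 × 2/3 = 1/3 and the “No” score is 0, so the sketch picks “Yes”. With only four rows, many combinations score 0 for every class, which is why real implementations add smoothing.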

I hope this helps. Your comments are very welcome!