Machine Learning Algorithm Cheat Sheet:

Image

If you’re getting started with Data Science & Machine Learning then I think this would be a great resource for you. This “cheat sheet” helps you select the “algorithm” to test depending on the problem you are trying to solve and the data-set that you have.

Download link: http://aka.ms/MLCheatSheet (Courtesy: Azure Machine Learning)

Also, even though the cheat sheet was created to help you with “Azure Machine learning” product, it’s still valid if you use other machine learning tools.

Azure Machine Learning Algorithm Cheat Sheet

 

Doing Data Science at Twitter — A great read!

Link
Doing Data science at Twitter

Doing Data science at Twitter

Why is “Doing Data Science at Twitter” a great read?

This is an insider’s perspective from someone who is working at a company that I classify as having the highest level of analytics maturity — In other words, Twitter is known to apply knowledge gained from data science into their products and business processes.

It’s also important to recognize that every company is different and the analytics/data-science tools/techniques/processes that would be implemented would also vary based on the analytics maturity — I love that this was one of the key insights shared in this article.

Also, the article talks about two types of data scientists…I thought it was great way to classify them because there’s a lot of confusion in the industry around what a Data scientist does. With that, Here’s the URL:

My two-year journey as a data scientist at twitter

Best,
Paras Doshi

PS: If you like articles like this, don’t forget to sign up for the newsletter!

News from PASS Summit’14 for Business Analytics Professionals: #sqlpass #summit14

Standard

This post is a quick summary for all Business Analytics related updates that I saw at PASS Summit’14:

1. Theme of the Keynote(s)/Session(s) seemed to be around educating the community about the benefits of the NEW(er) tools. I saw demos/material for cloud-based tools like SQL databases, Azure stream analytics, Azure DocumentDB, AzureHDInsight & Azure Machine learning. The core message was pretty clear: A data professional does two things – 1) Guards data OR 2) helps to generate Insights from Data – And they will need to keep up-to-date on the new tools to future-proof their career.

Read more about this here: http://blogs.technet.com/b/dataplatforminsider/archive/2014/11/05/microsoft-announces-major-update-to-azure-sql-database-adds-free-tier-to-azure-machine-learning.aspx

2. Coming soon: Power BI will be able to connect to on-premise SSAS data sources (multi-dim & tabular).

3. Coming soon: A better experience to create Power BI dashboards.

Read more about Power BI updates here: http://www.jenunderwood.com/2014/11/05/pass-summit-2014-bi-news/

4. Azure Machine Learning adds a free-tier! You won’t need a credit-card/subscription to sign up for this.

5. I also saw sessions proposing new way of thinking about an architecture for “Self Service BI” and “Big Data” which might be worth following because since these are newer tools, it’s definitely worth considering an architecture that’s designed to make the most of the investments in these new tools. That’s it & I’ll leave you with a quote from James Phillips from Day 1’s keynote:

How does Internet of Things (#IoT) impact data professionals?

Standard

Internet enabled computers to be connected with each other.

Internet enabled Mobile Devices to be connected with each other.

Now, Internet will be used to enable physical things to be connected with each other. This is what is called “Internet of things” (IoT).

So what happens?

since more devices are connected with internet – we will able to generate more data! This is usually good if there’s a business vision around how to make sense of data to increase efficiency of all these things.

Here’s a nice case study from Microsoft (focus on the business case – the things in this case is “elevator” to drive reliability)

 

This is all good news for data professionals! There will be increased demand for professionals who can help businesses make sense of data generated via IoT.

Also beware of the “hype” around this technology. It’s important to take incremental steps to achieve the vision – Instead of trying to analyze data from ALL devices in your organization, start with one physical thing that matter the most for your organization or start with data that you have and take incremental steps to spread data culture in your organization!

Now that Big Data has become a mainstream word in IT and business, we have a new buzzword to learn/talk about IoT – but remember it’s all about making sense of data and your skills would be more valuable than ever!

SQL Server 2014!

Standard

SQL Server 2014 was Announced in Teched’s Keynote!

SQL Server 2014 Teched KeynoteSo while the focus of the SQL server 2012 was around in-memory OLAP, the focus of this new release seems to be In-memory OLTP (Along with Cloud & Big Data)

Here’s the Blog Post: http://blogs.technet.com/b/dataplatforminsider/archive/2013/06/03/sql-server-2014-unlocking-real-time-insights.aspx  (Also Thanks for the Picture!)

 

 

Resource: Introduction to Data Science by Prof Bill Howe, UW

Standard

Introduction to Data Science course taught by Bill Howe just started on coursera platform. Having studied the Data Intensive Computing in Cloud course at UW taught by Prof Bill Howe, I can say that this course would be great resource too!

Check it out: https://www.coursera.org/course/datasci

Introduction to Data Science

Resource: A great tutorial for Hadoop on local windows and Azure.

Standard

Here’s the resource: http://gettingstarted.hadooponazure.com/gettingStarted.html > “HDInsight Jumpstart”

The Tutorial will teach you how to analyze log files using Hadoop Tools like MapReduce, Hive, SQooP – check it out! It works with both HDInsight for local windows as well as Hadoop on Azure:

HDInsight hadoop on windows starting guide tutorial

Conclusion:

I hope this resource helps you get started on building an end-to-end solution with Hadoop on Windows/Azure.