SQL Server 2014!


SQL Server 2014 was Announced in Teched’s Keynote!

SQL Server 2014 Teched KeynoteSo while the focus of the SQL server 2012 was around in-memory OLAP, the focus of this new release seems to be In-memory OLTP (Along with Cloud & Big Data)

Here’s the Blog Post: http://blogs.technet.com/b/dataplatforminsider/archive/2013/06/03/sql-server-2014-unlocking-real-time-insights.aspx  (Also Thanks for the Picture!)



Event Recap: SQL Saturday 185 Trinidad!


I was selected to a be a speaker at SQL Saturday Trinidad! And it was amazing because not only did I get a chance to interact with the wonderful people who are part of SQL Server community there but also visited some beautiful places on this Caribbean island!

I visited Trinidad in January, just before their carnival season! And even though, people were busy preparing for carnival season, it was great to see them attend an entire day of SQL Server Training:

SQL Saturday 185 trinidad attendees

And here’s me presenting on “Why Big Data Matters”:

(Thanks Niko for the photo!)

paras presenting on big data

And after the event, I also got a chance to experience the beauty of this Caribbean island!

view trinidad island port of spain

port of spain sql saturday

Thank you SQL Saturday 185 Team for a memorable time!

Presentation Slides: The slides had been posted for the attendees prior to my presentation and if you want you can view them here:


Download PPT: Why Big Data Matters?


Download Link Here:

SQL Saturday 185 (Trinidad): Why Big Data Matters? by Paras Doshi

(if you need the .ppt version of this talk, please contact me via http://parasdoshi.com/contact/)


inner workings of HDFS and MapReduce in a nutshell:


HDFS and MapReduce inner workings in a nutshell.

HDFS MapReduce inner workings

Click on the image to view larger sized image


How to load some data to Hadoop on Windows to get started?


In this post, I want to point out that HDInsight (Hadoop on Windows) comes with a sample datasets (log files) that you can load using the command:

1. Hadoop command Line > Navigate to c:HadoopGettingStarted

2. Execute the following command:

powershell -ExecutionPolicy unrestricted –F importdata.ps1 w3c

import data to hadoop on windows file system

After you have successfully executed the command, you can sample files in /w3c/input folder:

w3c log files iis hadoop on windows

Conclusion: In this post, we saw how to load some data to Hadoop on Windows file system to get started. Your comments are very welcome.

Official Resource: http://gettingstarted.hadooponazure.com/loadingData.html

Hadoop on Windows: How to Browse the Hadoop Filesystem?


This Blog post applies to Microsoft® HDInsight Preview for a windows machine. In this Blog Post, we’ll see how you can browse the HDFS (Hadoop Filesystem)?

1. I am assuming Hadoop Services are working without issues on your machine.

2. Now, Can you see the Hadoop Name Node Status Icon on your desktop? Yes? Great! Open it (via Browser)

3. Here’s what you’ll see:

Hadoop File System Browse

4. Can you see the “Browse the filesystem” link? click on it. You’ll see:

hadoop file system name node status windows

5. I’ve used the /user/data lately, so Let me browse to see what’s inside this directory:

user data hadoop sqoop hive mapreduce

6. You can also type in the location in the check box that says Goto

7. If you’re on command line, you can do so via the command:

hadoop fs -ls /

hadoop command line list all files system

And if you want to browse files inside a particular directory:

hadoop command line sqoop mapreduce hdfs file system

Official Resource:

HDFS File System Shell Guide


In this post, we saw how to browse Hadoop File system via Hadoop Command Line & Hadoop Name Node Status

Related Articles:

Visualizing MapReduce Algorithm with an Example: Finding Max Temperature


Problem Statement: Find Maximum Temperature for a city from the Input data.

Step 1) Input Files:

File 1:

New-york, 25

Seattle, 21

New-york, 28

Dallas, 35

File 2:

New-york, 20

Seattle, 21

Seattle, 22

Dallas, 23

File 3:

New-york, 31

Seattle, 33

Dallas, 30

Dallas, 19

Step 2: Map Function

Let’s say Map1, Map2 & Map3 run on File1, File2 & File3 in parallel, Here is their output:

(Note how it outputs the “Key – Value” pair. The key would be used by the reduce function later to do a “group by“)

Map 1:

Seattle, 21

New-york, 28

Dallas, 35

Map 2:

New-york, 20

Seattle, 22

Dallas, 23

Map 3:

New-york, 31

Seattle, 33

Dallas, 30

Step 3: Reduce Function

Reduce Function takes the input from Map1, Map2 & Map3, to give an output:

New-york, 31

Seattle, 33

Dallas, 35


In this post, we visualized MapReduce Programming Model with an example: Finding Max Temp. for a city.  And as you can imagine you can extend this post, to visualize:

1) Find Minimum Temperature for a city.

2) In this post, the key was City, But you could substitute it by other relevant real world entity to solve similar looking problems.

I hope this helps.

Related Articles:

Visualizing MapReduce Algorithm with WordCount Example