Microsoft announced a cloud based business intelligence platform called Power BI – as a part of that, the project (in public preview) that was previously called “Data Explorer” will be released as “Power Query”. It’s a great tool that have used to find, clean and shape data in Excel 2010, very useful! So one of the first things I checked was whether Excel 2010 can run Power Query or not. Turns out, it does! It works with Excel 2010 professional plus (Please read the system requirements on the official download page for details)
And of course, I downloaded and installed it on my Excel 2010 professional plus.If you’ve not installed Office 2010 SP1 or higher, do that too.
Please note that this change affects some of the blog posts that I’ve published on this blog, Here’s the list:
That’s about it for this post. Update your “Data Explorer” tab to “Power Query” if you haven’t already! It’s a handy tool and I am glad to see that Data Explorer Power Query runs on Excel 2010 Pro Plus!
Data Explorer let’s you “Explore” (search) for web-based public data. This is a great way to combine data that you may have in your data-sources with public data sources for data analysis purposes. Sometimes your data might not tell you the reason behind the observed trends, so when that happens – you can try to see if a public data-set might give you the much-needed context. Let me give you an Example before we start hands-on w/ data explorer so that you have better understanding of importance of public datasets. Here’s a sample that I found here. So, Here’s a demo:
An auto company is seeing sales trends of Hybrid cars and SUV’s from the sales data-sources. But what is the reason behind that? company data does not show that. Someone hypothesizes that it might be because of gas prices. So they test out the hypothesis by combining gas prices information available via public data. And turns out gas prices might be the driving force of sales trends! SEE:
if the gas prices increase, then the sale of SUV go down and the sale of Hybrids go up:
You know that public data can be helpful! So how can you search for public data-sets? Well, You can manually search online, ask someone, browse through public data repositories like azure data market (and other data markets), there’s also a public data search engine! OR you can directly search for them via Data Explorer.
Here are the steps:
1) Excel 2010/2013 > Data Explorer Tab > Online Search > type “Tallest Buildings”
2) I selected one of the data-sets that said “Tallest completed building…. ”
3) Now let’s do some filtering and shaping. Here are the requirements:
– Hide columns: Image, notes & key
– clean columns that has heights data
– Show only city name in location
OK, let’s get to this one by one!
4) Hiding Columns:
Click on Filter & Shape button from the Query Settings:
Select Image Column > Right Click > Hide:
Repeat the steps for notes & key column.
Click on DONE
5) clean column that has heights data.
Click on Filter & Shape to open the query editor
A) let’s rename it. Select column > Right Click > Rename to Height > press ENTER
B) let’s remove the values in brackets. Select Column > right click > split column > By delimiter > At each occurrence of the delimiter > Custom and enter “(” > OK
This should transform the data like this:
Hide height.2 and rename the height.1 to height
Click on DONE
6) Let’s just have city names in the location column
click on Filter & shape to load query editor:
A) select location > right click > split column > by delimiter > Custom – Enter: ° in the text box like this:
Now, that requires formatting! I don’t want to do that especially if I am doing that for few more tables – let’s see an elegant way of going about web scraping tables using Data Explorer add-in:
Step 1:
Keep the URL handy.
Now, Open Excel 2010/2013 > switch to Data Explorer tab > click on From Web
Step 2:
Paste the URL that has the tables you need:
Step 3:
The dialog box would list all the tables from that HTML page and so you’ll need to select the table that you want.
(optional) if your tables have headers as first rows. Make sure to mark them as headers: Right Click a Column > use First Row as headers
Step 4:
Click DONE and your excel sheet will populate itself w/ the data from the table.
That’s about it for the steps!
Notes:
1) Data Explorer add-in will let you “explore” external open datasets that’s out there on the internet.
2) Please make sure that you’re not violating any copyrights before you go about web scraping and sharing your work.
Data Explorer add-in is amazing! It’s helps you: combine, find and re-shape your data in Excel 2010/2013. I’ve blogged about: 1) How to merge Table Data and 2) How to clean duplicate data and now in this blog post, I want to share a step-by-step on Unpivoting data using the Data Explorer add-in.
What is un-pivoting? I hear you ask. Instead of explaining it, let me share an Image:
BTW, the above data is from my Facebook Page Insights.
So our problem statement is (please refer to above Image): we are given table blue and we need to output table green. In other words, we need to Unpivot the data.
Solution:
Here are the steps:
1) Open Excel, Open Data Explorer add-in. And Connect to your data. Wait when you see the Query Editor.
2) (Optional) In the Query Editor, Rename the query. I renamed it to “Unpivot Data”. And this how my query editor looks:
3) Now, Select the columns that need to be unpivoted > Right Click > Unpivot Column
Note that I’ve selected all columns that I want to UnPivot:
4) You’ll see the updated results in the query editor window. I renamed the columns “Attribute” to “Age and Gender” and “value” to “reach”. If you want to rename the columns, select the column > Right click > rename.
If everything looks OK, click on Done in the bottom right corner
5) There you have it, Unpivoted data in Excel 2010/2013 using Data Explorer add-in!
And then its super easy to create charts, Here’s one I created after I had unpivoted the data:
Insight: For my blog, my Target Audience seems to Male between the age of 18-24 and then 25-34.
FYI: The Date Range of the Data Set of 1st Jan 2013 – 25th Apr 2013.
That’s about it for this post, Here are some Related articles:
Merging/Joining/Combining data-sets in Excel has not been an easy task. There are third-party add-ins that makes it easy but out of the box, excel didn’t have an easy way to merge/join table data. But now with the Data Explorer add-in, we have an add-in that let’s us merge/join data in excel w/ few clicks.
Input is Table 1 & Table 2. The output we need is merged Table.
Table 1:
Date
Daily New number
Month
1/1/2012
0
1
1/2/2012
0
1
1/3/2012
0
1
1/4/2012
0
1
1/5/2012
0
1
1/6/2012
0
1
1/7/2012
0
1
1/8/2012
0
1
1/9/2012
0
1
………………………………..
Table 2:
Month
Month Name
1
January
2
February
3
March
4
April
5
May
6
June
7
July
Merged Table:
Date
Daily New number
Month
Month Name
1/1/2012
0
1
January
1/2/2012
0
1
January
1/3/2012
0
1
January
1/4/2012
0
1
January
1/5/2012
0
1
January
1/6/2012
0
1
January
1/7/2012
0
1
January
1/8/2012
0
1
January
1/9/2012
0
1
January
………………………………
Solution:
Let’s see how data explorer can help us Join/Merge Table 1 & Table 2.
1) create query that connects to Table 1 & Table 2.
2) Once you have queries that connect to the tables need to be merged, then click on Merge
3) Once you click on Merge, you’ll see a dialog:
Here you need to configure three things:
a) First Table
b) Second Table
c) Columns that will be used to merge/join data
In this case, this is how my merge dialog looks:
4) Once configured correctly, click on OK. You’ll see a dialog box where you can configure the output of the merged table. click on the new column to see the options that are available to you to configure the output of the merged table:
5) In this case, I’ve selected just one column month name that needs to be merged. You can also explore the aggregate tab in case you’ve numbers that needs merging.
6) This is how the output looks:
7) Rename the new column.
Select the new column > Right Click > Rename
8) Click Done if it looks OK.
9) The merged data is now available to you in Excel!
And one can analyze it!
Let’s see before and after. Note that instead of month numbers, we now have month names
Conclusion:
In this post, we saw how to merge/join/combine data from two different sources in Excel 2010.
In this blog post, we’ll see how you can remove duplicated and clean data in excel tables using Data Explorer Add-in.
Problem:
Our Excel Table has following Data:
Month
Month Name
1
January
1
January
1
January
2
February
2
February
3
March
And we want to remove duplicates to make the data-set look like this:
Month
Month Name
1
January
2
February
3
March
In real world data-sets, we wouldn’t have few rows but lot’s of rows and doing it manually wouldn’t be the wisest option. With that in mind, let’s look for a few-clicks solution that can help us remove duplicates.
Solution:
If you haven’t already, download the Data Explorer add-in preview available for Excel 2010 & 2013. It can do a lot more than removing duplicates – it’s a great add-in and it’ll save you lots of time especially if your job involves discovering, cleaning and combining data for analysis purposes. After you’re done installing the add-in, use the steps below to remove duplicates in an excel column:
1. Open Data in Excel. Switch to Data Explorer Tab
2. For the purpose of the demo, I am assuming that you already have the data in excel file. If not, you can connect to other sources via the add-in.
3. Data Explorer add-in > Excel Data> From Table
4. After you’ve clicked on the From Table, a query editor will pop up:
5. Select both columns
(you can select both columns by: select first column > hold down the ctrl key and then click on second column)
6. Right click > Remove Duplicates
7. click on done if you see that the duplicates have been removed correctly
Conclusion:
In this blog post, we saw how to remove duplicates and clean data in Excel using the Data Explorer Preview add-in.
In this post, I’ll summarize the PASS Business Analytics Conference’s Keynote Day #1:
The structure of the Keynote:
One of the NEW challenges that Data Pros face today is complexity involved in building a BI solution. Following slides nicely represent the challenge from the Tools standpoint: