In this blog post, we will visualize how the MapReduce algorithm performs a word count on a text input.
First of all, for the programmers out there, here is the code (JavaScript):
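[sourcecode language="javascript"]
// Map: split the input text into words and emit (word, 1) for each word.
var map = function (key, value, context) {
    // Splitting on every non-letter character can produce empty strings.
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

// Reduce: sum all the counts that were emitted for a single word.
var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};
[/sourcecode]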
Courtesy: Microsoft Hadoop on Azure Samples
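To see why the map function checks for empty strings, it helps to look at what the split produces on its own. The snippet below is a minimal sketch in plain JavaScript; `tokenize` is a hypothetical helper written for this illustration, not part of the original sample, and it only mirrors the split, filter and lowercase steps of the map function.

```javascript
// Hypothetical helper (not part of the original sample): tokenize a line
// the same way the map function above does.
function tokenize(line) {
    return line
        .split(/[^a-zA-Z]/)                               // break on any non-letter
        .filter(function (word) { return word !== ""; })  // drop empty pieces
        .map(function (word) { return word.toLowerCase(); });
}

// Two adjacent separators (", ") and a trailing "!" each leave an empty string.
var raw = "Hello, world!".split(/[^a-zA-Z]/);
console.log(JSON.stringify(raw));                       // ["Hello","","world",""]
console.log(JSON.stringify(tokenize("Hello, world!"))); // ["hello","world"]
```

Without the empty-string check, each of those empty pieces would be emitted as a "word" with a count of 1.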
Now, let’s visualize this using an example.
Suppose the text is “Hadoop on Azure sample Hadoop is on Windows Azure Hadoop is on Windows server”. The table below shows what happens to this input as it is processed first by the map function and then by the reduce function:
| INPUT | MAP key | MAP value | REDUCE key | REDUCE value |
| --- | --- | --- | --- | --- |
| Hadoop on Azure sample Hadoop is on Windows Azure Hadoop is on Windows server | hadoop | 1 | hadoop | 3 |
| | on | 1 | on | 3 |
| | azure | 1 | azure | 2 |
| | sample | 1 | sample | 1 |
| | hadoop | 1 | is | 2 |
| | is | 1 | windows | 2 |
| | on | 1 | server | 1 |
| | windows | 1 | | |
| | azure | 1 | | |
| | hadoop | 1 | | |
| | is | 1 | | |
| | on | 1 | | |
| | windows | 1 | | |
| | server | 1 | | |
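The same walk-through can be reproduced locally without a Hadoop cluster. The following is a rough sketch, not the way Hadoop on Azure actually runs the job: `makeIterator`, `runWordCount`, and the small `context` objects are hypothetical stand-ins invented here for what the framework would normally supply, and the grouping step in the middle is a simplified imitation of the shuffle that happens between map and reduce.

```javascript
// The map and reduce functions from the post.
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};
var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

// Hypothetical stand-in for the iterator the framework passes to reduce.
function makeIterator(items) {
    var index = 0;
    return {
        hasNext: function () { return index < items.length; },
        next: function () { return items[index++]; }
    };
}

// Hypothetical local driver: map, then group by key, then reduce.
function runWordCount(text) {
    // Map phase: collect every (word, 1) pair that map emits.
    var pairs = [];
    map(null, text, { write: function (k, v) { pairs.push([k, v]); } });

    // Shuffle phase (simplified): group the emitted values by word.
    var groups = Object.create(null);
    pairs.forEach(function (pair) {
        if (!(pair[0] in groups)) { groups[pair[0]] = []; }
        groups[pair[0]].push(pair[1]);
    });

    // Reduce phase: call reduce once per word, in sorted key order.
    var counts = {};
    Object.keys(groups).sort().forEach(function (word) {
        reduce(word, makeIterator(groups[word]), {
            write: function (k, v) { counts[k] = v; }
        });
    });
    return counts;
}

var counts = runWordCount(
    "Hadoop on Azure sample Hadoop is on Windows Azure Hadoop is on Windows server"
);
console.log(JSON.stringify(counts));
// {"azure":2,"hadoop":3,"is":2,"on":3,"sample":1,"server":1,"windows":2}
```

The keys come out in lowercase because the map function calls `toLowerCase()` before emitting each word, and they appear alphabetically here only because this sketch sorts them before reducing.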
Conclusion:
In this blog post, we visualized how the MapReduce algorithm operates on a word count example.