Extreme Computing (Hadoop Map Reduce) 4

This post continues the discussion of the second assignment of Extreme Computing at the University of Edinburgh. Task 4: For this task you will use a dataset from StackOverflow and extract specific pieces of information. First, you should understand the format of the dataset; next, you will need to parse each post, and…

Extreme Computing (Hadoop Map Reduce) 3

This post covers the second assignment of Extreme Computing at the University of Edinburgh. Extreme Computing, second assignment. Based on an assignment by Michail Basios and Stratis Viglas. In this assignment, you will address real-world cases where MapReduce can be used. Initially, you will deal with a problem related to information retrieval: you will…

Extreme Computing (Hadoop Map Reduce) 2

In this part, we continue with the remaining problems from the Extreme Computing coursework at the University of Edinburgh. For the full introduction, please go to Part 1. Task 5: Create a version of the two-word counting program that uses a combiner. Is it faster? Mapper (Python)…

Extreme Computing (Hadoop Map Reduce) 1

Our networks generate more and more data as time goes by, and computer scientists are trying to understand the patterns underlying all of that raw data. Hadoop can be used to clean data and send the results to a pipeline, to do SQL-style manipulations, or to perform many more advanced tasks on different types of data. Today…