Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat had written a very interesting paper highlighting their research in Google for Simplified Data Processing on Large Clusters. Their implementation called MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. ...