We use cookies to ensure you have the best browsing experience on our website. Please read our cookie policy for more information about how we use cookies.
- Prepare
- Distributed Systems
- MapReduce Basics
- Relational MapReduce Patterns #1 - Intersections
- Discussions
Relational MapReduce Patterns #1 - Intersections
Relational MapReduce Patterns #1 - Intersections
Sort by
recency
|
8 Discussions
|
Please Login in order to post a comment
import json import sys from collections import OrderedDict class MapReduce: def init(self): self.intermediate = OrderedDict() self.result = [] self.r=0 def emitIntermediate(self, key, value): self.intermediate.setdefault(key, []) self.intermediate[key].append(value)
mapReducer = MapReduce()
def mapper(record): # print record key = record["key"] value = record["value"] # ADD THE REQUIRED LOGIC BELOW # You may need to add some lines for the mapper logic to work # At the end, you need to complete the emit intermediate step if key!=1: if key<=mapReducer.r: mapReducer.emitIntermediate(value,'r' ) else: mapReducer.emitIntermediate(value,'q' ) else: mapReducer.r=int(value.split()[0])
def reducer(key, list_of_values): # ADD THE REQUIRED LOGIC BELOW # You may need to add some lines for the reducer logic to work # At the end, you need to complete the emit step if 'r' in list_of_values and 'q' in list_of_values: mapReducer.emit(key)
if name == 'main': inputData = [] counter = 0 for line in sys.stdin: counter += 1 inputData.append(json.dumps({"key":counter,"value":line})) mapReducer.execute(inputData, mapper, reducer) mapReducer.execute(inputData, mapper, reducer)
Please mention the significance of LinkedHashMap in the template of java, why do we need the insertion order maintained in the inputdata json
Although R and S are mentioned as set, it would still be better to explicitly state that the numbers for each of the collections won't be repeated. Also, line 176: "intermediate.put(key,temp);" (in Java code) is confusing, please remove it from the template. Rest the code is elegant, beautiful and self-explainatory to say the least. Kudos to team Hackerrank for introducing such a challenge. :)
Python2 template is broken: the first line of the input should be interpreted before the mapreduce (as in the other templates), so that the mapper receives the set name as the key instead of the line number..
Think carefully about the keys used for the map; you can easily create intermediate values which already indicate the sets each integer appears in - no need to do this in the reduce.