I have to implement a graph algorithm using MapReduce, which requires chaining jobs:
MAP1 -> REDUCE1 -> MAP2 -> REDUCE2 -> ...
In MAP1 I read the adjacency matrix from a file and create a user-defined Java class, Node, that holds the node's data and its child information. I want to pass this information on to MAP2.
But in REDUCE1, when I write

context.write(node, NullWritable.get());

the node data is saved to a file as text, using the toString() of the Node class.
When MAP2 tries to read this Node information,

public void map(LongWritable key, Node node, Context context) throws IOException, InterruptedException

it says that it cannot convert the text in the file to Node.
I am not sure what the right approach is for chaining jobs like this in MapReduce.

REDUCE1 writes the Node in this format:

Node [nodeId=1, adjacentNodes=[Node [nodeId=2, adjacentNodes=[]], Node [nodeId=2, adjacentNodes=[]]]]

Actual exception:

java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to custom.node.nauty.Node

You have to define your own class (Node) that implements WritableComparable (Comparable because it is the key). Then, set the outputKeyClass to be Node.class, instead of Text.class. - vefthym
I have done that. Your approach is appropriate when sending data from Map to Reduce. However, it seems the Reduce can only write to a file, and when chaining, the 2nd Map only reads from the file that the 1st Reduce has written. The 1st Reduce cannot write a serialized Node to a file. - Dip
How do you declare the reducer class and the reduce method? Why do you use the node as a key in reducer1 and as a value in map2? You should use SequenceFileInputFormat in mapper2 and SequenceFileOutputFormat in reducer1, and not TextInputFormat and TextOutputFormat, respectively. - vefthym

1 Answer

Based on the comments, the suggested changes that will make your code work are the following:

You should use SequenceFileInputFormat in mapper2 and SequenceFileOutputFormat in reducer1, and not TextInputFormat and TextOutputFormat, respectively. TextInputFormat reads a LongWritable key and a Text value, which is why you get this error.

Accordingly, you should also change the declaration of the second mapper (MAP2) so that it accepts a Node key and a NullWritable value, matching what REDUCE1 writes.
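As a rough sketch of how the two jobs could be wired together, assuming the newer org.apache.hadoop.mapreduce API and Hadoop on the classpath: the helper name ChainWiring.wire and its parameters are hypothetical, and the Job objects are assumed to already have their mapper and reducer classes set elsewhere.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical helper, not part of Hadoop: connects the output of the first
// job to the input of the second through a binary SequenceFile.
final class ChainWiring {
    // nodeClass is assumed to be the asker's Node class (a WritableComparable).
    static void wire(Job first, Job second, Class<?> nodeClass, Path intermediate)
            throws IOException {
        // REDUCE1 emits (Node, NullWritable) pairs in binary form, not as text.
        first.setOutputKeyClass(nodeClass);
        first.setOutputValueClass(NullWritable.class);
        first.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(first, intermediate);

        // MAP2 reads the same (Node, NullWritable) pairs back.
        second.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(second, intermediate);
    }
}
```

With this wiring, the second mapper would be declared along the lines of map(Node key, NullWritable value, Context context), and the driver would wait for first.waitForCompletion(true) to return before submitting the second job.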

Make sure that the Node class implements the Writable interface (or WritableComparable if you use it as a key). Then, set the outputKeyClass of the first job to be Node.class, instead of Text.class.
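A minimal sketch of what such a Node might look like. The field layout is an assumption (neighbours are stored as integer ids rather than nested Node objects), and the class deliberately uses only java.io so it compiles without Hadoop. In a real job it would additionally declare "implements WritableComparable<Node>"; the write, readFields and compareTo methods below are written to match that interface's signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// In a Hadoop job: class Node implements WritableComparable<Node>
class Node {
    int nodeId;
    List<Integer> adjacentNodeIds = new ArrayList<>(); // assumption: ids only

    Node() {} // Hadoop instantiates Writables reflectively, so keep a no-arg constructor

    Node(int nodeId, List<Integer> adjacentNodeIds) {
        this.nodeId = nodeId;
        this.adjacentNodeIds = new ArrayList<>(adjacentNodeIds);
    }

    // Binary serialization: id, neighbour count, then each neighbour id.
    public void write(DataOutput out) throws IOException {
        out.writeInt(nodeId);
        out.writeInt(adjacentNodeIds.size());
        for (int id : adjacentNodeIds) {
            out.writeInt(id);
        }
    }

    // Must read fields in exactly the order write() emitted them.
    public void readFields(DataInput in) throws IOException {
        nodeId = in.readInt();
        int count = in.readInt();
        adjacentNodeIds = new ArrayList<>(count); // reset: instances may be reused
        for (int i = 0; i < count; i++) {
            adjacentNodeIds.add(in.readInt());
        }
    }

    public int compareTo(Node other) {
        return Integer.compare(nodeId, other.nodeId);
    }

    // Illustration only: serialize to bytes and read back into a fresh Node.
    static Node roundTrip(Node original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            Node copy = new Node();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If Node is used as a map output key, it is also worth overriding hashCode() and equals() so that equal nodes are sent to the same reducer by the default hash partitioner.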