I'm trying to read a JSON file into my Hadoop MapReduce job. How can I do this? I've put a file 'testinput.json' into /input on HDFS.
To run the MapReduce job I execute hadoop jar popularityMR2.jar popularity input output, where input is the input directory on HDFS.
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public static class PopularityMapper extends Mapper<Object, Text, Text, Text> {

    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        JSONParser jsonParser = new JSONParser();
        try {
            // This is the line that fails with "file not found":
            JSONObject jsonObject = (JSONObject) jsonParser.parse(
                    new FileReader("hdfs://input/testinput.json"));
            JSONArray jsonArray = (JSONArray) jsonObject.get("votes");

            Iterator<JSONObject> iterator = jsonArray.iterator();
            while (iterator.hasNext()) {
                JSONObject obj = iterator.next();
                String song_id_rave_id = (String) obj.get("song_ID") + ","
                        + (String) obj.get("rave_ID") + ",";
                String preference = (String) obj.get("preference");
                System.out.println(song_id_rave_id + "||" + preference);
                // Emit "song_ID,rave_ID," as the key and the preference as the value.
                context.write(new Text(song_id_rave_id), new Text(preference));
            }
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }
}
My mapper function currently looks like the above. I try to read the file from HDFS, but it always fails with a file-not-found error. Does anyone know how I can read this JSON into a JSONObject?
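From what I understand, java.io.FileReader only reads from the local filesystem, not from HDFS, which would explain the file-not-found error. Would something like the sketch below, using Hadoop's FileSystem API inside map(), be the right direction? (Untested; the extra imports and the way it plugs into my mapper are my own guesses.)

import java.io.InputStreamReader;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Inside map(), replacing the FileReader call: open the HDFS file
// through the FileSystem configured for the job.
FileSystem fs = FileSystem.get(context.getConfiguration());
try (InputStreamReader reader =
        new InputStreamReader(fs.open(new Path("/input/testinput.json")))) {
    JSONObject jsonObject = (JSONObject) jsonParser.parse(reader);
    // ...then iterate over jsonObject.get("votes") exactly as above...
} catch (ParseException e) {
    e.printStackTrace();
}

Or is the usual approach rather to point the job's input path at the JSON file and parse each record from the value passed to map(), instead of opening the file manually?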
Thanks