0
votes

I'm building an inverted index and I'm currently getting a NullPointerException in reduce when calling context.write. Can anyone spot why? I presume it's something to do with serialisation, which I've never done before. The error also happens when I print out h.

2
It would be great if you could add a stack trace to your question. - Thomas Jungblut
@ThomasJungblut I've modified the code as you suggested, except for the setClass method. Where should I call it from? - user4331904
why did you edit to remove the stacktrace and code? - vefthym
@vefthym It seemed the example I gave was irrelevant to the actual cause of the problem. I could put the relevant code back in if preferred? - user4331904

2 Answers

0
votes

Two things I can spot directly regarding the serialization, even without a stack trace:

  1. HMapValue needs a default (no-argument) constructor; Hadoop can't instantiate it during deserialization without one.
  2. In that default constructor you need to initialize the ArrayListWritable correctly (not null), and you need to call its setClass method so it can deserialize its elements correctly.
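A minimal sketch of the pattern the two points above describe. HMapValueSketch is a hypothetical stand-in for the asker's HMapValue, written against plain java.io streams so it runs without Hadoop on the classpath; the write/readFields pair mirrors the shape of Hadoop's Writable contract, and a plain List stands in for the ArrayListWritable:

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the asker's HMapValue. The key points: a default
// constructor exists, and it initializes the inner collection so that
// readFields() never dereferences null.
public class HMapValueSketch {

    private final List<Integer> postings;   // stand-in for the ArrayListWritable

    // The default constructor the framework needs: it must exist AND it must
    // initialize the inner collection before readFields() runs.
    public HMapValueSketch() {
        this.postings = new ArrayList<>();
    }

    public void add(int docId) {
        postings.add(docId);
    }

    public List<Integer> getPostings() {
        return postings;
    }

    // Analogue of Writable.write(DataOutput): length-prefixed element list.
    public void write(DataOutput out) throws IOException {
        out.writeInt(postings.size());
        for (int id : postings) {
            out.writeInt(id);
        }
    }

    // Analogue of Writable.readFields(DataInput): clear, then refill.
    public void readFields(DataInput in) throws IOException {
        postings.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            postings.add(in.readInt());
        }
    }

    // Round-trip helper: serialize, then deserialize into a fresh instance
    // created via the default constructor, which is essentially what Hadoop
    // does between map and reduce.
    public static HMapValueSketch roundTrip(HMapValueSketch value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        value.write(new DataOutputStream(bytes));
        HMapValueSketch copy = new HMapValueSketch();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        HMapValueSketch v = new HMapValueSketch();
        v.add(3);
        v.add(17);
        System.out.println(roundTrip(v).getPostings()); // [3, 17]
    }
}
```

If the default constructor were missing, or left postings as null, the deserialization step above would fail in exactly the way the answer describes.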
0
votes

Turns out it was because I had iterated over the values twice (the line int df = Iterables.size(values); tricked me). In a Hadoop reduce, the values Iterable can only be traversed once, so after the size call the iterator was already exhausted: the main block of reduce never ran, and I eventually hit a NullPointerException because I tried to access data that had never been initialised.
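The one-shot behaviour described above can be demonstrated without Hadoop. This illustration (not the asker's actual code) mimics the reduce-side Iterable with a single shared Iterator, shows how a size-style count exhausts it, and shows the usual fix of buffering the values first:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Mimics Hadoop's reduce-side values: every call to iterator() resumes the
// same underlying stream, so the data can only be walked once.
public class OneShotIterableDemo {

    static class OneShotIterable<T> implements Iterable<T> {
        private final Iterator<T> single;
        OneShotIterable(List<T> data) { this.single = data.iterator(); }
        @Override public Iterator<T> iterator() { return single; }
    }

    // Counts elements the way Guava's Iterables.size(values) would,
    // consuming the iterator as a side effect.
    static <T> int size(Iterable<T> values) {
        int n = 0;
        for (T ignored : values) n++;
        return n;
    }

    public static void main(String[] args) {
        OneShotIterable<Integer> values = new OneShotIterable<>(List.of(1, 2, 3));

        int df = size(values);         // consumes the stream: df == 3
        int secondPass = size(values); // already exhausted: 0 -- the bug

        // Fix: buffer the values into a real list first, then count and
        // iterate the buffered copy as many times as needed.
        OneShotIterable<Integer> fresh = new OneShotIterable<>(List.of(1, 2, 3));
        List<Integer> cached = new ArrayList<>();
        for (int v : fresh) cached.add(v);

        System.out.println(df + " " + secondPass + " " + cached.size());
        // prints: 3 0 3
    }
}
```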