I receive binary Avro files from a Kafka topic and I must deserialize them. In each message I receive from Kafka, I can see a schema at the start. I know it's better practice not to embed the schema and to keep it separate from the actual Avro data, but I don't control the producer and can't change that.
My code runs on top of Apache Storm. First I create a reader:
mDatumReader = new GenericDatumReader<GenericRecord>();
Later I try to deserialize the message without declaring a schema:
Decoder decoder = DecoderFactory.get().binaryDecoder(messageBytes, null);
GenericRecord payload = mDatumReader.read(null, decoder);
But then I get an error when a message arrives:
Caused by: java.lang.NullPointerException: writer cannot be null!
at org.apache.avro.io.ResolvingDecoder.resolve(ResolvingDecoder.java:77) ~[stormjar.jar:?]
at org.apache.avro.io.ResolvingDecoder.<init>(ResolvingDecoder.java:46) ~[stormjar.jar:?]
at org.apache.avro.io.DecoderFactory.resolvingDecoder(DecoderFactory.java:307) ~[stormjar.jar:?]
at org.apache.avro.generic.GenericDatumReader.getResolver(GenericDatumReader.java:122) ~[stormjar.jar:?]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:137) ~[stormjar.jar:?]
All the answers I've seen are about using other formats, changing the messages delivered to Kafka or something else. I don't have control over those things.
My question is: given a message as a byte[] with the schema embedded in the binary payload, how can I deserialize it without declaring the schema up front so that I can read it?
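To pin down what "embedded schema" means here, this is a minimal check that could be run on the payload. It is only a sketch: it assumes the producer might be writing the Avro object container file format, which starts with the ASCII bytes `Obj` followed by the byte 1. The class and method names (`AvroContainerCheck`, `hasContainerMagic`) are hypothetical, and the sample payloads are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: checks only the first four bytes of a Kafka message.
final class AvroContainerCheck {
    // Avro object container files begin with 'O', 'b', 'j', then the byte 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    static boolean hasContainerMagic(byte[] messageBytes) {
        if (messageBytes == null || messageBytes.length < MAGIC.length) {
            return false;
        }
        // Compare the leading bytes of the message with the magic header.
        return Arrays.equals(Arrays.copyOfRange(messageBytes, 0, MAGIC.length), MAGIC);
    }

    public static void main(String[] args) {
        // Made-up payloads: one starts with the container magic, one with raw JSON.
        byte[] containerLike = {'O', 'b', 'j', 1, 4, 0x16};
        byte[] jsonFirst = "{\"type\":\"record\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(hasContainerMagic(containerLike));
        System.out.println(hasContainerMagic(jsonFirst));
    }
}
```

If the check passes, my understanding is that `org.apache.avro.file.DataFileStream`, wrapped around a `ByteArrayInputStream` of the message and given the same `GenericDatumReader`, reads the writer schema from the header itself. I have not confirmed that this is the format the producer actually sends. If the check fails, the schema is embedded in some producer-specific way and would have to be split from the payload before decoding.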