0 votes

I have a ConvertJSONToAvro processor in NiFi 1.4 and am having difficulty getting the proper decimal datatype within the Avro. The data is transformed into bytes using Avro logical data types in an ExecuteSQL processor, converted from Avro to JSON with a ConvertAvroToJSON processor, and then converted back with a ConvertJSONToAvro processor before being written to HDFS with PutParquet.

My schema is:

{ "type" : "record", "name" : "schema", "fields" : [ { "name" : "entryDate", "type" : [ "null", { "type" : "long", "logicalType" : "timestamp-micros" } ], "default" : null }, { "name" : "points", "type" : [ "null", { "type" : "bytes", "logicalType" : "decimal", "precision" : 18, "scale" : 6 } ], "default" : null }] }

My JSON:

{ "entryDate" : 2018-01-26T13:48:22.087, "points" : 6.000000 }

I get an error for the Avro saying:

Cannot convert field points: Cannot resolve union: {"bytes": "+|Ð"} not in ["null", {"type":"bytes","logicalType":"decimal","precision":18,"scale":6}]

Is there some type of workaround for this?

If the end goal is to use PutParquet, why do you need to go from Avro to JSON and back to Avro? Can't you send the Avro from ExecuteSQL directly to PutParquet? - Bryan Bende
It also might be worth using ConvertRecord instead of ConvertAvroToJSON and ConvertJSONToAvro; those are the older-style conversions and ConvertRecord is the newer approach. - Bryan Bende

1 Answer

-1 votes

Currently you cannot mix the null type with a logical type due to a bug in Avro. Check this still-unresolved issue: https://issues.apache.org/jira/browse/AVRO-1891

Also, the default value cannot be null. This should work for you:

{
    "type" : "record",
    "name" : "schema",
    "fields" : [ {
      "name" : "entryDate",
      "type" : {
        "type" : "long",
        "logicalType" : "timestamp-micros"
      },
      "default" : 0
    }, {
      "name" : "points",
      "type" : {
        "type" : "bytes",
        "logicalType" : "decimal",
        "precision" : 18,
        "scale" : 6
      },
      "default" : ""
    }]
}
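
For context on why the JSON number 6.000000 cannot be dropped straight into the points field: Avro stores a decimal as the two's-complement big-endian bytes of its unscaled value, which is what the {"bytes": "+|Ð"} in the error message is. Below is a minimal sketch using the plain Avro Java library (not NiFi itself; the class name DecimalBytesDemo is just for illustration) that shows the encoding for decimal(18, 6):

import java.math.BigDecimal;
import java.nio.ByteBuffer;
import org.apache.avro.Conversions;
import org.apache.avro.LogicalType;
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;

public class DecimalBytesDemo {
    public static void main(String[] args) {
        // bytes schema carrying the decimal(18, 6) logical type, as in the schema above
        Schema bytesSchema = Schema.create(Schema.Type.BYTES);
        LogicalType decimalType = LogicalTypes.decimal(18, 6);
        decimalType.addToSchema(bytesSchema);

        // Avro encodes a decimal as the two's-complement big-endian bytes of its
        // unscaled value: 6.000000 with scale 6 becomes 6000000 -> 0x5B 0x8D 0x80
        Conversions.DecimalConversion conversion = new Conversions.DecimalConversion();
        ByteBuffer encoded = conversion.toBytes(new BigDecimal("6.000000"), bytesSchema, decimalType);

        // round-trip back to confirm the scale is recovered from the schema
        BigDecimal decoded = conversion.fromBytes(encoded.duplicate(), bytesSchema, decimalType);
        System.out.println(decoded); // prints 6.000000
    }
}

With scale 6, the value 6.000000 has unscaled value 6000000, so the flowfile carries those raw bytes rather than a human-readable number.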