1 vote

I am trying to import data from MySQL to HDFS/Hive using NiFi and am facing some challenges. Please suggest an approach. My flow so far:

  1. QueryDatabaseTable -- reads the MySQL data
  2. ConvertAvroToJson -- produces this output:
[{"emp_id": 467260, "emp_name": "Rob", "emp_age": 32},
{"emp_id": 467261, "emp_name": "Vijay", "emp_age": 32},
{"emp_id": 467258, "emp_name": "Jayaprakash", "emp_age": 26},
{"emp_id": 467259, "emp_name": "Kalyan", "emp_age": 32},
{"emp_id": 467262, "emp_name": "Andy", "emp_age": 20},
{"emp_id": 467263, "emp_name": "Ashley", "emp_age": 24},
{"emp_id": 467264, "emp_name": "Mounika", "emp_age": 24}]
  3. SplitJson -- how do I split this JSON file into individual flow files?

2 Answers

2 votes

As James said, the JsonPath Expression you want in SplitJson is likely $, or you can try $.*
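To see why $ works here, the following sketch mimics what SplitJson does with that expression: the incoming flow file content is a JSON array at the root, and each array element becomes the content of one outgoing flow file. (This is a plain-Python illustration of the splitting behavior, not NiFi code; the sample records are from the question.)

```python
import json

# Flow file content as produced by ConvertAvroToJson: a JSON array of records.
records_json = """[
  {"emp_id": 467260, "emp_name": "Rob", "emp_age": 32},
  {"emp_id": 467261, "emp_name": "Vijay", "emp_age": 32},
  {"emp_id": 467262, "emp_name": "Andy", "emp_age": 20}
]"""

# SplitJson with a JsonPath of $ selects the root array and emits one
# flow file per element; this loop reproduces that one-record-per-file split.
root = json.loads(records_json)
flow_files = [json.dumps(record) for record in root]

for content in flow_files:
    print(content)
```

Each string in `flow_files` is a standalone JSON object, which is the shape downstream processors (e.g. a per-record insert) expect.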

As an alternative, you can try QueryDatabaseTable -> SplitAvro -> ConvertAvroToJson; this splits the Avro records first instead of converting the whole set to JSON and then splitting the JSON.

In Apache NiFi 1.0.0 there will be a ConvertAvroToORC processor, which lets you convert directly to ORC. You can then use PutHDFS and PutHiveQL (available in NiFi 0.7.0 and 1.0.0) to transfer the files to HDFS and create a Hive table atop the target directory, making the data ready for querying.
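For the last step, the DDL sent via PutHiveQL would look something like the following external-table definition over the ORC files. This is a sketch: the table name, column types, and HDFS path are assumptions based on the sample data in the question, not output from NiFi.

```sql
-- Hypothetical DDL: external Hive table over ORC files that PutHDFS
-- wrote to /user/nifi/employees (table name and path are assumptions).
CREATE EXTERNAL TABLE IF NOT EXISTS employees (
  emp_id   INT,
  emp_name STRING,
  emp_age  INT
)
STORED AS ORC
LOCATION '/user/nifi/employees';
```

Because the table is EXTERNAL, dropping it later leaves the ORC files in HDFS intact.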

0 votes

I believe the JsonPath Expression to split those records is just $, because the array of records is the root object.