
I am building a NiFi flow to get JSON elements from Kafka and write them into a Hive table.

However, there is very little to no documentation about the processors and how to use them.

What I plan to do is the following:

ConsumeKafka --> ReplaceText --> PutHiveQL


Consuming the Kafka topic works fine; I receive a JSON string.

I would like to extract the JSON data (with ReplaceText) and insert it into the Hive table (PutHiveQL).

However, I have absolutely no idea how to do this. The documentation is not helping, and there is no precise example of processor usage (or I could not find one).

  • Is my theoretical solution valid?
  • How do I extract the JSON data, build an HQL query, and send it to my local Hive database? (A concrete example of what I mean follows below.)
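
For concreteness, here is the kind of transformation I am after (the field names, values, and table name below are made up for illustration):

    -- incoming Kafka message (JSON)
    {"uuid": "a1b2c3", "timestamp": "2017-01-01 12:00:00"}

    -- desired flowfile content for PutHiveQL
    INSERT INTO TABLE my_table VALUES ('a1b2c3', '2017-01-01 12:00:00')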

2 Answers


Basically, you want to transform your record from Kafka into an HQL request and then send that request to the PutHiveQL processor.

I am not sure that the Kafka record -> HQL transformation can be done with ReplaceText alone (it seems a bit hard/tricky). In general I use a custom Groovy script (an ExecuteScript processor) to do this, as sketched below.
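
For example, here is a minimal ExecuteScript (Groovy) sketch of that transformation, assuming the record carries uuid and timestamp fields and a hypothetical target table sensor_events:

    import org.apache.nifi.processor.io.InputStreamCallback
    import org.apache.nifi.processor.io.OutputStreamCallback
    import groovy.json.JsonSlurper
    import java.nio.charset.StandardCharsets

    def flowFile = session.get()
    if (flowFile == null) return

    // Parse the incoming JSON content of the flowfile
    def record = null
    session.read(flowFile, { inputStream ->
        record = new JsonSlurper().parse(inputStream)
    } as InputStreamCallback)

    // Build the HiveQL statement (naive string interpolation is
    // injection-prone; validate or escape the values in real use)
    def hql = "INSERT INTO TABLE sensor_events VALUES ('${record.uuid}', '${record.timestamp}')"

    // Overwrite the flowfile content with the query for PutHiveQL
    flowFile = session.write(flowFile, { outputStream ->
        outputStream.write(hql.getBytes(StandardCharsets.UTF_8))
    } as OutputStreamCallback)

    session.transfer(flowFile, REL_SUCCESS)

PutHiveQL executes whatever statement it finds in the flowfile content, so the script only has to leave the finished query there.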


Edit

Global overview: ConsumeKafka --> EvaluateJsonPath --> ReplaceText --> PutHiveQL

EvaluateJsonPath

This extracts the timestamp and uuid properties from my JSON flowfile and puts them as attributes on the flowfile.

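The configuration is roughly as follows (the JsonPath expressions are assumptions; adjust them to your actual JSON structure). The last two rows are dynamic properties, one per attribute to extract:

    Destination      flowfile-attribute
    Return Type      auto-detect
    timestamp        $.timestamp
    uuid             $.uuid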

ReplaceText

This replaces the entire flowfile content with the Replacement Value property, in which I build the query.

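A configuration along these lines should work (the table and column values are placeholders; ${uuid} and ${timestamp} are NiFi Expression Language references to the attributes set by EvaluateJsonPath above):

    Search Value            (?s)(^.*$)
    Replacement Value       INSERT INTO TABLE my_table VALUES ('${uuid}', '${timestamp}')
    Replacement Strategy    Regex Replace
    Evaluation Mode         Entire text

The resulting flowfile content is the finished HiveQL statement, which is exactly what PutHiveQL expects as input.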


You can directly inject the streaming data using the PutHiveStreaming processor. Create an ORC table with a structure matching the flow and pass the flow to the PutHive3Streaming processor; it works.
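
As a sketch (table and column names are placeholders), the kind of table definition Hive streaming expects is a transactional, ORC-backed one; PutHiveStreaming additionally requires the table to be bucketed:

    CREATE TABLE sensor_events (
        uuid STRING,
        ts STRING
    )
    CLUSTERED BY (uuid) INTO 4 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('transactional' = 'true');

Note that Hive ACID transactions must also be enabled on the server side (e.g. hive.support.concurrency and the ACID transaction manager) for streaming writes to succeed.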