1 vote

I am using the PutHBaseJSON processor to fetch data from an HDFS location and put it into HBase. The data in the HDFS location is in the format below, all in a single file.

{"EMPID": "17", "EMPNAME": "b17", "DEPTID": "DNA"}            
{"EMPID": "18", "EMPNAME": "b18", "DEPTID": "DNA"}
{"EMPID": "19", "EMPNAME": "b19", "DEPTID": "DNA"}

When I execute the PutHBaseJSON processor, it only fetches the first row and puts it into the HBase table I created. Is it possible to fetch all the rows in that file using this processor? Or, how can I get all the records from the single file into HBase?


2 Answers

1 vote

PutHBaseJSON takes a single JSON document as input. After fetching the file from HDFS, you should be able to use the SplitText processor with a Line Split Count of 1 to get each of your JSON documents into its own flow file.

If you have millions of JSON records in a single HDFS file, then you should perform a two-phase split: the first SplitText should split with a line count of, say, 10,000, and then a second SplitText should split those down to 1 line each.
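As a concrete illustration, here is a sketch of one possible flow. Line Split Count is the actual SplitText property; the PutHBaseJSON values (table name, row identifier field, column family) are assumptions based on the sample records in the question, not confirmed settings:

    FetchHDFS
      -> SplitText    (Line Split Count = 10000)    <- first-phase split into chunks
      -> SplitText    (Line Split Count = 1)        <- one JSON record per flow file
      -> PutHBaseJSON (Table Name = emp,            <- assumed table name
                       Row Identifier Field Name = EMPID,
                       Column Family = cf)          <- assumed column family

With this layout, each flow file reaching PutHBaseJSON contains exactly one JSON document, so every record in the original file ends up as a row in HBase.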

-1 votes

You can make use of the SplitJson processor to split them into individual records; they will then be sent serially to PutHBaseJSON.
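Note that SplitJson expects its input to be a single well-formed JSON document (such as an array), so for the newline-delimited records shown in the question, the data would first need to be wrapped in an array. A minimal sketch, assuming that wrapped form:

    [{"EMPID": "17", "EMPNAME": "b17", "DEPTID": "DNA"},
     {"EMPID": "18", "EMPNAME": "b18", "DEPTID": "DNA"},
     {"EMPID": "19", "EMPNAME": "b19", "DEPTID": "DNA"}]

    SplitJson (JsonPath Expression = $.*) -> PutHBaseJSON

Here the JsonPath Expression $.* emits each array element as its own flow file, which PutHBaseJSON can then write as an individual row.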