3
votes

Is it possible to handle null (column does not exist in the CSV) using convertRecord or ant other processor?

I have several CSv files coming in daily. But the problem the fields are not same in each CSv, Is there a way I can send 0 value into the database for all those fields which doesn't exist in the CSV. If yes please let me no how? Thank you.

Ex: CSV1 has following field: Age,Name,Address,Value1,Value2,Value3

(Note:Value3 is not present in all CSV everytime, sometime it is present and sometime the column does not exist)

CSV2 will have following: Age,Name,Address,Value1,Value2

And in database The table skeleton is as follows: Age,Name,Address,Value1,Value2,Value3

I don't want to have nulls followed by some value in the database. Is there a way my flow in NiFi can handle "null" and replace it with 0 before inserting into the database in an automated way (everytime if a specific field given in schema is not present in the CSV the flow should assign 0 to that field and ingest everything else in the database)

Please help me achieve this. Thank you!

1

1 Answers

2
votes

For this case you have to define avro schema with default values in it.

Sample avro schema:

{
"type": "record",
"name": "SQLSchema",
"fields" : [
{"name": "Age", "type": ["null","string"]},
{"name": "Name", "type": ["null","string"]},
{"name": "Address", "type": ["null","int"]},
{"name": "Value1", "type": ["null","int"]},
{"name": "Value2", "type": ["null","int"]},
{"name": "value3", "type": ["null","int"],"default": 0}
]
}

For value3 field we have defined the type as null (or) int and defined default value as 0.

If there is no data for value3 field then it will replace the value as 0.