
I am using NiFi to load CSVs, apply a new schema, and load them into a SQL database. Currently I write an Avro schema and apply it to each CSV. I write the schema based on the column order of the incoming CSV: the first field corresponds to the first column in the CSV. Is there a way to map one schema to another based on column name? That is, can I say 'csv.name -> sql.username'?

I know this can be done manually before uploading the CSVs. I am wondering whether there is a way within NiFi to map a schema to data based on the data's current schema, knowing only its fields and not their order.

I have read about RecordPath and UpdateRecord. I am looking for something that matches the whole incoming schema to a new schema, not based on order.

Avro schema settings:

[Screenshot: AvroSchema settings]

PutDatabaseRecord settings:

[Screenshot: PutDatabaseRecord settings]

If you are loading those files into a DB, the order of the columns usually doesn't matter. What processor are you using to insert them into the DB and what kind of DB do you use? - Ben Yaakobi
"GetFile->PutDatabaseRecord". I am applying an Avro schema before PutDatabaseRecord, using an MSSQL DB. I am having trouble with the Avro schema needing to be mapped to the data in a specific order, not with mapping the data to the DB. - ash_huddles
Could you provide the processor with which you apply the Avro schema, its configuration, and more importantly, its AvroReader/Writer configuration? - Ben Yaakobi
@BenYaakobi I am not having a problem with the current configuration; I am just looking to apply the schema without having to follow the current file's column order. Below is an example schema. I will attach screenshots of the configurations. { "namespace": "nifi", "name": "test", "type": "record", "fields": [ { "name": "field1", "type": ["null","string"] }, { "name": "field2", "type": ["null","string"] }, { "name": "field3", "type": ["null","string"] } ] } - ash_huddles
@BenYaakobi this is a similar question, but with a different application of the schema. stackoverflow.com/questions/45852910/…. The only solution I have found so far is to use ExecuteScript and write custom code to map the new schema to the old schema based on column name, not order. - ash_huddles

1 Answer


As I see it, you have two options:

Option 1 (the better one): Add a header line to your records and set Treat First Line as Header to True in your CSVReader.

Option 2: Set Schema Access Strategy in your CSVReader to Infer Schema (available since NiFi 1.9.0).

The first one can guarantee a correct mapping of your fields and their types.
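
To illustrate why the header line matters: once a reader knows the column names, fields can be matched by name rather than by position. Here is a minimal Python sketch of that idea outside NiFi. It is not NiFi code; the field names, the sample data, and `FIELD_MAP` are made up for illustration.

```python
import csv
import io

# Hypothetical mapping from incoming CSV header names to target column names.
FIELD_MAP = {"name": "username", "mail": "email"}


def remap_rows(csv_text, field_map):
    """Yield one dict per CSV row, keyed by the target column names.

    Columns are looked up by header name, so their order in the file
    does not matter. Headers that are not in field_map are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))  # first line is the header
    for row in reader:
        yield {target: row.get(source) for source, target in field_map.items()}


# Two files with the same fields in a different column order.
sample_a = "name,mail\nash,ash@example.com\n"
sample_b = "mail,name\nash@example.com,ash\n"

rows_a = list(remap_rows(sample_a, FIELD_MAP))
rows_b = list(remap_rows(sample_b, FIELD_MAP))
```

Both samples produce the same records because the lookup goes through the header names, which is roughly what a header-aware reader gives you compared with a purely positional schema.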