You don't need to actually make the transformation yourself, you can let your Readers and Writers handle it for you. To get the CSV Reader to recognize dates though, you will need to define a schema for your rows. Your schema would look something like this (I've removed the spaces from the column names because they are not allowed):
{
"type": "record",
"name": "ExampleCSV",
"namespace": "Stackoverflow",
"fields": [
{"name": "AdvertiserID", "type": "string"},
{"name": "CampaignStartDate", "type" : {"type": "long", "logicalType" : "timestamp-micros"}},
{"name": "CampaignEndDate", "type" : {"type": "long", "logicalType" : "timestamp-micros"}},
{"name": "CampaignName", "type": "string"}
]
}
To configure the reader, set the following properties:
- Schema Access Strategy = Use 'Schema Text' property
- Schema Text = (Above codeblock)
- Treat First Line as Header = True
- Timestamp Format = "MM/dd/yyyy hh:mm:ss a"
Additionally you can set this property to ignore the Header of the CSV if you don't want to or are unable to change the upstream system to remove the spaces.
- Ignore CSD Header Column Names = True
Then in your CSVRecordSetWriter service you can specify the following:
- Schema Access Strategy = Inherit Record Schema
- Timestamp Format = "yyyy-MM-dd"
You can use UpdateRecord or ConvertRecord (or others as long as they allow you to specify both a reader and a writer)and it will just do the conversion for you. The difference between UpdateRecord and ConvertRecord is that UpdateRecord requires you to specify a user defined property, so if this is the only change you will make, just use ConvertRecord. If you have other transformations, you should use UpdateRecord and make those changes at the same time.
Caveat: This will rewrite the file using the new column names (in my example, ones without spaces) so keep that in mind for downstream usage.