I have the below column (TriggeredDateTime) in my .avro file, which is of type String. I need to get the data in yyyy-MM-dd HH:mm:ss format (as shown in the expected output) using Spark-Scala. Could you please let me know if there is any way to achieve this by writing a UDF, rather than using my approach below? Any help would be much appreciated.
"TriggeredDateTime": {"dateTime":{"date":{"year":2019,"month":5,"day":16},"time":{"hour":4,"minute":56,"second":19,"nano":480389000}},"offset":{"totalSeconds":0}}
Expected output:

+-------------------+
|TriggeredDateTime  |
+-------------------+
|2019-05-16 04:56:19|
+-------------------+
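Something along the lines of the sketch below is what I have in mind for the UDF. This is only a rough draft: it assumes the column value is a valid JSON string shaped like the sample above, and it uses json4s (which ships with Spark) purely as an example of how the string could be parsed.

import org.apache.spark.sql.functions.{col, udf}
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

// Sketch of a UDF that parses the nested JSON string and formats it as yyyy-MM-dd HH:mm:ss
val formatTriggeredDateTime = udf { (value: String) =>
  implicit val formats: Formats = DefaultFormats
  val dateTime = parse(value) \ "dateTime"
  val date = dateTime \ "date"
  val time = dateTime \ "time"
  f"${(date \ "year").extract[Int]}%04d-${(date \ "month").extract[Int]}%02d-${(date \ "day").extract[Int]}%02d " +
    f"${(time \ "hour").extract[Int]}%02d:${(time \ "minute").extract[Int]}%02d:${(time \ "second").extract[Int]}%02d"
}

// Hypothetical usage, assuming the string column is named "TriggeredDateTime":
// initialDF.withColumn("TriggeredDateTime", formatTriggeredDateTime(col("TriggeredDateTime")))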
My Approach:
I'm trying to convert the .avro file to JSON format by applying a schema, and then I can try parsing the JSON to get the required results (see the sketch after the schema below).
DataFrame Sample Data:
[{"vin":"FU7123456XXXXX","basetime":0,"dtctime":189834,"latitude":36.341587,"longitude":140.327676,"dtcs":[{"fmi":1,"spn":2631,"dtc":"470A01","id":1},{"fmi":0,"spn":0,"dtc":"000000","id":61}],"signals":[{"timestamp":78799,"spn":174,"value":45,"name":"PT"},{"timestamp":12345,"spn":0,"value":10.2,"name":"PT"},{"timestamp":194915,"spn":0,"value":0,"name":"PT"}],"sourceEcu":"MCM","TriggeredDateTime":{"dateTime":{"date":{"year":2019,"month":5,"day":16},"time":{"hour":4,"minute":56,"second":19,"nano":480389000}},"offset":{"totalSeconds":0}}}]
DataFrame PrintSchema:
initialDF.printSchema
root
|-- vin: string (nullable = true)
|-- basetime: string (nullable = true)
|-- dtctime: string (nullable = true)
|-- latitude: string (nullable = true)
|-- longitude: string (nullable = true)
|-- dtcs: string (nullable = true)
|-- signals: string (nullable = true)
|-- sourceEcu: string (nullable = true)
|-- dtcTriggeredDateTime: string (nullable = true)
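For the schema-based approach, I was thinking of something like the sketch below. It is only a rough draft: the nested schema is my guess from the sample data, and I'm assuming the string column is the dtcTriggeredDateTime shown in the printed schema.

import org.apache.spark.sql.functions.{col, format_string, from_json}
import org.apache.spark.sql.types._

// Schema describing the nested JSON held in the string column (assumed from the sample above)
val triggeredDateTimeSchema = new StructType()
  .add("dateTime", new StructType()
    .add("date", new StructType()
      .add("year", IntegerType)
      .add("month", IntegerType)
      .add("day", IntegerType))
    .add("time", new StructType()
      .add("hour", IntegerType)
      .add("minute", IntegerType)
      .add("second", IntegerType)
      .add("nano", LongType)))
  .add("offset", new StructType().add("totalSeconds", IntegerType))

// Parse the string column and rebuild it as "yyyy-MM-dd HH:mm:ss"
val resultDF = initialDF
  .withColumn("parsed", from_json(col("dtcTriggeredDateTime"), triggeredDateTimeSchema))
  .withColumn("TriggeredDateTime",
    format_string("%04d-%02d-%02d %02d:%02d:%02d",
      col("parsed.dateTime.date.year"),
      col("parsed.dateTime.date.month"),
      col("parsed.dateTime.date.day"),
      col("parsed.dateTime.time.hour"),
      col("parsed.dateTime.time.minute"),
      col("parsed.dateTime.time.second")))
  .drop("parsed")

Is the UDF route preferable to something like this, or is there a cleaner way?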