During an ETL process I have this one SAS date field that is in a 5 digit integer format, which indicates days since 01-01-1960. I order to make this data column more useful in analysis I would like to convert the column to a date data type field in Redshift.
Currently I am trying to do this in pyspark as follows:
created new column "sas_date" with string literal "1960-01-01"
Using pyspark.sql.function.date_add I pass the "sas-date" column as the start date parameter and the integer value 'arrival_date' column as the second parameter.
When the date_add function runs I get error Column not iterable, even though I would think the arrival_date column being a series would mean it was iterable. But its not, why?
When I remove the 'arrival_date' column and replace it with a static integer value (say 1) the date_add function will work.
i94 = i94.withColumn('arrival_date', col('arrival_date').cast(Int()))
i94 = i94.withColumn('sas_date', lit("1960-01-01"))
i94 = i94.withColumn('arrival_date', date_add(col('sas_date'), i94['arrival_date']))
I want to be able to pass my column so that the second date_add parameter will be dynamic. However it seems date_add does not accept this? If date_addd does not accomplish this what other option do I have outside of using a UDF?
UPDATE: State of data right before the date_add() operation
i94.printSchema()
root
|-- cic_id: double (nullable = true)
|-- visa_id: string (nullable = true)
|-- port_id: string (nullable = true)
|-- airline_id: string (nullable = true)
|-- cit_id: double (nullable = true)
|-- res_id: double (nullable = true)
|-- year: double (nullable = true)
|-- month: double (nullable = true)
|-- age: double (nullable = true)
|-- gender: string (nullable = true)
|-- arrival_date: integer (nullable = true)
|-- depart_date: double (nullable = true)
|-- date_begin: string (nullable = true)
|-- date_end: string (nullable = true)
|-- sas_date: string (nullable = false)
i94.limit(10).toPandas()
arrival_date
twice. Can you show data like what is arrival_date value at first place. – SMaZ